
API Gateway Patterns: The Front Door to Your Microservices

Design API gateways that handle routing, authentication, rate limiting, and observability without becoming a bottleneck or a single point of failure. Covers gateway architectures, protocol translation, request transformation, circuit breaking, and the deployment patterns that keep your gateway fast and reliable.

An API gateway is the single entry point for all client requests to your backend services. Instead of clients knowing about 20 different microservice URLs, they know one: the gateway. The gateway handles the cross-cutting concerns — authentication, rate limiting, logging, routing — that every service needs but no service should implement independently.

Done well, a gateway simplifies your architecture. Done poorly, it becomes a monolithic bottleneck that every team depends on and nobody owns.


What the Gateway Does

Without gateway:
  Mobile App ──→ User Service (port 3001)
  Mobile App ──→ Order Service (port 3002)
  Mobile App ──→ Product Service (port 3003)
  Mobile App ──→ Payment Service (port 3004)
  (Client knows 4 URLs, handles auth 4 times, 4 CORS configs)

With gateway:
  Mobile App ──→ API Gateway ──→ User Service
                              ──→ Order Service
                              ──→ Product Service
                              ──→ Payment Service
  (Client knows 1 URL, auth handled once, 1 CORS config)

  Responsibility       | Without Gateway               | With Gateway
  ---------------------|-------------------------------|--------------------------
  Authentication       | Each service validates tokens | Gateway validates once
  Rate limiting        | Each service implements       | Gateway enforces globally
  CORS                 | Configured per service        | Configured once
  Logging              | Inconsistent across services  | Uniform request logging
  TLS termination      | Each service manages certs    | Gateway terminates TLS
  Protocol translation | Client adapts per service     | Gateway translates

Gateway Architectures

  Pattern                    | Description                      | Best For
  ---------------------------|----------------------------------|-----------------------------------
  Edge gateway               | Single gateway for all traffic   | Simple architectures, small teams
  BFF (Backend for Frontend) | One gateway per client type      | Mobile + Web with different needs
  Gateway per domain         | One gateway per business domain  | Large organizations, team autonomy

Backend for Frontend (BFF)

┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  Web BFF     │    │  Mobile BFF  │    │  Partner API │
│  Gateway     │    │  Gateway     │    │  Gateway     │
│              │    │              │    │              │
│ - Full data  │    │ - Compressed │    │ - Versioned  │
│ - HTML meta  │    │ - Pagination │    │ - Rate limits│
│ - SSR hints  │    │ - Small pages│    │ - API keys   │
└──────┬───────┘    └──────┬───────┘    └──────┬───────┘
       │                   │                   │
       └───────────────────┼───────────────────┘

                    ┌──────┴──────┐
                    │  Internal   │
                    │  Services   │
                    └─────────────┘
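The point of the BFF split is that each gateway reshapes the same internal response for its client. A sketch of that shaping, with a hypothetical product payload (the field names are illustrative, not from any real service):

```python
# One upstream payload, shaped differently per client type (fields are
# hypothetical examples of what an internal service might return).
FULL_PRODUCT = {
    "id": 7,
    "name": "Widget",
    "description": "A long marketing description...",
    "images": ["a.jpg", "b.jpg", "c.jpg"],
    "price_cents": 1999,
}

def web_bff(product: dict) -> dict:
    # Web BFF: pass the full payload through for server-side rendering.
    return product

def mobile_bff(product: dict) -> dict:
    # Mobile BFF: trim to the fields the app screen needs, one thumbnail,
    # smaller payload over the cellular link.
    return {
        "id": product["id"],
        "name": product["name"],
        "thumb": product["images"][0],
        "price_cents": product["price_cents"],
    }
```

The internal service stays client-agnostic; each BFF team owns its own shaping logic and deploy cadence.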

Routing Configuration

# Kong / APISIX style declarative routing
services:
  - name: user-service
    url: http://user-service:3000
    routes:
      - name: user-routes
        paths:
          - /api/v1/users
        methods:
          - GET
          - POST
          - PUT
        plugins:
          - name: rate-limiting
            config:
              minute: 100
              policy: local
          - name: jwt
            config:
              secret_is_base64: false

  - name: order-service
    url: http://order-service:3000
    routes:
      - name: order-routes
        paths:
          - /api/v1/orders
        plugins:
          - name: rate-limiting
            config:
              minute: 50
          - name: jwt
          - name: request-transformer
            config:
              add:
                headers:
                  - "X-Request-ID:$(uuid)"
                  - "X-Forwarded-For:$(client_ip)"

Rate Limiting Strategies

  Algorithm      | Behavior                        | Best For
  ---------------|---------------------------------|--------------------------
  Fixed window   | N requests per minute/hour      | Simple, predictable
  Sliding window | Smooth rate tracking            | Even distribution
  Token bucket   | Allows bursts up to bucket size | APIs with bursty traffic
  Leaky bucket   | Constant output rate            | Steady throughput

Token bucket example:
  Bucket size: 100 tokens
  Refill rate: 10 tokens/second

  Client sends 50 requests instantly → all pass (50 tokens used)
  Client sends 60 more → 50 pass, 10 rejected (bucket empty)
  After 5 seconds → 50 new tokens available

Response headers:
  X-RateLimit-Limit: 100
  X-RateLimit-Remaining: 43
  X-RateLimit-Reset: 1710500000

When exceeded:
  HTTP 429 Too Many Requests
  Retry-After: 5

Circuit Breaking

Circuit breaker states:

  CLOSED (normal operation)
  ├── Requests pass through to backend
  ├── Track failure rate
  └── If failures > threshold → switch to OPEN

  OPEN (service protected)
  ├── All requests fail fast (no backend call)
  ├── Return cached response or error
  └── After timeout → switch to HALF-OPEN

  HALF-OPEN (testing recovery)
  ├── Allow limited requests through
  ├── If success → switch to CLOSED
  └── If failure → switch back to OPEN

# Circuit breaker configuration
circuit_breaker:
  failure_threshold: 5        # Open after 5 consecutive failures
  success_threshold: 3        # Close after 3 successes in half-open
  timeout: 30s                # Time in open state before trying half-open
  failure_rate_threshold: 50  # Open if > 50% of requests fail in window
  window_size: 10             # Evaluate last 10 requests
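The state machine and thresholds above can be sketched as follows. This models the consecutive-failure path only (the failure-rate window is omitted for brevity), and takes an explicit `now` timestamp so the transitions are testable:

```python
class CircuitBreaker:
    """Three-state breaker matching the config above: failure_threshold
    consecutive failures open the circuit, timeout moves it to half-open,
    success_threshold successes in half-open close it again."""

    def __init__(self, failure_threshold: int = 5, success_threshold: int = 3,
                 timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.timeout = timeout
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = 0.0

    def allow_request(self, now: float) -> bool:
        if self.state == "open":
            if now - self.opened_at >= self.timeout:
                self.state = "half-open"   # probe the backend again
                self.successes = 0
                return True
            return False                   # fail fast, no backend call
        return True                        # closed and half-open let requests through

    def record_success(self) -> None:
        if self.state == "half-open":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "closed"
                self.failures = 0
        else:
            self.failures = 0              # any success resets the streak

    def record_failure(self, now: float) -> None:
        if self.state == "half-open":
            self.state = "open"            # probe failed: back to open
            self.opened_at = now
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = now
```

In a gateway, `allow_request` wraps every proxied call for a given route; when it returns False the gateway serves the cached/fallback response instead of dialing the backend.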

Gateway Tools

  Tool            | Type                     | Best For
  ----------------|--------------------------|----------------------------------
  Kong            | Open source + Enterprise | Full-featured, plugin ecosystem
  APISIX          | Open source              | High performance, declarative
  AWS API Gateway | Managed (AWS)            | Serverless, Lambda integration
  Envoy           | Open source proxy        | Service mesh, advanced routing
  Traefik         | Open source              | Container-native, auto-discovery
  NGINX           | Open source + commercial | High performance, flexibility

Anti-Patterns

  Anti-Pattern                  | Problem                            | Fix
  ------------------------------|------------------------------------|--------------------------------------
  Business logic in gateway     | Gateway becomes a monolith         | Gateway does routing/auth only
  Single gateway, many teams    | Deployment bottleneck              | BFF or domain-specific gateways
  No circuit breaking           | Backend failure cascades           | Circuit breakers + fallback responses
  No rate limiting              | Single client can overwhelm system | Per-client rate limits
  Gateway without observability | Cannot debug routing issues        | Request logging, latency metrics

Implementation Checklist

  • Deploy a gateway as the single entry point for all external API traffic
  • Centralize authentication at the gateway — services receive verified identity
  • Implement rate limiting per client / API key with descriptive 429 responses
  • Add circuit breakers for all backend service routes
  • Log every request: method, path, status, latency, client ID
  • Add request ID header (X-Request-ID) for distributed tracing
  • Use health checks to route traffic only to healthy backend instances
  • Implement request/response transformation for protocol differences
  • Deploy gateway with zero-downtime updates (rolling deployment)
  • Monitor gateway latency — it should add < 5ms overhead per request
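Two of the checklist items (uniform request logging and an X-Request-ID header) can live in one small wrapper around the proxy call. A sketch, where `upstream` is a hypothetical callable standing in for the proxied backend:

```python
import time
import uuid

def handle(request: dict, upstream) -> dict:
    """Gateway wrapper sketch: attach X-Request-ID, time the upstream call,
    and emit one structured log line per request."""
    headers = dict(request.get("headers", {}))
    # Reuse the caller's request ID if present; otherwise mint one, so the
    # same ID traces the request across every downstream service.
    request_id = headers.get("X-Request-ID") or str(uuid.uuid4())
    headers["X-Request-ID"] = request_id

    start = time.monotonic()
    response = upstream({**request, "headers": headers})
    latency_ms = (time.monotonic() - start) * 1000

    # One log line per request: method, path, status, latency, request ID.
    print(f'{request["method"]} {request["path"]} '
          f'status={response["status"]} latency_ms={latency_ms:.1f} '
          f'request_id={request_id}')

    response.setdefault("headers", {})["X-Request-ID"] = request_id
    return response
```

Echoing the ID back in the response lets clients quote it in support tickets, which closes the loop on the observability items above.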
Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
