An API gateway is the single entry point for all client requests to your backend services. Instead of clients knowing about 20 different microservice URLs, they know one: the gateway. The gateway handles the cross-cutting concerns — authentication, rate limiting, logging, routing — that every service needs but no service should implement independently.
Done well, a gateway simplifies your architecture. Done poorly, it becomes a monolithic bottleneck that every team depends on and nobody owns.
What the Gateway Does
Without gateway:
Mobile App ──→ User Service (port 3001)
Mobile App ──→ Order Service (port 3002)
Mobile App ──→ Product Service (port 3003)
Mobile App ──→ Payment Service (port 3004)
(Client knows 4 URLs, handles auth 4 times, 4 CORS configs)
With gateway:
Mobile App ──→ API Gateway ──→ User Service
                           ──→ Order Service
                           ──→ Product Service
                           ──→ Payment Service
(Client knows 1 URL, auth handled once, 1 CORS config)
| Responsibility | Without Gateway | With Gateway |
|---|---|---|
| Authentication | Each service validates tokens | Gateway validates once |
| Rate limiting | Each service implements | Gateway enforces globally |
| CORS | Configured per service | Configured once |
| Logging | Inconsistent across services | Uniform request logging |
| TLS termination | Each service manages certs | Gateway terminates TLS |
| Protocol translation | Client adapts per service | Gateway translates |
Gateway Architectures
| Pattern | Description | Best For |
|---|---|---|
| Edge gateway | Single gateway for all traffic | Simple architectures, small teams |
| BFF (Backend for Frontend) | One gateway per client type | Mobile + Web with different needs |
| Gateway per domain | One gateway per business domain | Large organizations, team autonomy |
Backend for Frontend (BFF)
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   Web BFF    │    │  Mobile BFF  │    │ Partner API  │
│   Gateway    │    │   Gateway    │    │   Gateway    │
│              │    │              │    │              │
│ - Full data  │    │ - Compressed │    │ - Versioned  │
│ - HTML meta  │    │ - Pagination │    │ - Rate limits│
│ - SSR hints  │    │ - Small pages│    │ - API keys   │
└──────┬───────┘    └──────┬───────┘    └──────┬───────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                    ┌──────┴──────┐
                    │  Internal   │
                    │  Services   │
                    └─────────────┘
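As a rough illustration of why each client type gets its own gateway, the sketch below shapes the same internal data differently for web and mobile. The `PROFILES` table, the `shape` function, the page sizes, and the field names are all hypothetical, chosen only to mirror the "full data" versus "small pages" split in the diagram.

```python
from typing import Any

# Hypothetical per-client settings; real BFFs would define their own.
PROFILES = {
    "web": {"page_size": 50, "fields": None},  # None means "return every field"
    "mobile": {"page_size": 10, "fields": ("id", "name", "price")},
}


def shape(items: list[dict[str, Any]], client: str, page: int = 1) -> dict[str, Any]:
    """Paginate and trim an internal result set for one client type."""
    profile = PROFILES[client]
    size = profile["page_size"]
    start = (page - 1) * size
    window = items[start:start + size]

    fields = profile["fields"]
    if fields is not None:
        # Drop everything the client does not need to keep payloads small.
        window = [{k: item[k] for k in fields if k in item} for item in window]

    return {"page": page, "page_size": size, "total": len(items), "items": window}
```

The internal services return one representation; each BFF decides how much of it its client should receive.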
Routing Configuration
# Kong / APISIX style declarative routing
services:
  - name: user-service
    url: http://user-service:3000
    routes:
      - name: user-routes
        paths:
          - /api/v1/users
        methods:
          - GET
          - POST
          - PUT
    plugins:
      - name: rate-limiting
        config:
          minute: 100
          policy: local
      - name: jwt
        config:
          secret_is_base64: false
  - name: order-service
    url: http://order-service:3000
    routes:
      - name: order-routes
        paths:
          - /api/v1/orders
    plugins:
      - name: rate-limiting
        config:
          minute: 50
      - name: jwt
      - name: request-transformer
        config:
          add:
            headers:
              - "X-Request-ID:$(uuid)"
              - "X-Forwarded-For:$(client_ip)"
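To make the routing step concrete, here is a minimal sketch of how a gateway might resolve an incoming request to an upstream using longest-prefix matching plus a method check. The `Route` dataclass, the `ROUTES` table, and `resolve` are hypothetical names that mirror the config above; this is not how Kong or APISIX implement routing internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str                       # path prefix the route matches
    upstream: str                     # backend base URL
    methods: frozenset = frozenset()  # empty set means "any method"


# Mirrors the declarative config above; illustrative only.
ROUTES = [
    Route("/api/v1/users", "http://user-service:3000", frozenset({"GET", "POST", "PUT"})),
    Route("/api/v1/orders", "http://order-service:3000"),
]


def resolve(method: str, path: str, routes=ROUTES) -> Optional[str]:
    """Return the upstream URL for a request, or None if no route matches."""
    candidates = [
        r for r in routes
        # Match the prefix exactly or on a path-segment boundary.
        if (path == r.prefix or path.startswith(r.prefix + "/"))
        and (not r.methods or method.upper() in r.methods)
    ]
    if not candidates:
        return None
    # The most specific (longest) prefix wins.
    best = max(candidates, key=lambda r: len(r.prefix))
    return best.upstream + path
```

Matching on a segment boundary keeps `/api/v1/usersx` from being routed to the user service by accident.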
Rate Limiting Strategies
| Algorithm | Behavior | Best For |
|---|---|---|
| Fixed window | N requests per minute/hour | Simple, predictable |
| Sliding window | Smooth rate tracking | Even distribution |
| Token bucket | Allows bursts up to bucket size | APIs with bursty traffic |
| Leaky bucket | Constant output rate | Steady throughput |
Token bucket example:
Bucket size: 100 tokens
Refill rate: 10 tokens/second
Client sends 50 requests instantly → all pass (50 tokens used)
Client sends 60 more → 50 pass, 10 rejected (bucket empty)
After 5 seconds → 50 new tokens available
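The token bucket walkthrough above can be sketched in a few lines of Python. This is a single-process illustration under stated assumptions: the `TokenBucket` and `FakeClock` names are made up for the example, and a real gateway would typically keep this state in a shared store so that every gateway instance sees the same counts.

```python
import time


class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.t = 0.0

    def __call__(self):
        return self.t


class TokenBucket:
    def __init__(self, capacity, refill_rate, now=time.monotonic):
        self.capacity = capacity        # maximum burst size
        self.refill_rate = refill_rate  # tokens added per second
        self._now = now
        self.tokens = capacity          # start with a full bucket
        self._last = now()

    def _refill(self):
        current = self._now()
        elapsed = current - self._last
        # Add tokens for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self._last = current

    def available(self):
        self._refill()
        return self.tokens

    def allow(self, cost=1):
        """Consume `cost` tokens if available; return whether the request passes."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Replaying the scenario with the fake clock: with `capacity=100` and `refill_rate=10`, the first 50 calls to `allow()` pass, the next 60 yield 50 passes and 10 rejections, and after advancing the clock by 5 seconds `available()` reports 50 tokens.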
Response headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 43
X-RateLimit-Reset: 1710500000
When exceeded:
HTTP 429 Too Many Requests
Retry-After: 5
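The headers above can be produced by a small helper like the following sketch. The function name and its parameters are hypothetical. The `X-RateLimit-*` names are a widely used convention rather than a formal standard, while `Retry-After` and status 429 are defined by the HTTP specifications.

```python
import math


def rate_limit_headers(limit, remaining, reset_epoch, allowed, now):
    """Build rate-limit headers for a response.

    Returns (status, headers). status is None when the request may be
    forwarded to the backend, or 429 when it must be rejected.
    """
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(remaining, 0)),
        "X-RateLimit-Reset": str(reset_epoch),  # Unix time when the window resets
    }
    if allowed:
        return None, headers

    # Tell the client how many whole seconds to wait before retrying.
    headers["Retry-After"] = str(max(0, math.ceil(reset_epoch - now)))
    return 429, headers
```

Sending the limit headers on successful responses as well lets well-behaved clients slow down before they hit the limit.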
Circuit Breaking
Circuit breaker states:
CLOSED (normal operation)
├── Requests pass through to backend
├── Track failure rate
└── If failures reach the threshold → switch to OPEN
OPEN (service protected)
├── All requests fail fast (no backend call)
├── Return cached response or error
└── After timeout → switch to HALF-OPEN
HALF-OPEN (testing recovery)
├── Allow limited requests through
├── If success → switch to CLOSED
└── If failure → switch back to OPEN
# Circuit breaker configuration
circuit_breaker:
  failure_threshold: 5         # Open after 5 consecutive failures
  success_threshold: 3         # Close after 3 successes in half-open
  timeout: 30s                 # Time in open state before trying half-open
  failure_rate_threshold: 50   # Open if > 50% of requests fail in window
  window_size: 10              # Evaluate last 10 requests
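The three states map onto a small state machine. The sketch below is one possible in-process implementation, assuming the consecutive-failure and success thresholds from the config above; it leaves out the `failure_rate_threshold` / `window_size` check for brevity. The class and method names are illustrative and not taken from any particular gateway.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


class CircuitOpenError(Exception):
    """Raised when a call is rejected without reaching the backend."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, success_threshold=3,
                 timeout=30.0, now=time.monotonic):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.timeout = timeout
        self._now = now
        self.state = State.CLOSED
        self._failures = 0
        self._successes = 0
        self._opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state is State.OPEN:
            if self._now() - self._opened_at < self.timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: start letting trial requests through.
            self.state = State.HALF_OPEN
            self._successes = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        if self.state is State.HALF_OPEN:
            self._trip()  # any failure during recovery reopens the circuit
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._trip()

    def _on_success(self):
        if self.state is State.HALF_OPEN:
            self._successes += 1
            if self._successes >= self.success_threshold:
                self.state = State.CLOSED
                self._failures = 0
        else:
            self._failures = 0  # a success resets the consecutive-failure count

    def _trip(self):
        self.state = State.OPEN
        self._opened_at = self._now()
        self._failures = 0
```

A caller would wrap each backend request in `breaker.call(...)` and translate `CircuitOpenError` into a cached response or an error, as described for the OPEN state.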
| Tool | Type | Best For |
|---|---|---|
| Kong | Open source + Enterprise | Full-featured, plugin ecosystem |
| APISIX | Open source | High performance, declarative |
| AWS API Gateway | Managed (AWS) | Serverless, Lambda integration |
| Envoy | Open source proxy | Service mesh, advanced routing |
| Traefik | Open source | Container-native, auto-discovery |
| NGINX | Open source + commercial | High performance, flexibility |
Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Business logic in gateway | Gateway becomes a monolith | Gateway does routing/auth only |
| Single gateway, many teams | Deployment bottleneck | BFF or domain-specific gateways |
| No circuit breaking | Backend failure cascades | Circuit breakers + fallback responses |
| No rate limiting | Single client can overwhelm system | Per-client rate limits |
| Gateway without observability | Cannot debug routing issues | Request logging, latency metrics |
Implementation Checklist