
API Rate Limiting & Throttling

Protect APIs with rate limiting. Covers token bucket, sliding window, distributed rate limiting with Redis, client-specific limits, and graceful degradation under load.

Rate limiting prevents any single client from monopolizing your API. Without it, one misbehaving script can consume all your server capacity and cause a denial of service for every other client. Good rate limiting is invisible to normal users and transparent to developers who hit a limit: the response tells them what the limit is and when they can retry.


Algorithms

| Algorithm | How It Works | Pros | Cons |
|---|---|---|---|
| Token bucket | Tokens added at fixed rate, consumed per request | Allows bursts, smooth | Slightly complex |
| Sliding window | Count requests in rolling time window | Accurate, predictable | Memory per window |
| Fixed window | Count requests in fixed intervals (per minute) | Simple | Burst at window boundaries |
| Leaky bucket | Requests queued, processed at fixed rate | Smooth output | Queuing adds latency |
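To make the token bucket row concrete, here is a minimal single-process sketch. The class name `TokenBucket`, its parameters, and the injectable `clock` are illustrative assumptions, not part of any library; a multi-instance API would need shared state instead of in-memory fields.

```python
import time


class TokenBucket:
    """In-memory token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable so tests can control time
        self.tokens = float(capacity) # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self, cost=1):
        """Consume `cost` tokens if available; return True when the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `rate=1` and `capacity=3`, a client can send three requests back to back, after which it is held to roughly one request per second until the bucket refills.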

Implementation with Redis

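import time
import uuid

import redis

r = redis.Redis()

def rate_limit(client_id, max_requests=100, window_seconds=60):
    """Sliding window rate limiter backed by a Redis sorted set."""
    key = f"ratelimit:{client_id}"
    now = time.time()
    window_start = now - window_seconds
    # Unique member so two requests with the same timestamp don't overwrite each other
    member = f"{now}:{uuid.uuid4().hex}"

    # redis-py pipelines are wrapped in MULTI/EXEC by default, so these run atomically
    pipe = r.pipeline()

    # Remove old entries outside the window
    pipe.zremrangebyscore(key, 0, window_start)

    # Count requests in current window (before adding this one)
    pipe.zcard(key)

    # Add current request
    pipe.zadd(key, {member: now})

    # Fetch the oldest entry still in the window (used to compute retry_after)
    pipe.zrange(key, 0, 0, withscores=True)

    # Set expiry on the key so idle clients don't leave data behind
    pipe.expire(key, window_seconds)

    _, request_count, _, oldest, _ = pipe.execute()

    if request_count >= max_requests:
        # Don't let rejected requests count against the client
        r.zrem(key, member)
        # A slot frees up when the oldest entry leaves the window
        oldest_score = oldest[0][1]
        return {
            "allowed": False,
            "retry_after": max(0.0, oldest_score + window_seconds - now),
            "limit": max_requests,
            "remaining": 0,
        }

    return {
        "allowed": True,
        "limit": max_requests,
        "remaining": max_requests - request_count - 1,
    }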
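For unit tests or a single-process service, the same sliding-window logic can be sketched without Redis. The class below is a hypothetical in-memory stand-in (the name `InMemorySlidingWindow` and the injectable `clock` are assumptions for illustration); it records only accepted requests and is not shared across instances, so it does not replace the Redis version in a multi-instance deployment.

```python
import time
from collections import defaultdict, deque


class InMemorySlidingWindow:
    """Sliding window limiter that keeps accepted-request timestamps per client."""

    def __init__(self, max_requests=100, window_seconds=60, clock=time.time):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # client_id -> timestamps, oldest first

    def check(self, client_id):
        now = self.clock()
        window = self.hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while window and window[0] <= now - self.window_seconds:
            window.popleft()

        if len(window) >= self.max_requests:
            # A slot frees up when the oldest accepted request leaves the window.
            if window:
                retry_after = window[0] + self.window_seconds - now
            else:
                retry_after = self.window_seconds  # only reachable when max_requests is 0
            return {
                "allowed": False,
                "retry_after": retry_after,
                "limit": self.max_requests,
                "remaining": 0,
            }

        window.append(now)
        return {
            "allowed": True,
            "limit": self.max_requests,
            "remaining": self.max_requests - len(window),
        }
```

Passing a fake `clock` makes the window boundaries deterministic in tests, which is hard to do against a live Redis server.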

Response Headers

HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1709510400

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1709510400
Content-Type: application/json

{
  "error": "rate_limit_exceeded",
  "message": "Rate limit of 100 requests per minute exceeded",
  "retry_after_seconds": 30
}
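One way to produce these responses is a small helper that maps a limiter result onto a status code, headers, and body. This is a sketch under assumptions: the function name `rate_limit_response` is hypothetical, the input is a dict with the `allowed`, `limit`, `remaining`, and `retry_after` keys used in the Redis example, and the caller supplies `reset_epoch` (the Unix time at which the window resets).

```python
import json
import math


def rate_limit_response(result, reset_epoch, window_label="minute"):
    """Translate a limiter result into (status, headers, body)."""
    headers = {
        "X-RateLimit-Limit": str(result["limit"]),
        "X-RateLimit-Remaining": str(max(0, result["remaining"])),
        "X-RateLimit-Reset": str(int(reset_epoch)),
    }
    if result["allowed"]:
        return 200, headers, None

    # Retry-After takes whole seconds; round up so clients never retry too early.
    retry_after = max(1, math.ceil(result.get("retry_after", 0)))
    headers["Retry-After"] = str(retry_after)
    headers["Content-Type"] = "application/json"
    body = json.dumps({
        "error": "rate_limit_exceeded",
        "message": f"Rate limit of {result['limit']} requests per {window_label} exceeded",
        "retry_after_seconds": retry_after,
    })
    return 429, headers, body
```

Sending the three `X-RateLimit-*` headers on successful responses as well lets well-behaved clients slow down before they ever see a 429.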

Tiered Rate Limits

| Tier | Requests/min | Burst | Use Case |
|---|---|---|---|
| Free | 60 | 10 | Public API, evaluation |
| Basic | 600 | 100 | Small applications |
| Pro | 6,000 | 1,000 | Production workloads |
| Enterprise | 60,000 | 10,000 | High-volume integrations |
| Internal | Unlimited | N/A | Internal services |
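The tier table can be expressed as a lookup that the limiter consults per request. The sketch below assumes tiers are identified by lowercase strings, that unknown tiers fall back to the Free limits, and that the Burst column maps to a token bucket's capacity; none of these choices come from the table itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierLimit:
    requests_per_minute: Optional[int]  # None means unlimited
    burst: Optional[int]                # None means not applicable


TIER_LIMITS = {
    "free": TierLimit(60, 10),
    "basic": TierLimit(600, 100),
    "pro": TierLimit(6_000, 1_000),
    "enterprise": TierLimit(60_000, 10_000),
    "internal": TierLimit(None, None),
}


def limits_for(tier):
    """Return the limits for a tier, falling back to the most restrictive one."""
    return TIER_LIMITS.get(tier.lower(), TIER_LIMITS["free"])


def token_bucket_params(tier):
    """Map a tier to (refill rate in tokens/second, bucket capacity); None if unlimited."""
    limit = limits_for(tier)
    if limit.requests_per_minute is None:
        return None
    return limit.requests_per_minute / 60, limit.burst
```

Falling back to the most restrictive tier means a typo or a missing plan record results in tighter limits rather than no limits.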

Anti-Patterns

| Anti-Pattern | Problem | Fix |
|---|---|---|
| No rate limiting | One client can exhaust resources | Rate limit all endpoints |
| Same limit for all endpoints | Expensive endpoints treated same as cheap | Per-endpoint limits based on cost |
| No burst allowance | Legitimate short bursts blocked | Token bucket with burst capacity |
| Vague 429 responses | Clients don’t know when to retry | Include Retry-After header and remaining count |
| Rate limit by IP only | Shared IPs (offices, NATs) hit limits | Rate limit by API key or user ID |
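Two of these fixes, per-endpoint cost and keying by API key or user ID, can be sketched together. The endpoint names and cost weights below are hypothetical placeholders, and the function names are illustrative; the idea is that an expensive call consumes more of a client's budget than a cheap one, and that IP address is used only when nothing more specific is available.

```python
# Hypothetical endpoints and weights; tune these from measured cost per call.
ENDPOINT_COSTS = {
    "GET /items": 1,
    "POST /search": 5,
    "POST /reports/export": 25,
}
DEFAULT_COST = 1


def client_identity(api_key=None, user_id=None, ip=None):
    """Pick the most specific identity available; IP is only a last resort."""
    if api_key:
        return f"key:{api_key}"
    if user_id:
        return f"user:{user_id}"
    if ip:
        return f"ip:{ip}"
    raise ValueError("no client identity available")


def request_cost(method, path):
    """Return how many units of the client's budget this request consumes."""
    return ENDPOINT_COSTS.get(f"{method.upper()} {path}", DEFAULT_COST)
```

The identity string can serve as the limiter key, and the cost can be passed to a limiter that supports weighted requests.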

Checklist

  • Rate limiting on all public API endpoints
  • Algorithm: token bucket or sliding window
  • Distributed: Redis-backed for multi-instance APIs
  • Tiered limits by plan/subscription level
  • Response headers: Limit, Remaining, Reset
  • 429 response with Retry-After and clear error message
  • Per-endpoint limits for expensive operations
  • Monitoring: rate limit hits, top consumers
  • Graceful degradation: degrade non-critical features first
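The last checklist item, degrading non-critical features first, might look like the following. The feature names, their ordering, and the load thresholds are all hypothetical; a real service would choose them from its own priorities and capacity measurements.

```python
# Hypothetical features, ordered from least to most critical.
FEATURES_BY_PRIORITY = ["recommendations", "search_suggestions", "exports", "checkout"]

# Hypothetical steps: (load fraction at or above which the step applies,
# number of least-critical features to shed). Highest threshold first.
SHED_STEPS = [(0.95, 3), (0.85, 2), (0.70, 1)]


def enabled_features(load):
    """Return the features still served at `load` (0.0 to 1.0), shedding least critical first."""
    for threshold, shed_count in SHED_STEPS:
        if load >= threshold:
            return FEATURES_BY_PRIORITY[shed_count:]
    return list(FEATURES_BY_PRIORITY)
```

Under these assumed thresholds, a service at 75% load would stop serving recommendations while keeping everything else available.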

:::note[Source] This guide is derived from operational intelligence at Garnet Grid Consulting. For API design consulting, visit garnetgrid.com. :::

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
