
Caching Strategies: Serving Data at the Speed of Memory

Implement caching that reduces latency from hundreds of milliseconds to single-digit milliseconds. Covers cache placement, invalidation strategies, cache-aside vs write-through patterns, distributed caching with Redis, CDN caching, and the pitfalls that turn your cache into a source of stale data and subtle bugs.

Caching is the most powerful performance optimization available to you — and the most dangerous. A well-placed cache reduces database queries by 90%, cuts response times from 200ms to 5ms, and handles traffic spikes that would otherwise overwhelm your backend. A poorly implemented cache serves stale data, creates consistency bugs that are nearly impossible to reproduce, and adds a new failure mode to every request path.

The hard part of caching is not storing data. It is invalidating it.


Cache Placement

Where to cache (from closest to user to closest to data):

  User           CDN          App Server      Cache         Database
   │              │              │              │              │
   │── Request ──→│              │              │              │
   │              │── Cache hit? │              │              │
   │              │   Yes → 200  │              │              │
   │              │   No ────────→── Cache hit? │              │
   │              │              │   Yes → data │              │
   │              │              │   No ────────→── Query ────→│
   │              │              │              │←── Result ───│
   │              │              │←── Store+Return             │
   │←── Response ─│←── store ───│              │              │

  Latency at each layer:
  CDN cache hit:     1-20ms    (edge, near user)
  App cache hit:     1-5ms     (Redis/Memcached, in datacenter)
  Database query:    10-500ms  (disk I/O, query execution)
  Cache Location              Latency    Best For
  Browser cache               0ms        Static assets, user-specific data
  CDN                         1-20ms     Static content, public API responses
  Application cache (Redis)   1-5ms      Database query results, computed values
  Database cache              5-20ms     Query plan cache, buffer pool

Caching Patterns

Cache-Aside (Lazy Loading)

# Application manages cache reads and writes
def get_user(user_id: str) -> User:
    # 1. Check cache first
    cached = redis.get(f"user:{user_id}")
    if cached:
        return User.from_json(cached)

    # 2. Cache miss: query database
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # 3. Store in cache for next time
    redis.setex(f"user:{user_id}", 3600, user.to_json())  # TTL: 1 hour

    return user

def update_user(user_id: str, data: dict):
    # Update database
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)

    # Invalidate cache (delete, not update)
    redis.delete(f"user:{user_id}")
    # Next read will populate cache with fresh data

Write-Through

# Cache is always updated synchronously with database
def update_user(user_id: str, data: dict):
    # Update database
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)

    # Update cache immediately
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    redis.setex(f"user:{user_id}", 3600, user.to_json())

    # Cache is always consistent with database
    # Higher write latency (cache + DB on every write)

Pattern Comparison

  Pattern         Read Speed                Write Speed             Consistency   Complexity
  Cache-aside     Fast (after first miss)   Fast (invalidate only)  Eventual      Low
  Write-through   Fast                      Slower (cache + DB)     Strong        Medium
  Write-behind    Fast                      Fast (async write)      Eventual      High
  Read-through    Fast                      N/A                     Eventual      Medium
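Write-behind appears in the table but has no example above. A minimal sketch of the idea, using an in-process queue and a background worker; `redis` and `db` are passed in explicitly here (rather than assumed as globals like the earlier examples) so the sketch is self-contained, and the names are illustrative:

```python
import json
import queue

# Pending database writes; the caller never waits on these
write_queue = queue.Queue()

def update_user_write_behind(redis, user_id: str, data: dict) -> None:
    # 1. Update the cache immediately, so reads see fresh data
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))
    # 2. Enqueue the database write and return without waiting for it
    write_queue.put((user_id, data))

def db_writer(db) -> None:
    # Background worker: drains the queue and persists writes.
    # If the process dies before a flush, queued writes are lost --
    # that durability gap is why the table calls this "Eventual".
    while True:
        user_id, data = write_queue.get()
        db.execute("UPDATE users SET ... WHERE id = %s", user_id)
        write_queue.task_done()
```

In production this queue is usually a durable broker (e.g. a Kafka topic or Redis stream), not process memory, precisely because of that durability gap.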

Cache Invalidation

The two hard problems in computer science:
  1. Cache invalidation
  2. Naming things
  3. Off-by-one errors

Invalidation strategies:

  TTL (Time-to-Live):
    "Data is valid for 1 hour, then stale"
    Simple, but data can be stale for up to TTL duration

  Event-driven:
    "When data changes, delete the cache entry"
    Accurate, but requires events for every write path

  Versioned keys:
    "Cache key includes version: user:123:v5"
    New version = new key = automatic invalidation

  Purge on deploy:
    "Flush relevant caches when deploying new code"
    Useful for config/schema cache, not for data

Redis Best Practices

# Key naming convention
# {entity}:{id}:{optional_field}
"user:12345"                  # Full user object
"user:12345:preferences"      # Just preferences
"product:sku_abc:price"       # Product price
"session:token_xyz"           # Session data
"rate_limit:user:12345"       # Rate limiting counter

# TTL guidelines:
# User profile:     1 hour    (changes infrequently)
# Product listing:  15 min    (prices may change)
# Session data:     24 hours  (or until logout)
# Rate limit:       1 minute  (sliding window)
# Feature flags:    5 minutes (allow quick changes)
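One caveat on fixed TTLs: keys written in the same burst (say, after a deploy or a cache flush) will all expire in the same instant and hit the database together. A common mitigation, sketched here, is to add random jitter to each TTL; the ±10% spread is an arbitrary choice, not a rule from this guide:

```python
import random

def ttl_with_jitter(base_ttl: int, spread: float = 0.10) -> int:
    # Randomize the TTL by up to +/- spread so keys written together
    # don't all expire simultaneously and stampede the database
    jitter = random.uniform(-spread, spread)
    return max(1, int(base_ttl * (1 + jitter)))

# Usage with the guidelines above, e.g. a user profile:
# redis.setex(f"user:{user_id}", ttl_with_jitter(3600), user.to_json())
```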

Cache Stampede Prevention

# Problem: cache expires → 100 concurrent requests all query DB simultaneously
# Solution: lock-based refresh

import time

def get_user_safe(user_id: str, max_retries: int = 20) -> User:
    for _ in range(max_retries):
        cached = redis.get(f"user:{user_id}")
        if cached:
            return User.from_json(cached)

        # Try to acquire lock (only one request refreshes the cache);
        # ex=10 means a crashed holder releases the lock within 10s
        lock = redis.set(f"lock:user:{user_id}", "1", nx=True, ex=10)
        if lock:
            # Won the lock: refresh the cache, then release the lock
            user = db.query("SELECT * FROM users WHERE id = %s", user_id)
            redis.setex(f"user:{user_id}", 3600, user.to_json())
            redis.delete(f"lock:user:{user_id}")
            return user

        # Another request is refreshing: wait briefly and retry
        time.sleep(0.1)

    # Retries exhausted (lock holder kept failing): fall back to the DB
    return db.query("SELECT * FROM users WHERE id = %s", user_id)

Anti-Patterns

  Anti-Pattern              Problem                                Fix
  Cache everything          Memory bloat, stale data               Cache hot data only (80/20 rule)
  No TTL                    Data never expires, grows forever      Always set TTL, even if long
  Update cache on write     Race conditions between write + cache  Delete cache on write (cache-aside)
  Cache without monitoring  Silent failures, stale data            Track hit rate, miss rate, evictions
  Caching error responses   Error cached → repeated failures       Never cache 5xx or error states

Implementation Checklist

  • Identify the top 5 slowest queries and cache their results
  • Use cache-aside pattern as default (read cache, miss → query → store)
  • Invalidate by deleting cache keys on write, not by updating
  • Set TTL on every cache entry — never cache without expiration
  • Use consistent key naming: {entity}:{id}:{field}
  • Prevent cache stampede with distributed locks on refresh
  • Monitor cache hit rate (target > 90% for hot data)
  • Never cache error responses or null results
  • Use CDN caching for static assets and public API responses
  • Load test with cold cache to ensure the application handles cache misses gracefully
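For the monitoring item in the checklist, Redis itself exposes cumulative `keyspace_hits` and `keyspace_misses` counters through `INFO stats`. A minimal hit-rate check, assuming a `redis-py`-style client whose `info()` returns that section as a dict:

```python
def cache_hit_rate(redis_client) -> float:
    # INFO stats returns cumulative counters since the server started,
    # so this is a lifetime rate; sample and diff the counters over a
    # window if you want a recent rate instead
    stats = redis_client.info("stats")
    hits = stats.get("keyspace_hits", 0)
    misses = stats.get("keyspace_misses", 0)
    total = hits + misses
    return hits / total if total else 0.0

# Alert when the hot-data hit rate drops below the 90% target:
# if cache_hit_rate(client) < 0.90: page_the_oncall()
```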

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
