Caching Strategies

Cache Aside vs Write-Through vs Write-Behind

| Strategy | How It Works | Pros | Cons | When to Use |
|---|---|---|---|---|
| Cache Aside (Lazy Loading) | App checks cache first; on miss, loads from DB and populates cache | Simple; cache contains only requested data; resilient to cache failures | Initial request is slow (cache miss); stale data possible | Read-heavy with unpredictable access patterns (user profiles, product pages) |
| Write-Through | App writes to cache and DB synchronously | Cache always consistent with DB; read hits are fast | Write latency (two writes); wasted cache space for unread data | Read-heavy with critical consistency (inventory, pricing) |
| Write-Behind (Write-Back) | App writes to cache; cache asynchronously writes to DB | Fast writes; reduced DB load; batch writes possible | Risk of data loss if cache fails; eventual consistency | Write-heavy with acceptable data loss risk (analytics events, logs) |
| Refresh-Ahead | Cache proactively refreshes before expiration | Reduced latency; no cache miss penalty | Wasted resources if data not accessed; complex to implement | Predictable access patterns (homepage, trending content) |

Why Choose Each:

  • Cache Aside: Most flexible and common pattern; works when you don't know what to cache upfront
  • Write-Through: Need strong consistency between cache and DB
  • Write-Behind: Extremely high write throughput needed, can tolerate data loss
  • Refresh-Ahead: Predictable access patterns, zero cache miss tolerance

Tradeoff Summary:

  • Cache Aside: Flexibility + Resilience ↔ Cache misses, stale data
  • Write-Through: Consistency ↔ Write latency
  • Write-Behind: Write performance ↔ Data loss risk, complexity

Caching Strategies Deep Dive

Cache Aside (Lazy Loading) Pattern

GET request → Check Cache → Hit → Return
                         ↓
                       Miss → Query DB → Store in Cache → Return

Pros:

  • Simplest to implement; no special cache setup
  • Cache only contains accessed data (no waste)
  • Resilient: DB failures don't block (stale cache still works)
  • Great for unpredictable access patterns

Cons:

  • First request always slow (cache miss)
  • Stale data possible (no cache coherency)
  • "Cache Stampede": many concurrent requests for the same missing key → many simultaneous DB hits

Best for: User profiles, product pages, search results

Implementation (Redis + Python):

def get_user(user_id):
    # Check cache
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    
    # Cache miss: load from DB (parameterized query avoids SQL injection)
    user = db.query("SELECT * FROM users WHERE id = ?", user_id)
    
    # Store in cache (1-hour TTL)
    redis.setex(f"user:{user_id}", 3600, json.dumps(user))
    
    return user

Write-Through Pattern

SET request → Write to Cache → Write to DB → Return
              (synchronous)

Pros:

  • Cache always consistent with DB
  • No stale data
  • Read hits are ultra-fast (cache warmed)

Cons:

  • Write latency doubled (two writes sequentially)
  • Cache fills with unread data (memory waste)
  • If the cache is unavailable, writes fail unless the app falls back to DB-only writes

Best for: Inventory, pricing, financial data (consistency > latency)

Implementation (Redis + Python):

def update_product_price(product_id, new_price):
    # Write to cache first
    redis.set(f"product:{product_id}:price", new_price)
    
    # Then write to DB (parameterized query avoids SQL injection)
    db.execute("UPDATE products SET price = ? WHERE id = ?", (new_price, product_id))
    
    return {"status": "success"}

Tradeoff: Users wait ~10-50ms extra per write, but reads are instant.


Write-Behind (Write-Back) Pattern

SET request → Write to Cache → Return (asynchronously write to DB)
                              ↓
                     Background job syncs to DB

Pros:

  • Extremely fast writes (cache write only)
  • Reduced DB load (batch writes, deduplication)
  • Can reorder/optimize writes

Cons:

  • Cache failure = data loss
  • Consistency lag (DB lags cache by seconds/minutes)
  • Requires reliable queue/persistence (RDB, AOF)

Best for: Analytics events, logs, non-critical metrics

Implementation (Redis + async job):

# Fast write to cache
def log_event(user_id, event_type):
    redis.lpush(f"events:{user_id}", json.dumps({
        "type": event_type,
        "timestamp": time.time()
    }))
    return {"status": "queued"}

# Background worker (runs every 10 seconds)
def flush_events_to_db():
    for key in redis.keys("events:*"):  # full key names, e.g. b"events:123"
        events = [json.loads(e) for e in redis.lrange(key, 0, -1)]
        db.batch_insert("events", events)
        redis.delete(key)  # note: events pushed after the lrange are lost

Risk: If Redis crashes before flush, events are lost. Use RDB/AOF to mitigate.


Refresh-Ahead Pattern

User requests data → Cache always has fresh copy (proactively refreshed)
                   ↓
            Background job refreshes before expiry

Pros:

  • Zero cache miss latency (always fresh data)
  • Predictable performance
  • Ideal for high-traffic pages

Cons:

  • Wasted computation if data not accessed
  • Complex implementation (need scheduler)
  • Hard to predict optimal refresh interval

Best for: Homepage, trending content, dashboards

Implementation:

# Refresh-Ahead for trending articles
def refresh_trending_articles():
    articles = db.query("SELECT * FROM articles ORDER BY views DESC LIMIT 10")
    redis.setex("trending:articles", 300, json.dumps(articles))  # 5-minute TTL

# Schedule every 270 seconds (refresh before the 5-minute expiry)
schedule.every(270).seconds.do(refresh_trending_articles)

# On read: always fast
def get_trending():
    return redis.get("trending:articles")  # hit as long as the refresh job keeps running

Cache Invalidation Strategies

TTL-Based (Simple)

SET key value EX 3600  # Expires in 1 hour

Pro: Simple, no coordination
Con: Stale data for up to the TTL duration

Event-Based (Accurate)

On data update:
  1. Update database
  2. Publish "user:123:updated" event
  3. Cache subscribers delete key

Pro: Instant invalidation, no stale data
Con: Complex, requires pub/sub
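
A minimal sketch of the event-based flow above, with the clients passed in explicitly. The channel name "invalidations" and the function names are assumptions for illustration, not part of the original:

```python
# Writer side: update the database first, then publish an invalidation
# event naming the cache key that is now stale.
def update_user_name(db, cache, user_id, new_name):
    db.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
    cache.publish("invalidations", f"user:{user_id}")

# Subscriber side: each cache node feeds incoming messages here and
# deletes whatever key the message names.
def handle_invalidation(cache, message):
    if message.get("type") != "message":
        return  # ignore subscribe/unsubscribe confirmations
    key = message["data"]
    if isinstance(key, bytes):
        key = key.decode()
    cache.delete(key)
```

With redis-py, a worker would call `pubsub = r.pubsub()`, `pubsub.subscribe("invalidations")`, and pass each message from `pubsub.listen()` into `handle_invalidation`.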

Hybrid (Best Practice)

TTL + Event-based:
  1. Set 1-hour TTL (safety net)
  2. On write, publish invalidation event (immediate)
  3. If event missed, TTL catches it eventually
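
The hybrid write path above could look like this (the `db`/`cache` handles, key format, and channel name are assumptions):

```python
import json

SAFETY_TTL = 3600  # 1-hour TTL safety net: catches any missed invalidation event

def update_user_hybrid(db, cache, user_id, user):
    # 1. Update the database (source of truth)
    db.execute("UPDATE users SET data = ? WHERE id = ?",
               (json.dumps(user), user_id))
    key = f"user:{user_id}"
    # 2. Publish the invalidation event so other nodes drop their copy immediately
    cache.publish("invalidations", key)
    # 3. Re-populate with the TTL attached; if a node misses the event,
    #    its stale copy still expires within SAFETY_TTL
    cache.setex(key, SAFETY_TTL, json.dumps(user))
```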

Interview Questions & Answers

Q1: Design a caching strategy for Uber's surge pricing. What pattern would you use?

Answer: Use Write-Through because:

  • Consistency critical: incorrect pricing loses money
  • Write frequency: moderate (updated every 5-10 minutes by algorithm)
  • Read frequency: extremely high (every ride request)

Implementation:

Surge pricing update:
  1. Algorithm calculates new price_multiplier for zone
  2. Write to cache: SET surge:zone:123 1.5x
  3. Write to database: UPDATE surge_pricing SET multiplier=1.5
  4. Return confirmation

Rider requests ride:
  1. GET surge:zone:123 → Returns 1.5x instantly (cache hit)
  2. Zero latency, always consistent with DB

Why not Write-Behind? If cache loses surge pricing, system would quote wrong prices, losing revenue.
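
The surge flow above as a minimal Python sketch; the zone key format and the `db`/`cache` handles are assumptions:

```python
def update_surge(db, cache, zone_id, multiplier):
    # Write-through: cache and DB are written synchronously, in that order
    cache.set(f"surge:zone:{zone_id}", multiplier)
    db.execute("UPDATE surge_pricing SET multiplier = ? WHERE zone_id = ?",
               (multiplier, zone_id))
    return {"status": "success"}

def get_surge(cache, zone_id, default=1.0):
    # Ride-request path: a cache hit returns the multiplier instantly;
    # fall back to no surge if the zone has never been priced
    value = cache.get(f"surge:zone:{zone_id}")
    return float(value) if value is not None else default
```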


Q2: You're designing a Twitter-like feed cache. The feed is read-heavy but updates aren't critical. Which pattern?

Answer: Use Cache Aside because:

  • Unpredictable access: users request different feeds (friends vary)
  • Stale data acceptable: tweets 5 minutes old are fine
  • Simple implementation: no write coordination needed

Implementation:

def get_home_feed(user_id):
    cache_key = f"feed:{user_id}"
    
    # Check cache first
    feed = redis.get(cache_key)
    if feed:
        return json.loads(feed)
    
    # Cache miss: query DB
    feed = db.query("""
        SELECT * FROM tweets
        WHERE author_id IN (SELECT following_id FROM follows WHERE follower_id = ?)
        ORDER BY created_at DESC LIMIT 50
    """, user_id)
    
    # Store for 10 minutes
    redis.setex(cache_key, 600, json.dumps(feed))
    return feed

Cache Stampede Protection:

def get_feed_with_lock(user_id):
    cache_key = f"feed:{user_id}"
    feed = redis.get(cache_key)
    if feed:
        return json.loads(feed)
    
    # Use a lock so only one request queries the DB
    lock_key = f"feed:{user_id}:lock"
    if redis.set(lock_key, "1", nx=True, ex=5):
        # Got the lock: fetch from DB and populate the cache
        feed = json.dumps(fetch_from_db(user_id))
        redis.setex(cache_key, 600, feed)
    else:
        # Wait briefly for the lock holder to populate the cache
        time.sleep(0.1)
        feed = redis.get(cache_key)
        if feed is None:
            return fetch_from_db(user_id)  # lock holder not done yet
    
    return json.loads(feed)

Q3: Your system writes 1 million events/second. Which caching strategy handles this?

Answer: Use Write-Behind because:

  • Write latency matters: at 1M events/sec you can't afford a synchronous DB write per event (as Write-Through requires)
  • Eventual consistency acceptable: logs/events can lag
  • Batch efficiency: Collect 100K events, write once = 10x throughput

Architecture:

Events → Redis queue → Batch processor → PostgreSQL
  ↓
  1M writes/sec (Cache side only, instant)
  ↓
  Background job every 100ms
  ↓
  10K-100K events/batch → PostgreSQL (optimized bulk insert)

Implementation:

# Fast write (cache only)
def log_event(event):
    redis.lpush("events:pending", json.dumps(event))
    # Optional: if queue > threshold, trigger flush
    if redis.llen("events:pending") > 50000:
        flush_to_db()

# Batch flush (runs every 100ms)
def flush_to_db():
    raw = redis.lrange("events:pending", 0, -1)
    if not raw:
        return
    
    events = [json.loads(e) for e in raw]
    
    # Batch insert
    db.executemany("""
        INSERT INTO events (user_id, type, timestamp)
        VALUES (?, ?, ?)
    """, [(e['user_id'], e['type'], e['ts']) for e in events])
    
    # Events pushed between lrange and delete are lost (see tradeoff note)
    redis.delete("events:pending")

Tradeoff: If Redis crashes, lose ~100ms of events. Mitigate with RDB persistence (fsync every 100ms).


Q4: When would you NOT use caching? When is it counterproductive?

Answer: Don't cache when:

  1. Data changes constantly (> 50% update:read ratio)
     - Example: Real-time stock prices, live auction bids
     - Cache becomes stale immediately
     - Better: Stream data directly

  2. Data is tiny (< 1KB)
     - Example: Single boolean flags
     - Cache overhead > benefit
     - Better: Store in application state or database

  3. First access is critical
     - Example: Medical records (must be current)
     - Cache lag is dangerous
     - Better: No cache, strong consistency

  4. Memory is constrained
     - Example: IoT devices with 512MB RAM
     - Cache eviction thrashing
     - Better: Optimize database queries instead

  5. Data has complex dependencies
     - Example: Calculated fields depending on 5+ tables
     - Invalidation is complex and error-prone
     - Better: Compute on-demand or use materialized views

Example scenario:

User updates profile name → cascades to:
- Comment author names
- Post author names  
- Mention notifications
- Search indexes

Caching this = invalidation nightmare.
Better: Store only in DB, compute names on-demand

Q5: How would you choose a TTL (time-to-live) for cached data?

Answer: Balance two forces: the TTL must not exceed the staleness your use case can tolerate, while expensive-to-recompute data earns a longer TTL. Start from acceptable staleness and lengthen only as far as the recomputation cost justifies.

Guidelines by use case:

| Use Case | TTL | Reason |
|---|---|---|
| Homepage | 1-5 minutes | Content changes frequently, load is predictable |
| User profile | 10-30 minutes | Rarely changes, acceptable to be 10min stale |
| Product catalog | 1 hour | Changes during business hours, not real-time |
| Leaderboard | 5 seconds | Competitions require near-real-time accuracy |
| Static assets | 24 hours | Never change after upload |
| Auth tokens | Token lifetime | Security critical, don't extend arbitrarily |
| DB query result | Variable | Depends on data freshness requirement |

Dynamic TTL example:

def get_product(product_id):
    if is_flash_sale_active():
        ttl = 10  # 10 seconds (prices changing fast)
    else:
        ttl = 3600  # 1 hour (normal case)
    
    cache_key = f"product:{product_id}"
    cached = redis.get(cache_key)
    if cached:
        return json.loads(cached)
    
    product = db.query("SELECT * FROM products WHERE id = ?", product_id)
    redis.setex(cache_key, ttl, json.dumps(product))
    return product

Rule of thumb:

  • If data updates hourly → TTL = 15 minutes (data is at most 15 minutes stale)
  • If data updates daily → TTL = 4 hours
  • If data updates on-demand → TTL = 0 (cache-aside with event invalidation)