2025-12-19

Caching Strategies: From Local Memory to Distributed Systems

A comprehensive guide to implementing caching strategies across multiple tiers, from in-memory application caches to distributed Redis clusters and CDN edge caching. Learn when to use cache-aside vs write-through patterns, how to choose between ElastiCache and MemoryDB, and how to prevent cache stampede in production.

Effective caching is a multi-level problem: the fastest layer is an in-process LRU, the next is a remote cache (Redis or Memcached), then a CDN at the edge. Each layer has different invalidation semantics, consistency guarantees, and failure modes. A Redis cluster with a 15% hit rate is not the wrong tool for the job; it is doing the wrong work at the wrong level. A thundering herd on an expiring popular key is not a cache problem either; it is a stampede-protection problem that any single-layer cache will have.

This guide covers the technical decisions behind a working cache strategy: the multi-level hierarchy (in-process, remote, CDN), cache-aside versus write-through, the choice between ElastiCache and MemoryDB, consistent hashing for distributed scaling, and the anti-patterns (cache stampede, also called thundering herd, and stale invalidation) that turn caching into a net loss.

Understanding Cache Patterns

Cache patterns aren’t just academic concepts. The difference between cache-aside and write-through can determine whether you get stale data complaints or slow write performance. Here’s what each pattern actually does in production.

Cache-Aside (Lazy Loading)

The application manages both cache and database directly. On read, check cache first. On miss, fetch from database and populate cache. This is the most common pattern because it’s simple and efficient.

class UserRepository {
  private redis: Redis;
  private db: Database;

  async getUser(id: string): Promise<User> {
    // Check cache first
    const cached = await this.redis.get(`user:${id}`);
    if (cached) {
      return JSON.parse(cached);
    }

    // Cache miss - fetch from database
    const user = await this.db.users.findById(id);

    // Store in cache with TTL
    await this.redis.set(
      `user:${id}`,
      JSON.stringify(user),
      'EX',
      3600 // 1 hour
    );

    return user;
  }
}

When to use cache-aside:

  • Read-heavy workloads where not all data is accessed frequently
  • Data that can tolerate slight staleness
  • You want to cache only what’s actually used

Trade-offs:

  • Initial request experiences cache miss latency
  • Risk of cache stampede on popular expired keys (we’ll fix this)
  • Efficient memory usage since only accessed data is cached
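
The read path above says nothing about writes, which is where the staleness comes from. A minimal sketch of cache-aside with invalidation on write follows, using in-memory stand-ins so it runs without Redis or a database. `KeyValueCache`, `MemoryCache`, and `CacheAsideUsers` are hypothetical names for illustration, not part of any library.

```typescript
interface User {
  id: string;
  name: string;
}

// Minimal cache contract; a Redis client could satisfy it with GET, SET EX, and DEL.
interface KeyValueCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  del(key: string): Promise<void>;
}

// In-memory stand-in so the example runs without a Redis server.
class MemoryCache implements KeyValueCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= Date.now()) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }

  async del(key: string): Promise<void> {
    this.entries.delete(key);
  }
}

class CacheAsideUsers {
  dbReads = 0; // counts trips to the "database" for the demo

  constructor(
    private cache: KeyValueCache,
    private db: Map<string, User> // stand-in for the real database
  ) {}

  async getUser(id: string): Promise<User | null> {
    const cached = await this.cache.get(`user:${id}`);
    if (cached !== null) return JSON.parse(cached) as User;

    this.dbReads++;
    const user = this.db.get(id) ?? null;
    // Only cache rows that exist; caching misses is a separate decision.
    if (user) await this.cache.set(`user:${id}`, JSON.stringify(user), 3600);
    return user;
  }

  async updateUser(id: string, patch: Partial<User>): Promise<User | null> {
    const current = this.db.get(id);
    if (!current) return null;
    const updated = { ...current, ...patch };
    this.db.set(id, updated);
    // Invalidate rather than overwrite: the next read repopulates the key.
    await this.cache.del(`user:${id}`);
    return updated;
  }
}
```

Deleting the key on write keeps the staleness window to the gap between the database write and the delete, at the cost of one extra miss after each update.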

Write-Through Pattern

Every write goes to both cache and database. The cache stays synchronized with the database, and readers always get fresh data from cache.

class UserRepository {
  async updateUser(id: string, data: Partial<User>): Promise<User> {
    // Update database first
    const user = await this.db.users.update(id, data);

    // Immediately update cache
    await this.redis.set(
      `user:${id}`,
      JSON.stringify(user),
      'EX',
      3600
    );

    return user;
  }

  async getUser(id: string): Promise<User> {
    // Check cache (should always be there for recently updated users)
    const cached = await this.redis.get(`user:${id}`);
    if (cached) {
      return JSON.parse(cached);
    }

    // Fallback to cache-aside for cache miss
    const user = await this.db.users.findById(id);
    await this.redis.set(`user:${id}`, JSON.stringify(user), 'EX', 3600);
    return user;
  }
}

When to use write-through:

  • Strong consistency requirements between cache and database
  • Write operations are frequent
  • Read-heavy workloads benefit from always-fresh cache

Trade-offs:

  • Write latency increases (must update both cache and database)
  • Caches data that may never be read
  • Higher cache hit rates since cache is always populated

Write-Behind (Write-Back) Pattern

Writes go to cache immediately, then are asynchronously written to database. This provides excellent write performance but introduces complexity and potential data loss risk.

class AnalyticsRepository {
  async trackEvent(event: Event): Promise<void> {
    // Append to the tail of the queue (fast response)
    await this.redis.rpush(
      'analytics:queue',
      JSON.stringify(event)
    );

    // Background worker processes queue asynchronously
  }

  // Separate background worker (assumes a single consumer)
  async processQueue(): Promise<void> {
    while (true) {
      // Read up to 100 of the oldest events from the head of the queue
      const events = await this.redis.lrange('analytics:queue', 0, 99);

      if (events.length > 0) {
        // Batch insert to database
        await this.db.analytics.batchInsert(
          events.map(e => JSON.parse(e))
        );

        // Remove exactly the events we read; anything pushed in the
        // meantime sits after them and stays in the queue
        await this.redis.ltrim('analytics:queue', events.length, -1);
      }
      }

      await new Promise(resolve => setTimeout(resolve, 1000));
    }
  }
}

When to use write-behind:

  • Write-heavy workloads (analytics, logs, metrics)
  • Can tolerate potential data loss on cache failure
  • Database write performance is a bottleneck

Trade-offs:

  • Risk of data loss if cache fails before persistence
  • More complex implementation and monitoring
  • Excellent write performance through batching

Preventing Cache Stampede

Cache stampede (thundering herd) happens when a popular cache key expires and hundreds or thousands of requests simultaneously try to regenerate it. Your database connection pool gets exhausted and everything cascades.

Here’s how to prevent it:

Probabilistic Early Expiration

Instead of waiting for cache to expire, refresh it probabilistically before expiration based on remaining TTL. This spreads out the refresh load.

async function getWithProbabilisticRefresh<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttl: number,
  beta: number = 1.0
): Promise<T> {
  const result = await redis.get(key);

  if (result) {
    const data = JSON.parse(result);
    const now = Date.now();
    const timeUntilExpiry = (data.expiresAt - now) / 1000;

    // Probabilistic early refresh
    // As expiry approaches, probability of refresh increases
    const shouldRefresh =
      timeUntilExpiry / ttl < Math.random() * beta;

    if (shouldRefresh) {
      // Refresh in background without blocking. backgroundRefresh is
      // assumed to exist elsewhere: it re-fetches and rewrites the key.
      void backgroundRefresh(key, fetcher, ttl);
    }

    return data.value;
  }

  // Cache miss - use lock to prevent stampede
  return getWithLock(key, fetcher, ttl);
}
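
To see what the rule `timeUntilExpiry / ttl < Math.random() * beta` implies, it helps to compute the per-request refresh probability directly. The sketch below is only arithmetic on that rule; `refreshProbability` is a hypothetical helper name.

```typescript
// Probability that a single request triggers an early refresh under the rule
//   timeUntilExpiry / ttl < Math.random() * beta
// Math.random() * beta is uniform on [0, beta), so the chance is
// 1 - (remaining fraction / beta), clamped to the range [0, 1].
function refreshProbability(
  timeUntilExpirySeconds: number,
  ttlSeconds: number,
  beta: number = 1.0
): number {
  const remainingFraction = timeUntilExpirySeconds / ttlSeconds;
  if (beta <= 0) {
    // With beta at or below zero the right-hand side is never positive,
    // so only an already-expired entry satisfies the rule.
    return remainingFraction < 0 ? 1 : 0;
  }
  const p = 1 - remainingFraction / beta;
  return Math.min(1, Math.max(0, p));
}

// Print the curve for a 1-hour TTL at a few points, for two beta values.
for (const remaining of [3600, 2700, 1800, 360, 0]) {
  console.log(
    remaining,
    refreshProbability(remaining, 3600, 1.0),
    refreshProbability(remaining, 3600, 0.2)
  );
}
```

With beta = 1.0, an entry at half its TTL is refreshed by about half of the requests that read it, which on a hot key means far more refreshes than the TTL suggests. A smaller beta confines early refreshes to the last beta fraction of the TTL, so beta is worth tuning against real traffic.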

Distributed Locking

When cache misses, use Redis to coordinate who regenerates the data. Other requests wait briefly and retry.

async function getWithLock<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttl: number
): Promise<T> {
  const lockKey = `lock:${key}`;

  // Try to acquire lock (10 second timeout)
  const lockAcquired = await redis.set(
    lockKey,
    '1',
    'NX', // Only set if not exists
    'EX',
    10
  );

  if (lockAcquired) {
    try {
      // We got the lock - fetch data
      const value = await fetcher();

      const data = {
        value,
        expiresAt: Date.now() + ttl * 1000,
      };

      await redis.set(
        key,
        JSON.stringify(data),
        'EX',
        ttl
      );

      return value;
    } finally {
      // Always release lock
      await redis.del(lockKey);
    }
  } else {
    // Another request is fetching - wait and retry
    await new Promise(resolve => setTimeout(resolve, 100));
    return getWithProbabilisticRefresh(key, fetcher, ttl);
  }
}
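
One gap in the lock above: if the fetch takes longer than the 10 second lock timeout, the lock expires, another request acquires it, and the first request's `del` then removes a lock it no longer owns. A common remedy is to store a unique token in the lock and release only if the token still matches. The sketch below shows that idea with an in-memory stand-in; `LockStore`, `MemoryLockStore`, `newToken`, and `withLock` are hypothetical names.

```typescript
// Hypothetical contract. With Redis, acquire maps to
//   SET key token NX PX ttlMs
// and release maps to an atomic compare-and-delete, typically a Lua script:
//   if redis.call("get", KEYS[1]) == ARGV[1] then
//     return redis.call("del", KEYS[1])
//   else
//     return 0
//   end
interface LockStore {
  acquire(key: string, token: string, ttlMs: number): Promise<boolean>;
  release(key: string, token: string): Promise<boolean>;
}

// In-memory stand-in with the same semantics, so the sketch runs locally.
class MemoryLockStore implements LockStore {
  private locks = new Map<string, { token: string; expiresAt: number }>();

  async acquire(key: string, token: string, ttlMs: number): Promise<boolean> {
    const held = this.locks.get(key);
    if (held && held.expiresAt > Date.now()) return false; // still held
    this.locks.set(key, { token, expiresAt: Date.now() + ttlMs });
    return true;
  }

  async release(key: string, token: string): Promise<boolean> {
    const held = this.locks.get(key);
    if (!held || held.token !== token) return false; // not ours: leave it alone
    this.locks.delete(key);
    return true;
  }
}

function newToken(): string {
  // Good enough for a demo; prefer a cryptographically random value in production.
  return `${Date.now()}-${Math.random().toString(36).slice(2)}`;
}

// Runs `work` only if the lock was acquired; returns undefined otherwise.
async function withLock<T>(
  store: LockStore,
  key: string,
  ttlMs: number,
  work: () => Promise<T>
): Promise<T | undefined> {
  const token = newToken();
  if (!(await store.acquire(key, token, ttlMs))) return undefined;
  try {
    return await work();
  } finally {
    // No-op if the lock expired and someone else re-acquired it.
    await store.release(key, token);
  }
}
```

Callers that get `undefined` back would wait and re-read the cache, as the retry branch above does.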

Request Coalescing

Deduplicate identical in-flight requests at the application level. If 100 requests come in for the same cache key, only one actually fetches data.

class CacheManager {
  private inflightRequests = new Map<string, Promise<any>>();

  async get<T>(
    key: string,
    fetcher: () => Promise<T>
  ): Promise<T> {
    // Check cache first
    const cached = await redis.get(key);
    if (cached) return JSON.parse(cached);

    // Check if request is already in flight
    const existing = this.inflightRequests.get(key);
    if (existing) {
      // Piggyback on existing request
      return existing;
    }

    // Create new request
    const promise = fetcher()
      .then(async value => {
        await redis.set(
          key,
          JSON.stringify(value),
          'EX',
          300
        );
        this.inflightRequests.delete(key);
        return value;
      })
      .catch(error => {
        this.inflightRequests.delete(key);
        throw error;
      });

    this.inflightRequests.set(key, promise);
    return promise;
  }
}

AWS Caching Services: When to Use What

AWS offers ElastiCache, MemoryDB, and DAX. They’re not interchangeable - each serves different use cases.

ElastiCache for Redis

Best for:

  • Session management across multiple application servers
  • General-purpose caching layer (cache-aside pattern)
  • Pub/sub messaging patterns
  • Leaderboards, rate limiting, real-time analytics

Technical specs:

  • Latency: Sub-millisecond
  • Persistence: Optional snapshots (not real-time)
  • Consistency: Eventual
  • Pricing: ~$0.206/hour for cache.r6g.large (13.07 GB) = ~$150/month per node

import Redis from 'ioredis';

const redis = new Redis.Cluster(
  [
    {
      host: 'redis-cluster.xxx.cache.amazonaws.com',
      port: 6379,
    },
  ],
  {
    redisOptions: {
      password: process.env.REDIS_PASSWORD,
      tls: {},
    },
    clusterRetryStrategy: times =>
      Math.min(100 * times, 3000),
    enableReadyCheck: true,
    maxRetriesPerRequest: 3,
  }
);

MemoryDB for Redis

Best for:

  • Primary database for microservices (not just cache)
  • Real-time analytics requiring durability
  • Mission-critical applications needing Redis speed plus durable, strongly consistent writes
  • Financial transactions, inventory management

Technical specs:

  • Latency: Sub-millisecond reads, single-digit millisecond writes
  • Persistence: Full durable persistence via transaction log
  • Consistency: Strong (synchronous replication)
  • Multi-AZ: Automatic failover with zero data loss
  • Pricing: ~$0.406/hour for db.r6g.large = ~$293/month (roughly 2x ElastiCache at these rates)

When to choose MemoryDB over ElastiCache:

  • Need Redis as primary database (not just cache)
  • Cannot tolerate any data loss
  • Require strong consistency guarantees
  • Want to eliminate separate database + cache architecture

DynamoDB Accelerator (DAX)

Best for:

  • DynamoDB-specific acceleration only
  • Read-heavy DynamoDB workloads (gaming leaderboards)
  • Eventually consistent reads acceptable
  • Need microsecond latency at scale

Technical specs:

  • Latency: Microseconds for cached reads
  • Integration: Native DynamoDB API compatibility
  • Consistency: Only eventually consistent reads are served from the cache
  • Pricing: ~$0.40/hour for dax.r4.large

Important limitations:

  • Only works with DynamoDB (not general-purpose)
  • Query/scan cache separate from get/batch-get cache
  • Strongly consistent reads pass through to DynamoDB and are not cached
  • Cannot cache conditional updates

Decision Matrix

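  • Need caching? Work through the questions below.
  • DynamoDB only? Yes: DAX. No: decide between primary database and cache.
  • Primary database or cache? Primary DB: MemoryDB. Cache: check data-loss tolerance.
  • Data loss OK? No: MemoryDB. Yes: ElastiCache, then check Pub/Sub.
  • Need Pub/Sub? Yes: ElastiCache Redis. No: ElastiCache Redis or Memcached.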

Consistent Hashing for Distributed Caches

When you have multiple cache nodes, how do you decide which node stores which key? Simple modulo hashing (hash(key) % N) causes massive redistribution when nodes change:

  • Add server (N to N+1 nodes): ~N/(N+1) of keys move, e.g. ~75% going from 3 to 4
  • Remove server (N to N-1 nodes): ~(N-1)/N of keys move, e.g. ~75% going from 4 to 3

Consistent hashing limits redistribution to roughly 1/N of keys.
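
The cost of modulo hashing is easy to measure. The sketch below counts how many keys change nodes when the node count changes under `hash(key) % N`; for N to N+1 nodes the expected fraction is N/(N+1). `hash32` and `movedFractionModulo` are hypothetical helpers, and the hash is a simple FNV-1a with an extra mixing step, chosen only to keep the example dependency-free.

```typescript
// 32-bit string hash: FNV-1a followed by a murmur3-style finalizer for mixing.
function hash32(key: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // force unsigned
}

// Fraction of keys whose node changes when the node count goes
// from `before` to `after` under modulo placement.
function movedFractionModulo(
  keys: string[],
  before: number,
  after: number
): number {
  let moved = 0;
  for (const key of keys) {
    const h = hash32(key);
    if (h % before !== h % after) moved++;
  }
  return moved / keys.length;
}

const sampleKeys = Array.from({ length: 10000 }, (_, i) => `user:${i}`);

// Growing from 3 to 4 nodes: expected to be close to 3/4.
console.log(movedFractionModulo(sampleKeys, 3, 4));
// Growing from 9 to 10 nodes: expected to be close to 9/10.
console.log(movedFractionModulo(sampleKeys, 9, 10));
```

The larger the cluster, the worse modulo placement gets: nearly every key moves, so a resize behaves like a full cache flush.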

Implementation

import crypto from 'crypto';

class ConsistentHash {
  private ring: Map<number, string> = new Map();
  private sortedKeys: number[] = [];
  private vnodeCounts: Map<string, number> = new Map();
  private defaultVirtualNodes: number = 150;

  private hash(key: string): number {
    // First 32 bits of the MD5 digest as an unsigned integer
    return parseInt(
      crypto
        .createHash('md5')
        .update(key)
        .digest('hex')
        .substring(0, 8),
      16
    );
  }

  addServer(
    server: string,
    vnodes: number = this.defaultVirtualNodes
  ): void {
    // Create virtual nodes for even distribution; more vnodes = more keys
    for (let i = 0; i < vnodes; i++) {
      const hash = this.hash(`${server}:vnode:${i}`);
      // Skip the rare position already taken by another vnode
      if (this.ring.has(hash)) continue;
      this.ring.set(hash, server);
      this.sortedKeys.push(hash);
    }
    this.vnodeCounts.set(server, vnodes);
    this.sortedKeys.sort((a, b) => a - b);
  }

  removeServer(server: string): void {
    const vnodes = this.vnodeCounts.get(server) ?? 0;
    for (let i = 0; i < vnodes; i++) {
      const hash = this.hash(`${server}:vnode:${i}`);
      // Only remove positions this server actually owns
      if (this.ring.get(hash) === server) {
        this.ring.delete(hash);
      }
    }
    this.sortedKeys = this.sortedKeys.filter(h => this.ring.has(h));
    this.vnodeCounts.delete(server);
  }

  getServer(key: string): string | undefined {
    if (this.sortedKeys.length === 0) return undefined;

    const hash = this.hash(key);

    // Binary search for the first ring position >= hash
    let lo = 0;
    let hi = this.sortedKeys.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.sortedKeys[mid] < hash) lo = mid + 1;
      else hi = mid;
    }
    const idx = lo === this.sortedKeys.length ? 0 : lo; // Wrap around

    return this.ring.get(this.sortedKeys[idx]);
  }
}

// Usage
const hashRing = new ConsistentHash();
hashRing.addServer('cache-node-1');
hashRing.addServer('cache-node-2');
hashRing.addServer('cache-node-3');

const server = hashRing.getServer('user:12345');
// Returns one of the three node names; the same key always maps to the
// same node until the set of servers changes

Why Virtual Nodes Matter

Without virtual nodes, simple consistent hashing can create uneven distribution. Virtual nodes (vnodes) solve this:

  • Each physical node gets 100-200 virtual nodes scattered on the ring
  • More uniform data distribution
  • Smoother load balancing when adding/removing nodes
  • Can weight servers by capacity (more vnodes = more data)

// Weight by capacity: scale the default of 150 vnodes
const optimalVnodes = Math.ceil(
  150 * (serverCapacity / averageCapacity)
);

// High-capacity server gets more data
// (requires addServer to accept a vnode count as its second argument)
hashRing.addServer('high-capacity', 225); // 1.5x
hashRing.addServer('low-capacity', 75); // 0.5x
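
The effect of vnode counts on key distribution can be checked empirically. The sketch below builds a small ring and counts where synthetic keys land. It is a standalone illustration: `hash32`, `buildRing`, `lookup`, and `keyShare` are hypothetical helpers, and the hash is a dependency-free stand-in rather than the MD5 used above.

```typescript
// 32-bit string hash: FNV-1a followed by a murmur3-style finalizer for mixing.
function hash32(key: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

type RingPoint = [number, string]; // [position on ring, server name]

// Build a sorted ring; the value for each server is its vnode count.
function buildRing(vnodesByServer: Record<string, number>): RingPoint[] {
  const ring: RingPoint[] = [];
  for (const server of Object.keys(vnodesByServer)) {
    for (let i = 0; i < vnodesByServer[server]; i++) {
      ring.push([hash32(`${server}:vnode:${i}`), server]);
    }
  }
  return ring.sort((a, b) => a[0] - b[0]);
}

// First point clockwise from the key's hash, wrapping to the start.
// Assumes a non-empty ring.
function lookup(ring: RingPoint[], key: string): string {
  const h = hash32(key);
  let lo = 0;
  let hi = ring.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (ring[mid][0] < h) lo = mid + 1;
    else hi = mid;
  }
  return ring[lo === ring.length ? 0 : lo][1];
}

// Count how many of `keyCount` synthetic keys land on each server.
function keyShare(
  vnodesByServer: Record<string, number>,
  keyCount: number
): Record<string, number> {
  const ring = buildRing(vnodesByServer);
  const counts: Record<string, number> = {};
  for (const server of Object.keys(vnodesByServer)) counts[server] = 0;
  for (let i = 0; i < keyCount; i++) counts[lookup(ring, `user:${i}`)]++;
  return counts;
}

// One point per server: shares are often lopsided.
console.log(keyShare({ a: 1, b: 1, c: 1 }, 30000));
// 150 vnodes each: shares should be much closer to even.
console.log(keyShare({ a: 150, b: 150, c: 150 }, 30000));
// Weighted 225 vs 75: "big" should receive roughly three times the keys.
console.log(keyShare({ big: 225, small: 75 }, 30000));
```

Running a check like this against real key samples is a cheap way to pick a vnode count before committing to one.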

Multi-Tier Caching Architecture

Real performance comes from layering caches strategically. Here’s a practical three-tier architecture:

L1: In-Process Memory Cache

  • Size: 50-100 MB per instance
  • TTL: 30-60 seconds
  • Purpose: Ultra-fast access for hot data
  • Technology: LRU cache

L2: Distributed Redis Cache

  • Size: 10-100 GB cluster
  • TTL: 5-60 minutes
  • Purpose: Shared cache across instances
  • Technology: ElastiCache Redis cluster

L3: CDN Edge Cache

  • Size: Unlimited (CloudFront)
  • TTL: 1 hour - 1 year
  • Purpose: Global edge distribution
  • Technology: CloudFront

Implementation

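// Default export as in lru-cache v7; newer major versions export { LRUCache }
import LRU from 'lru-cache';
import Redis from 'ioredis';

class MultiTierCache {
  private l1Cache: LRU<string, any>;
  private l2Cache: Redis;

  constructor(l2Cache: Redis) {
    // The Redis client is injected so every tier is initialized
    this.l2Cache = l2Cache;
    this.l1Cache = new LRU({
      max: 500, // Max items
      maxSize: 50 * 1024 * 1024, // 50 MB
      sizeCalculation: (value) => {
        // Approximate size: characters in the JSON form
        return JSON.stringify(value).length;
      },
      ttl: 1000 * 60, // 1 minute
    });
  }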

  async get<T>(
    key: string,
    fetcher: () => Promise<T>
  ): Promise<T> {
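    // L1: Check in-memory cache (a single get avoids the gap between
    // has() and get() in which the entry could expire)
    const l1Result = this.l1Cache.get(key);
    if (l1Result !== undefined) {
      return l1Result;
    }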

    // L2: Check Redis
    const l2Result = await this.l2Cache.get(key);
    if (l2Result) {
      const value = JSON.parse(l2Result);
      // Populate L1
      this.l1Cache.set(key, value);
      return value;
    }

    // Cache miss - fetch from origin
    const value = await fetcher();

    // Populate all cache layers
    this.l1Cache.set(key, value);
    await this.l2Cache.set(
      key,
      JSON.stringify(value),
      'EX',
      3600
    );

    return value;
  }

  async invalidate(key: string): Promise<void> {
    // Invalidate all tiers
    this.l1Cache.delete(key);
    await this.l2Cache.del(key);
  }
}
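
Note that `invalidate` above clears L1 only in the process that calls it; other instances keep serving their own L1 copy until its TTL lapses. One option is to broadcast invalidations to every instance. The sketch below simulates that with an in-memory bus; with Redis, pub/sub would be a natural transport. `InvalidationBus` and `LocalCache` are hypothetical names.

```typescript
type InvalidationListener = (key: string) => void;

// Stand-in for a pub/sub channel shared by all application instances.
class InvalidationBus {
  private listeners: InvalidationListener[] = [];

  subscribe(listener: InvalidationListener): void {
    this.listeners.push(listener);
  }

  publish(key: string): void {
    this.listeners.forEach(listener => listener(key));
  }
}

// One instance's in-process cache, wired to the shared bus.
class LocalCache {
  private entries = new Map<string, unknown>();

  constructor(private bus: InvalidationBus) {
    // Drop our copy whenever any instance announces an invalidation.
    bus.subscribe(key => {
      this.entries.delete(key);
    });
  }

  get(key: string): unknown {
    return this.entries.get(key);
  }

  set(key: string, value: unknown): void {
    this.entries.set(key, value);
  }

  // Invalidate everywhere: the publisher receives its own message too.
  invalidate(key: string): void {
    this.bus.publish(key);
  }
}

const bus = new InvalidationBus();
const instanceA = new LocalCache(bus);
const instanceB = new LocalCache(bus);

instanceA.set('user:1', { name: 'Ada' });
instanceB.set('user:1', { name: 'Ada' });

instanceA.invalidate('user:1');
console.log(instanceB.get('user:1')); // undefined: B dropped its copy as well
```

Redis pub/sub does not guarantee delivery to a subscriber that is disconnected at publish time, so the short L1 TTL remains the backstop.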

CloudFront Caching Strategies

CDN caching is different from application caching. You’re distributing content globally with long TTLs, which means invalidation strategy matters.

Cache Behavior Configuration

Different content types need different cache policies:

import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as cdk from 'aws-cdk-lib';

// These definitions belong inside a Stack or Construct, where `this` is the scope

// Static assets (images, CSS, JS)
const staticBehavior = {
  pathPattern: '/static/*',
  cachePolicy: new cloudfront.CachePolicy(
    this,
    'StaticCachePolicy',
    {
      minTtl: cdk.Duration.seconds(0),
      defaultTtl: cdk.Duration.hours(24),
      maxTtl: cdk.Duration.days(365),
      enableAcceptEncodingGzip: true,
      enableAcceptEncodingBrotli: true,
      queryStringBehavior:
        cloudfront.CacheQueryStringBehavior.none(),
      headerBehavior:
        cloudfront.CacheHeaderBehavior.none(),
      cookieBehavior:
        cloudfront.CacheCookieBehavior.none(),
    }
  ),
};

// API responses (short-lived)
const apiCacheBehavior = {
  pathPattern: '/api/public/*',
  cachePolicy: new cloudfront.CachePolicy(
    this,
    'ApiCachePolicy',
    {
      minTtl: cdk.Duration.seconds(0),
      defaultTtl: cdk.Duration.seconds(60),
      maxTtl: cdk.Duration.minutes(5),
      queryStringBehavior:
        cloudfront.CacheQueryStringBehavior.all(),
      headerBehavior:
        cloudfront.CacheHeaderBehavior.allowList(
          'Authorization'
        ),
    }
  ),
};

// Dynamic content (no cache)
const dynamicBehavior = {
  pathPattern: '/api/user/*',
  cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
};

Invalidation Strategy

CloudFront invalidation costs add up ($0.005 per path after first 1,000/month). Use versioned URLs instead:

// Bad: Requires invalidation
// (cloudfront here is assumed to be an AWS SDK v2 CloudFront client,
// matching the aws-sdk import used in the monitoring code)
const assetUrl = '/static/app.js';
await cloudfront
  .createInvalidation({
    DistributionId: 'E1234567890',
    InvalidationBatch: {
      CallerReference: Date.now().toString(),
      Paths: {
        Quantity: 1,
        Items: ['/static/app.js'],
      },
    },
  })
  .promise();

// Good: Versioned URL (no invalidation needed)
const buildHash = process.env.BUILD_HASH;
const versionedAssetUrl = `/static/app.${buildHash}.js`;
// New version = new URL = automatic cache busting

Client-Side Caching with React Query

Frontend caching is often overlooked but critical for user experience. React Query (TanStack Query) provides sophisticated client-side caching with stale-while-revalidate pattern.

import {
  useQuery,
  useMutation,
  useQueryClient,
} from '@tanstack/react-query';

function UserProfile({ userId }: { userId: string }) {
  const queryClient = useQueryClient();

  // Query with caching and stale-while-revalidate
  const { data: user, isLoading } = useQuery({
    queryKey: ['user', userId],
    queryFn: () => fetchUser(userId),
    staleTime: 5 * 60 * 1000, // Fresh for 5 minutes
    gcTime: 30 * 60 * 1000, // Keep in cache for 30 minutes
    refetchOnWindowFocus: true,
    refetchOnReconnect: true,
  });

  // Mutation with optimistic updates
  const updateMutation = useMutation({
    mutationFn: (data: Partial<User>) =>
      updateUser(userId, data),

    onMutate: async newData => {
      // Cancel outgoing refetches
      await queryClient.cancelQueries({
        queryKey: ['user', userId],
      });

      // Snapshot previous value
      const previous = queryClient.getQueryData([
        'user',
        userId,
      ]);

      // Optimistically update cache
      queryClient.setQueryData(
        ['user', userId],
        (old: any) => ({
          ...old,
          ...newData,
        })
      );

      return { previous };
    },

    onError: (err, variables, context) => {
      // Rollback on error
      queryClient.setQueryData(
        ['user', userId],
        context?.previous
      );
    },

    onSettled: () => {
      // Refetch after mutation
      queryClient.invalidateQueries({
        queryKey: ['user', userId],
      });
    },
  });

  return (
    <div>
      {isLoading ? 'Loading...' : user?.name}
      <button
        onClick={() =>
          updateMutation.mutate({ name: 'New Name' })
        }
      >
        Update
      </button>
    </div>
  );
}

Prefetching for Better UX

Prefetch data before users need it for instant navigation:

function UserList() {
  const queryClient = useQueryClient();

  const { data: users } = useQuery({
    queryKey: ['users'],
    queryFn: fetchUsers,
  });

  // Prefetch on hover
  const handleUserHover = (userId: string) => {
    queryClient.prefetchQuery({
      queryKey: ['user', userId],
      queryFn: () => fetchUser(userId),
    });
  };

  return (
    <ul>
      {users?.map(user => (
        <li
          key={user.id}
          onMouseEnter={() => handleUserHover(user.id)}
        >
          <Link to={`/user/${user.id}`}>
            {user.name}
          </Link>
        </li>
      ))}
    </ul>
  );
}

Cache Monitoring and Optimization

You can’t optimize what you don’t measure. Here are the critical metrics:

Key Metrics

1. Hit Rate

class CacheMetrics {
  private hits = 0;
  private misses = 0;

  recordHit(): void {
    this.hits++;
  }

  recordMiss(): void {
    this.misses++;
  }

  getHitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : (this.hits / total) * 100;
  }
}

Target: 85-95% depending on workload

  • Below 80%: Investigate cache key design, TTL settings
  • Formula: (hits / (hits + misses)) * 100

2. Latency Percentiles

  • P50: ~1-2ms for Redis
  • P99: Should be <10ms
  • P99.9: Alert if >50ms

3. Memory Utilization

  • Target: 70-80% usage
  • Alert: >90% (risk of evictions)

4. Eviction Rate

  • High eviction = need more memory or shorter TTLs
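
The targets above translate directly into a health check. The sketch below applies those thresholds to one snapshot of metrics; `CacheSnapshot` and `evaluateCacheHealth` are hypothetical names, and how the snapshot is collected is left open.

```typescript
interface CacheSnapshot {
  hitRatePercent: number;
  p99Ms: number;
  p999Ms: number;
  memoryUsedPercent: number;
}

// Applies the thresholds listed above and returns one finding per breach.
function evaluateCacheHealth(snapshot: CacheSnapshot): string[] {
  const findings: string[] = [];

  if (snapshot.hitRatePercent < 80) {
    findings.push('Hit rate below 80%: review cache key design and TTLs');
  }
  if (snapshot.p99Ms >= 10) {
    findings.push('P99 latency is not under 10ms');
  }
  if (snapshot.p999Ms > 50) {
    findings.push('P99.9 latency above 50ms: alert');
  }
  if (snapshot.memoryUsedPercent > 90) {
    findings.push('Memory above 90%: risk of evictions');
  }

  return findings;
}

// A healthy snapshot produces no findings.
console.log(
  evaluateCacheHealth({
    hitRatePercent: 92,
    p99Ms: 4,
    p999Ms: 20,
    memoryUsedPercent: 75,
  })
);
```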

Monitoring Implementation

import { CloudWatch } from 'aws-sdk';

class CacheMonitor {
  // Region and credentials come from the environment
  private cloudwatch = new CloudWatch();

  async trackMetrics(
    cacheKey: string,
    hit: boolean,
    latency: number
  ): Promise<void> {
    await this.cloudwatch
      .putMetricData({
        Namespace: 'CustomCache',
        MetricData: [
          {
            MetricName: 'CacheHitRate',
            Value: hit ? 1 : 0,
            Unit: 'Count',
            Dimensions: [
              { Name: 'CacheLayer', Value: 'Redis' },
            ],
          },
          {
            MetricName: 'CacheLatency',
            Value: latency,
            Unit: 'Milliseconds',
            Dimensions: [
              { Name: 'CacheLayer', Value: 'Redis' },
            ],
          },
        ],
      })
      .promise();
  }

  async getCacheHitRate(
    period: number = 300
  ): Promise<number> {
    const result = await this.cloudwatch
      .getMetricStatistics({
        Namespace: 'CustomCache',
        MetricName: 'CacheHitRate',
        StartTime: new Date(Date.now() - period * 1000),
        EndTime: new Date(),
        Period: period,
        Statistics: ['Average'],
        Dimensions: [
          { Name: 'CacheLayer', Value: 'Redis' },
        ],
      })
      .promise();

    // Average of the 0/1 samples is a fraction; convert to a percentage
    return (result.Datapoints?.[0]?.Average ?? 0) * 100;
  }
}

Common Pitfalls and Lessons

1. Over-Caching Dynamic Data

Caching user-specific data with long TTL leads to users seeing stale data and increased support tickets.

Solution: Classify data by volatility:

const cacheStrategies = {
  static: {
    ttl: 86400 * 7, // 1 week
    pattern: 'static:*',
  },
  config: {
    ttl: 3600, // 1 hour
    pattern: 'config:*',
  },
  userProfile: {
    ttl: 300, // 5 minutes
    pattern: 'user:*',
    invalidateOn: ['user.updated'],
  },
  realtime: {
    ttl: 0, // Don't cache
    pattern: 'inventory:*',
  },
};
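
A table like this is only useful if something consults it before writing to the cache. The sketch below resolves a key to its TTL by prefix; `ttlForKey` and `shouldCache` are hypothetical helpers, and treating unknown keys as uncacheable is an assumption rather than a rule from the table.

```typescript
interface CacheStrategy {
  ttl: number; // seconds; 0 means do not cache
  pattern: string; // a trailing "*" acts as a prefix wildcard
}

// Same values as the classification above, flattened to a list.
const strategyList: CacheStrategy[] = [
  { ttl: 86400 * 7, pattern: 'static:*' },
  { ttl: 3600, pattern: 'config:*' },
  { ttl: 300, pattern: 'user:*' },
  { ttl: 0, pattern: 'inventory:*' },
];

function matchesPattern(key: string, pattern: string): boolean {
  if (pattern.endsWith('*')) {
    return key.startsWith(pattern.slice(0, -1));
  }
  return key === pattern;
}

// Returns the TTL in seconds for a key, or 0 if it should not be cached.
function ttlForKey(key: string): number {
  for (const strategy of strategyList) {
    if (matchesPattern(key, strategy.pattern)) return strategy.ttl;
  }
  return 0; // assumption: keys with no matching pattern are not cached
}

function shouldCache(key: string): boolean {
  return ttlForKey(key) > 0;
}

console.log(ttlForKey('user:42')); // 300
console.log(shouldCache('inventory:sku-1')); // false
```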

2. Poor Cache Key Design

Including timestamps or random values in cache keys destroys hit rate.

// Bad: Unnecessary variability
const key = `user:${userId}:${timestamp}:${requestId}`;

// Good: Deterministic and minimal
const key = `user:${userId}`;

// Good: Include only meaningful parameters
const key = `user:${userId}:posts:${page}`;

3. Ignoring Cache Failures

Cache failure shouldn’t take down your application. Always implement fallback:

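class ResilientCache {
  async get<T>(
    key: string,
    fetcher: () => Promise<T>
  ): Promise<T> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    try {
      const cached = await Promise.race([
        redis.get(key),
        // Reject if Redis takes longer than 100ms
        new Promise<never>((_, reject) => {
          timer = setTimeout(
            () => reject(new Error('Cache timeout')),
            100
          );
        }),
      ]);

      if (cached) return JSON.parse(cached);
    } catch (error) {
      // Log but don't throw
      logger.warn('Cache failure, using origin', {
        key,
        error,
      });
    } finally {
      // Avoid leaving a pending timer behind on the fast path
      if (timer) clearTimeout(timer);
    }

    // Fetch from origin regardless
    return fetcher();
  }
}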

4. CloudFront Invalidation Abuse

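Frequent invalidation racks up costs. Use versioned URLs instead, with the version in the path so that it stays part of the cache key even when the cache policy ignores query strings:

class AssetVersioning {
  private buildHash: string;

  constructor() {
    // The Date.now() fallback differs per process; set BUILD_HASH in real builds
    this.buildHash =
      process.env.BUILD_HASH || Date.now().toString();
  }

  // Automatic cache busting via URL:
  // /static/app.js -> /static/app.<hash>.js
  getAssetUrl(path: string): string {
    const dot = path.lastIndexOf('.');
    // No file extension: append the hash at the end
    if (dot <= path.lastIndexOf('/')) {
      return `${path}.${this.buildHash}`;
    }
    return `${path.slice(0, dot)}.${this.buildHash}${path.slice(dot)}`;
  }
}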

Cost Optimization

AWS Service Pricing (us-east-1)

ElastiCache Redis (cache.r6g.large: 13.07 GB):

  • On-Demand: ~$0.206/hour = ~$150/month per node
  • 3-node cluster: ~$450/month

MemoryDB (db.r6g.large: 13.07 GB):

  • On-Demand: ~$0.406/hour = ~$293/month per node
  • 3-node cluster: ~$879/month (roughly 2x ElastiCache at these rates)

CloudFront:

  • First 10 TB/month: $0.085/GB
  • HTTP/HTTPS requests: $0.0075 per 10,000
  • Invalidation: First 1,000 paths free, $0.005 per path after
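
The monthly figures are simple arithmetic on the hourly rates. The sketch below reproduces them, assuming 730 hours in an average month, so the results land within a few dollars of the rounded numbers above. `monthlyNodeCost` is a hypothetical helper, and the rates are the ones quoted in this section, not live prices.

```typescript
// Average hours per month (8760 hours per year / 12); an assumption
// for estimation, since actual billing is per hour used.
const HOURS_PER_MONTH = 730;

// On-demand cost in USD for `nodes` nodes running the whole month.
function monthlyNodeCost(hourlyUsd: number, nodes: number = 1): number {
  return hourlyUsd * HOURS_PER_MONTH * nodes;
}

const elastiCacheCluster = monthlyNodeCost(0.206, 3);
const memoryDbCluster = monthlyNodeCost(0.406, 3);

console.log(`ElastiCache, 3 nodes: $${elastiCacheCluster.toFixed(0)}/month`);
console.log(`MemoryDB, 3 nodes: $${memoryDbCluster.toFixed(0)}/month`);
console.log(
  `MemoryDB / ElastiCache: ${(memoryDbCluster / elastiCacheCluster).toFixed(2)}x`
);
```

The ratio depends only on the hourly rates: 0.406 / 0.206 is a little under 2.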

Right-Sizing Strategy

class CacheOptimization {
  async analyzeUtilization(): Promise<Report> {
    const metrics = await this.getWeeklyMetrics();

    const avgMemoryUsage = metrics.memory.average;
    const currentCapacity = this.getCurrentCapacity();

    const recommendations = [];

    // Consistently low usage
    if (avgMemoryUsage < currentCapacity * 0.6) {
      const recommendedSize =
        this.calculateOptimalSize(metrics.memory.peak);
      const savings = this.calculateSavings(
        currentCapacity,
        recommendedSize
      );

      recommendations.push({
        type: 'DOWNSIZE',
        currentSize: currentCapacity,
        recommendedSize,
        monthlySavings: savings,
      });
    }

    // High eviction rate
    if (metrics.evictions.perDay > 1000) {
      recommendations.push({
        type: 'UPSIZE',
        reason: 'High eviction rate impacting hit rate',
        impact: 'Hit rate could improve by 15-20%',
      });
    }

    return { metrics, recommendations };
  }
}

Key Takeaways

Working with caching across multiple projects has taught me these patterns:

1. Cache patterns matter: Cache-aside for read-heavy, write-through for consistency, write-behind for write-heavy. Choose based on your actual workload.

2. Prevent stampede early: Implement distributed locking and request coalescing before you have a problem. It’s much harder to add after an incident.

3. AWS services aren’t interchangeable: ElastiCache for general caching, MemoryDB when you need durability, DAX only for DynamoDB. Don’t overpay for features you don’t need.

4. Multi-tier caching works: L1 in-memory + L2 Redis + L3 CDN provides the best performance per cost. Each layer serves a purpose.

5. Monitor continuously: Cache hit rate, latency, memory usage, and cost per request. Right-size monthly based on actual utilization.

6. Design for failure: Cache should improve performance, not become a single point of failure. Always implement graceful degradation.

7. Version URLs, don’t invalidate: CloudFront invalidation costs add up. Versioned assets are free and instant.

The difference between a 15% hit rate and 90% hit rate is often just proper cache key design and TTL management. Start with the basics, monitor everything, and optimize based on real metrics.
