AWS ElastiCache Redis Caching Strategies

Caching is often one of the most cost-effective ways to improve application performance. A single Redis cache node can serve hundreds of thousands of reads per second with sub-millisecond latency, far faster than a typical disk-backed database query. For applications bottlenecked by database read latency or struggling under read-heavy traffic patterns, a Redis cache can improve performance without re-architecting the application.

Symptom → mechanism → AWS control

Production symptom	Mechanism	AWS control
Cache stampede on expiry	Thundering herd to origin	Application-side TTL jitter on ElastiCache keys, read-through with mutex
Hot key throttling	Single shard saturation	ElastiCache Serverless auto-scaling, local in-process L1
Stale reads after write	Cache-aside without invalidation	Write-through or pub/sub invalidation on ElastiCache

Opinionated take: use ElastiCache Serverless for variable workloads and provisioned cluster mode when you can forecast RPS. In both cases, always add TTL jitter on cache-aside.

Benchmark pattern (hypothetical workload): an ElastiCache Redis 7 cluster-mode deployment using cache-aside for a product catalog reaches a 94% hit rate. Origin Aurora queries drop from 8K/sec to 480/sec, p99 API latency falls from 45 ms to 8 ms, and the cluster costs $890/month compared with an avoided $2.1K Aurora scale-up.
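
The relationship between hit rate and origin load in the hypothetical workload above is simple arithmetic: only misses reach the database. The sketch below reproduces the 8K to 480 queries/sec figure; the function name `origin_qps` is an illustrative choice, not part of any AWS API.

```python
def origin_qps(total_qps: float, hit_rate: float) -> float:
    """Requests per second that miss the cache and reach the database."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return total_qps * (1.0 - hit_rate)


# Hypothetical workload from the text: 8,000 reads/sec at a 94% hit rate.
remaining = origin_qps(8000, 0.94)
print(f"{remaining:.0f} queries/sec still reach Aurora")
```

Because origin load scales with the miss rate, small changes at high hit rates matter: moving from 94% to 97% would halve the remaining database traffic.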

June 2026 refresh: ElastiCache Serverless and managed Valkey offerings change provisioning math—confirm engine choice against HA/replica requirements rather than assuming shard-count defaults from older Redis OSS guides.

AWS ElastiCache for Redis provides managed Redis clusters that handle replication, failover, patching, and backup — the operational tasks that make self-managed Redis painful at scale. This guide covers the caching strategies and ElastiCache configurations that work in production.

2025 Update: AWS now offers Amazon ElastiCache for Valkey alongside ElastiCache for Redis. New workloads should consider Valkey — it’s the open-source successor to Redis 7.2, maintained by the Linux Foundation, and is now the default engine for new ElastiCache clusters. See our Valkey migration guide for details on migration paths and compatibility.

When to Use Caching

Caching Makes Sense When

Read-heavy workloads — Your application reads far more than it writes (10:1 or higher read-to-write ratio)
Expensive queries — Database queries involve joins, aggregations, or full-text search that take 50ms+
Repeated access patterns — The same data is requested by multiple users (product pages, configuration, leaderboards)
Latency requirements — Your API must respond in under 50ms, and database queries take longer
Database bottleneck — Your RDS or DynamoDB read capacity is saturated and scaling the database is expensive

Caching Does Not Help When

Write-heavy workloads — If every request writes unique data, caching adds complexity without benefit
Unique queries — If every query is different (ad-hoc analytics, search with unique parameters), cache hit rates will be low
Strong consistency requirements — If stale data is never acceptable, caching introduces consistency complexity

Caching Patterns

Pattern 1: Cache-Aside (Lazy Loading)

The most common pattern — the application checks the cache first and falls back to the database on cache miss:

Read request
  → Check Redis cache
    → Cache hit → Return cached data (sub-millisecond)
    → Cache miss → Query database → Store result in Redis → Return data

Advantages:

Only caches data that is actually requested (no wasted memory)
Cache failures do not break the application (falls back to database)
Simple to implement

Disadvantages:

First request for each item hits the database (cold cache)
Stale data possible if database is updated without invalidating cache
Cache stampede risk when many concurrent requests miss the cache simultaneously

Implementation considerations:

Set a TTL (time-to-live) on every cached item to limit staleness
Implement cache invalidation on write operations
Use a mutex/lock for expensive queries to prevent cache stampede
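
The three considerations above can be sketched together. This is a minimal illustration, not a production client: `InMemoryCache` is a stand-in that mirrors the `get`, `set(..., ex=...)` and `delete` calls of a Redis client so the example runs without a server, and `get_product`, `update_product` and the loader callbacks are hypothetical names. The lock here is per process; coordinating across many application instances would need a distributed lock instead.

```python
import json
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryCache:
    """Stand-in for a Redis client: get / set(ex=seconds) / delete."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            self._data.pop(key, None)  # expired: behave like a miss
            return None
        return value

    def set(self, key: str, value: str, ex: int) -> None:
        self._data[key] = (value, time.monotonic() + ex)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


_locks: Dict[str, threading.Lock] = {}
_locks_guard = threading.Lock()


def _lock_for(key: str) -> threading.Lock:
    """Return one lock per cache key (process-local only)."""
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())


def get_product(cache, product_id: str,
                load_from_db: Callable[[str], dict],
                ttl_seconds: int = 300) -> dict:
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)          # cache hit
    with _lock_for(key):                   # only one loader per key
        cached = cache.get(key)            # re-check after waiting
        if cached is not None:
            return json.loads(cached)
        record = load_from_db(product_id)  # cache miss: query the database
        cache.set(key, json.dumps(record), ex=ttl_seconds)
        return record


def update_product(cache, product_id: str, record: dict,
                   save_to_db: Callable[[str, dict], None]) -> None:
    save_to_db(product_id, record)
    cache.delete(f"product:{product_id}")  # invalidate on write
```

Deleting the key on write, rather than overwriting it, keeps the write path simple: the next read repopulates the entry from the database.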

Pattern 2: Write-Through

Write to both the cache and the database synchronously, as part of the same request:

Write request
  → Write to Redis cache
  → Write to database
  → Return success

Advantages:

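Cache stays in sync with the database for every write that goes through this path
No stale reads, provided nothing updates the database behind the cache
Read requests hit the cache once the data has been written or loaded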

Disadvantages:

Every write has the overhead of two operations (cache + database)
Data that is written but never read still consumes cache memory

Best for: Data that is frequently read after being written (user profiles, session data, configuration).
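
A minimal write-through sketch follows. Plain dictionaries stand in for Redis and the database so the example is runnable, and `WriteThroughStore` is a hypothetical name. One assumption worth flagging: this sketch writes the database first and the cache second, so a failed database write never leaves an unpersisted value in the cache; the flow above lists the cache write first, and either order is used in practice.

```python
from typing import Dict, Optional


class WriteThroughStore:
    """Write-through: every write updates the database and the cache."""

    def __init__(self, cache: Dict[str, dict], database: Dict[str, dict]) -> None:
        self.cache = cache        # stand-in for Redis
        self.database = database  # stand-in for the system of record

    def write(self, key: str, value: dict) -> None:
        self.database[key] = value  # durable write first (assumed ordering)
        self.cache[key] = value     # then refresh the cached copy

    def read(self, key: str) -> Optional[dict]:
        if key in self.cache:
            return self.cache[key]
        value = self.database.get(key)  # initial population on a miss
        if value is not None:
            self.cache[key] = value
        return value


store = WriteThroughStore(cache={}, database={})
store.write("user:001", {"name": "Ada", "role": "admin"})
print(store.read("user:001"))
```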

Pattern 3: Write-Behind (Write-Back)

Write to the cache immediately and asynchronously write to the database:

Write request
  → Write to Redis cache → Return success immediately
  → Background process → Write to database (async)

Advantages:

Lowest write latency (only cache write is synchronous)
Batches database writes for efficiency
Absorbs write spikes without database overload

Disadvantages:

Data loss risk if Redis fails before database write completes
Complex consistency management
Requires reliable background processing

Best for: High-throughput write workloads where slight data loss is acceptable (analytics counters, activity feeds, non-critical metrics).
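
The sketch below illustrates the write-behind flow under simplifying assumptions: dictionaries stand in for Redis and the database, pending writes sit in an in-process queue, and `flush` is called explicitly where a real system would run a background worker. `WriteBehindCache` is a hypothetical name. Anything still in the queue when the process dies is lost, which is exactly the data loss risk listed above.

```python
import queue
from typing import Dict, Tuple


class WriteBehindCache:
    """Write-behind: acknowledge after the cache write, persist later."""

    def __init__(self, database: Dict[str, int]) -> None:
        self.cache: Dict[str, int] = {}  # stand-in for Redis
        self.database = database         # stand-in for the database
        self._pending: "queue.Queue[Tuple[str, int]]" = queue.Queue()

    def write(self, key: str, value: int) -> None:
        self.cache[key] = value          # synchronous, fast path
        self._pending.put((key, value))  # database write is deferred

    def flush(self, max_batch: int = 100) -> int:
        """Drain up to max_batch queued writes into one database batch."""
        batch: Dict[str, int] = {}
        drained = 0
        while drained < max_batch:
            try:
                key, value = self._pending.get_nowait()
            except queue.Empty:
                break
            batch[key] = value           # later writes to a key win
            drained += 1
        self.database.update(batch)      # one batched write
        return len(batch)


db: Dict[str, int] = {}
counters = WriteBehindCache(db)
for views in (1, 2, 3):
    counters.write("page:home:views", views)
print(db)                # still empty: nothing persisted yet
print(counters.flush())  # three queued writes coalesce into one row
print(db)
```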

Pattern 4: Read-Through with TTL Refresh

Automatically refresh cached data before its TTL expires (this variant is often called refresh-ahead):

Background process
  → Scan for items approaching TTL expiry
  → Re-query database for fresh data
  → Update cache with fresh data
  → Users always see cached data (never hit database)

Best for: High-traffic items (homepage content, product catalogs) where cache misses cause noticeable latency and database load.
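
A sketch of the refresh loop, under stated assumptions: entries live in a local dictionary rather than Redis, the clock is injectable so the behaviour can be demonstrated without waiting, and `RefreshAheadCache` with its `refresh_window_seconds` parameter is a hypothetical design. Against a real cache, the background process would more likely check the remaining TTL of a known list of hot keys than scan the whole keyspace.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class RefreshAheadCache:
    """Reload entries shortly before they would expire."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float,
                 refresh_window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._window = refresh_window_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None or self._clock() >= entry[1]:
            return None  # missing or expired
        return entry[0]

    def refresh_expiring(self) -> List[str]:
        """Run periodically: reload keys whose remaining TTL is inside the window."""
        now = self._clock()
        refreshed = []
        for key, (_, expires_at) in list(self._entries.items()):
            if expires_at - now <= self._window:
                self._entries[key] = (self._loader(key), now + self._ttl)
                refreshed.append(key)
        return refreshed


fake_now = [0.0]
cache = RefreshAheadCache(loader=lambda k: f"fresh-{k}", ttl_seconds=60,
                          refresh_window_seconds=10, clock=lambda: fake_now[0])
cache.put("homepage", "v1")
fake_now[0] = 55.0               # 5 seconds of TTL left
print(cache.refresh_expiring())  # the entry is reloaded before it expires
print(cache.get("homepage"))
```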

Redis Data Structures for Caching

Redis provides data structures beyond simple key-value storage. Choosing the right structure improves efficiency:

Data Structure	Use Case	Example
String	Simple key-value cache	User profile, API response, session data
Hash	Object with multiple fields	User: {name, email, role, lastLogin}
List	Ordered collection, recent items	Activity feed, recent orders
Set	Unique collection, membership	Online users, unique visitors
Sorted Set	Ranked collection	Leaderboard, trending products
Stream	Event log, message queue	Activity stream, change notifications

Practical Examples

Session storage (Hash):

HSET session:abc-123 userId "user-001" role "admin" tenant "acme" expiresAt "1720000000"
EXPIRE session:abc-123 3600

Leaderboard (Sorted Set):

ZADD leaderboard 1500 "player-001"
ZADD leaderboard 2300 "player-002"
# Top 10 players
ZREVRANGE leaderboard 0 9 WITHSCORES

Rate limiting (String with INCR):

# Fixed 1-minute window: one counter key per user per minute
INCR rate:user-001:2026-08-10T14:30
EXPIRE rate:user-001:2026-08-10T14:30 60
# Check: if the value returned by INCR is > 100, reject the request
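
The rate-limiting commands above translate into application code roughly as follows. `CounterStore` is an in-memory stand-in exposing `incr` and `expire` so the sketch runs without Redis, and `allow_request` is a hypothetical helper. The window key here is derived from an integer window number rather than a formatted timestamp, which is an illustrative choice.

```python
import time
from typing import Dict, Optional


class CounterStore:
    """Stand-in for the Redis INCR and EXPIRE commands."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}
        self.ttls: Dict[str, int] = {}

    def incr(self, key: str) -> int:
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]

    def expire(self, key: str, seconds: int) -> None:
        self.ttls[key] = seconds  # recorded only; this stub never evicts


def allow_request(store, user_id: str, limit: int = 100,
                  window_seconds: int = 60,
                  now: Optional[float] = None) -> bool:
    """Fixed-window limiter: one counter per user per window."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)
    key = f"rate:{user_id}:{window}"
    count = store.incr(key)
    store.expire(key, window_seconds)  # let old windows clean themselves up
    return count <= limit


store = CounterStore()
results = [allow_request(store, "user-001", limit=3, now=1000.0) for _ in range(4)]
print(results)  # the fourth request in the same window is rejected
```

A fixed window allows short bursts around window boundaries; sliding-window variants trade extra bookkeeping for smoother limits.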

ElastiCache Configuration

Cluster Modes

Cluster Mode Disabled (single shard):

One primary node + up to 5 read replicas
All data on a single shard (limited by single node memory)
Simpler to manage
Max memory: 419.09 GiB per node on cache.r7g.16xlarge (635.61 GiB on the older cache.r5.24xlarge)

Cluster Mode Enabled (multiple shards):

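Data partitioned across up to 500 shards (the limit is 500 nodes per cluster, so 500 shards is only reachable with no replicas; with 5 replicas per shard the maximum is 83 shards)
Each shard has a primary + up to 5 replicas
Total memory = shards × node memory (bounded by the cluster node limit rather than by a single node)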
Supports online resharding (add/remove shards without downtime)

When to use Cluster Mode Enabled:

Dataset exceeds single node memory
Write throughput exceeds single primary capacity
You need online scaling (adding shards without downtime)

When Cluster Mode Disabled is sufficient:

Dataset fits in a single node
Read scaling via replicas is sufficient
Simpler operations preferred
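
With cluster mode enabled, keys are assigned to shards through Redis Cluster's 16384 hash slots: the slot is the CRC16 of the key modulo 16384, and a `{hash tag}` in the key restricts hashing to the tagged part so related keys land on the same shard. The sketch below implements that calculation directly for illustration; the function names are hypothetical, and client libraries normally do this for you.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 (polynomial 0x1021, initial value 0), as used for cluster hash slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot (0..16383) for a key, honouring {hash tags}."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:  # ignore an empty "{}"
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# Keys sharing a hash tag map to the same slot, so multi-key
# operations across them stay on one shard.
print(key_slot("{user:1}:profile") == key_slot("{user:1}:orders"))
```

This is also why a single very hot key cannot be spread by adding shards: one key always maps to one slot.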

Node Types

Category	Example	Use Case
General Purpose (m7g)	cache.m7g.large	Balanced workloads, most production use cases
Memory Optimized (r7g)	cache.r7g.xlarge	Large datasets, high memory-to-CPU ratio
Small/Dev (t4g)	cache.t4g.micro	Development, testing, low-traffic production

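Graviton-based node types (the g in m7g, r7g, and t4g) generally provide better price-performance than comparable Intel-based node types; the commonly quoted 20-30% figure depends on workload and instance generation. Prefer Graviton for new deployments unless you have a specific reason not to.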

High Availability

Multi-AZ with automatic failover — Always enable for production. If the primary node fails, ElastiCache automatically promotes a replica to primary (failover time: typically 10-30 seconds).
Read replicas — Scale read capacity horizontally. Your application reads from replicas and writes to the primary.
Global Datastore — Cross-Region replication for disaster recovery and low-latency global reads.

Cache Invalidation

Cache invalidation is the hardest problem in caching. Stale data causes bugs; aggressive invalidation reduces cache hit rates.

TTL-Based Expiry

Set a TTL on every cached item:

Data Type	Recommended TTL	Rationale
Configuration	5-15 minutes	Changes infrequently, slight staleness acceptable
User profile	1-5 minutes	Changes occasionally, brief staleness tolerable
Product catalog	15-60 minutes	Changes via admin updates, not user-facing mutations
API response	30-300 seconds	Depends on data freshness requirements
Session data	30-60 minutes	Match session timeout policy
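
To keep entries written at the same moment from all expiring together, the TTLs in the table can be randomised slightly. The sketch below assumes a base TTL picked from inside each recommended range and a 10% jitter; both the chosen values and the name `ttl_with_jitter` are illustrative assumptions, not recommendations from AWS.

```python
import random
from typing import Optional

# Base TTLs in seconds, each chosen from within the ranges in the table above.
BASE_TTL_SECONDS = {
    "configuration": 10 * 60,
    "user_profile": 3 * 60,
    "product_catalog": 30 * 60,
    "api_response": 120,
    "session": 45 * 60,
}


def ttl_with_jitter(data_type: str, jitter_fraction: float = 0.1,
                    rng: Optional[random.Random] = None) -> int:
    """Return the base TTL shifted by up to +/- jitter_fraction."""
    generator = rng or random.Random()
    base = BASE_TTL_SECONDS[data_type]
    spread = base * jitter_fraction
    return max(1, round(base + generator.uniform(-spread, spread)))


# Each call returns a slightly different TTL around 1800 seconds.
print([ttl_with_jitter("product_catalog") for _ in range(3)])
```

The resulting value would be passed as the expiry when writing the key, so that a burst of cache fills does not turn into a synchronised burst of misses one TTL later.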

Event-Based Invalidation

Invalidate cache entries when the underlying data changes:

Database write (DynamoDB Stream / RDS event)
  → Lambda function
  → Delete or update Redis cache entry

For DynamoDB, use DynamoDB Streams to trigger Lambda functions that invalidate corresponding cache entries. For RDS, use event notifications or application-level invalidation.
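
A minimal sketch of the Lambda side of this flow for DynamoDB Streams. The event shape (`Records`, `eventName`, `dynamodb.Keys` with typed attribute values) follows the DynamoDB Streams record format; the partition key name `productId`, the `product:` key prefix, and the `RecordingCache` stand-in are assumptions made so the example runs without AWS or Redis. In a real function, `cache` would be a Redis client created outside the handler.

```python
from typing import Any, Dict, List


class RecordingCache:
    """Stand-in for a Redis client; records which keys were deleted."""

    def __init__(self) -> None:
        self.deleted: List[str] = []

    def delete(self, *keys: str) -> int:
        self.deleted.extend(keys)
        return len(keys)


cache = RecordingCache()  # created once, outside the handler


def keys_to_invalidate(event: Dict[str, Any],
                       key_attribute: str = "productId",
                       prefix: str = "product:") -> List[str]:
    """Map stream records to the cache keys they make stale."""
    keys = []
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY", "REMOVE"):
            continue
        attr = record["dynamodb"]["Keys"].get(key_attribute, {})
        value = attr.get("S") or attr.get("N")  # string or number key
        if value is not None:
            keys.append(f"{prefix}{value}")
    return keys


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    keys = keys_to_invalidate(event)
    if keys:
        cache.delete(*keys)  # one round trip for the whole batch
    return {"invalidated": keys}


sample_event = {"Records": [
    {"eventName": "MODIFY", "dynamodb": {"Keys": {"productId": {"S": "123"}}}},
    {"eventName": "REMOVE", "dynamodb": {"Keys": {"productId": {"N": "456"}}}},
]}
print(handler(sample_event, None))
```

Deleting rather than updating the entry keeps the function idempotent, which matters because stream records can be delivered to Lambda more than once on retry.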

Tag-Based Invalidation

Group related cache entries with tags for bulk invalidation:

Cache entry: product:123 → tags: ["catalog", "category:electronics"]
Cache entry: product:456 → tags: ["catalog", "category:electronics"]

Invalidate: all entries tagged "category:electronics"
→ Deletes product:123 and product:456 simultaneously

Implement with Redis Sets: maintain a set per tag containing all keys associated with that tag.
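
The set-per-tag approach can be sketched as follows. Python dictionaries and sets stand in for Redis keys and Redis Sets (the comments name the corresponding commands), and `TaggedCache` is a hypothetical helper, not a library class.

```python
from typing import Dict, Iterable, Set


class TaggedCache:
    """Tag-based invalidation using one set of member keys per tag."""

    def __init__(self) -> None:
        self.values: Dict[str, str] = {}            # cache entries
        self.tag_members: Dict[str, Set[str]] = {}  # "tag:<name>" -> keys

    def set(self, key: str, value: str, tags: Iterable[str]) -> None:
        self.values[key] = value                    # SET key value
        for tag in tags:
            # SADD tag:<name> key
            self.tag_members.setdefault(f"tag:{tag}", set()).add(key)

    def invalidate_tag(self, tag: str) -> int:
        # SMEMBERS tag:<name>, then DEL every member and the tag set itself
        members = self.tag_members.pop(f"tag:{tag}", set())
        for key in members:
            self.values.pop(key, None)
        return len(members)


cache = TaggedCache()
cache.set("product:123", "...", tags=["catalog", "category:electronics"])
cache.set("product:456", "...", tags=["catalog", "category:electronics"])
cache.set("product:789", "...", tags=["catalog", "category:books"])
print(cache.invalidate_tag("category:electronics"))
print(sorted(cache.values))
```

Note that after invalidating one tag, other tag sets (here `tag:catalog`) still list the deleted keys. That is harmless for correctness, since deleting an absent key is a no-op, but the tag sets need their own TTL or periodic cleanup so they do not grow without bound.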

ElastiCache Serverless

ElastiCache Serverless removes capacity planning entirely:

Automatically scales memory and compute based on usage
No node selection, no cluster management
Pay for data stored (per GB-hour) and compute (per ECPU)
A minimum data-storage charge applies (at $0.125 per GB-hour with a 1 GB minimum, about $90/month; rates and minimums vary by Region and engine)

When to use Serverless:

Unpredictable or spiky traffic patterns
New applications where cache sizing is unknown
Teams that want to avoid capacity planning

When to use provisioned nodes:

Predictable workloads where node sizing is known
Cost optimization with Reserved Nodes (up to 55% savings)
Requirements for specific node types or cluster configurations

Monitoring

Key CloudWatch Metrics

Metric	Target	Action If Outside Target
CacheHitRate	> 80%	Low hit rate = wrong caching strategy or TTL
EngineCPUUtilization	< 70%	Scale up or add shards
DatabaseMemoryUsagePercentage	< 80%	Scale up or review eviction policy
CurrConnections	Below max	Connection pooling issue if near limit
ReplicationLag	< 1 second	Network or replica capacity issue
Evictions	Near zero	Memory pressure if evictions increase

Set CloudWatch alarms for:

EngineCPUUtilization > 70% — Scale before performance degrades
DatabaseMemoryUsagePercentage > 80% — Scale before evictions begin
CacheHitRate < 50% — Investigate caching strategy
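
The three alarms above can be expressed as parameter sets for CloudWatch. The sketch only builds the dictionaries so it runs without AWS credentials; each one uses the parameter names accepted by boto3's `put_metric_alarm`, so with a CloudWatch client they could be applied via `cloudwatch.put_metric_alarm(**params)`. The 5-minute period, three evaluation periods, alarm naming, and the `CacheClusterId` dimension (which applies to node-based clusters) are assumptions to adjust.

```python
from typing import Any, Dict, List

# (metric name, comparison operator, threshold) from the list above
ALARM_RULES = [
    ("EngineCPUUtilization", "GreaterThanThreshold", 70.0),
    ("DatabaseMemoryUsagePercentage", "GreaterThanThreshold", 80.0),
    ("CacheHitRate", "LessThanThreshold", 50.0),
]


def build_alarms(cache_cluster_id: str) -> List[Dict[str, Any]]:
    """Return one put_metric_alarm parameter dict per rule."""
    return [
        {
            "AlarmName": f"{cache_cluster_id}-{metric}",
            "Namespace": "AWS/ElastiCache",
            "MetricName": metric,
            "Dimensions": [{"Name": "CacheClusterId", "Value": cache_cluster_id}],
            "Statistic": "Average",
            "Period": 300,           # seconds per datapoint (assumed)
            "EvaluationPeriods": 3,  # consecutive breaches required (assumed)
            "Threshold": threshold,
            "ComparisonOperator": comparison,
        }
        for metric, comparison, threshold in ALARM_RULES
    ]


for params in build_alarms("my-cache-001"):
    print(params["AlarmName"], params["ComparisonOperator"], params["Threshold"])
```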

Cost Optimization

Right-Sizing

Monitor DatabaseMemoryUsagePercentage over 2 weeks. If consistently below 50%, you are paying for unused memory. Downsize to a smaller node type.

Reserved Nodes

For steady-state production caches, Reserved Nodes provide significant savings:

Payment Option	1-Year Savings	3-Year Savings
No upfront	~28%	~41%
Partial upfront	~35%	~50%
All upfront	~38%	~55%
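
As a worked example of the table, the sketch below applies the approximate savings percentages to an on-demand monthly cost. The $890/month input reuses the hypothetical cluster cost from the benchmark pattern earlier; `effective_monthly_cost` is an illustrative helper, and actual Reserved Node pricing should be checked for your Region and node type.

```python
# Approximate savings from the table above, keyed by (payment option, term in years).
SAVINGS = {
    ("no_upfront", 1): 0.28, ("no_upfront", 3): 0.41,
    ("partial_upfront", 1): 0.35, ("partial_upfront", 3): 0.50,
    ("all_upfront", 1): 0.38, ("all_upfront", 3): 0.55,
}


def effective_monthly_cost(on_demand_monthly: float,
                           payment_option: str, term_years: int) -> float:
    """On-demand cost reduced by the approximate reserved discount."""
    discount = SAVINGS[(payment_option, term_years)]
    return on_demand_monthly * (1.0 - discount)


on_demand = 890.0  # hypothetical cluster cost in USD per month
for option, years in [("no_upfront", 1), ("all_upfront", 3)]:
    cost = effective_monthly_cost(on_demand, option, years)
    print(f"{option}, {years}-year: about ${cost:.2f}/month")
```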

Data Tiering

ElastiCache data tiering automatically moves less-frequently accessed data to SSD storage, reducing memory costs for large datasets:

Hot data stays in memory (sub-millisecond latency)
Warm data moves to SSD (single-digit millisecond latency)
Available on r6gd, r7gd, and r8gd node types

Common Mistakes

Mistake 1: Caching Without TTL

Cached data without a TTL lives forever — becoming stale as the source database changes. Always set a TTL. If you are unsure, start with 5 minutes and adjust based on your data’s change frequency and tolerance for staleness.

Mistake 2: No Connection Pooling

Creating a new Redis connection for every request is expensive. Use connection pooling in your application. For Lambda, initialize the Redis connection outside the handler function to reuse connections across invocations.
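
The Lambda advice can be sketched as a lazily created module-level client. `connect` is a hypothetical factory that returns a dictionary so the example runs anywhere; in a real function it would return a Redis client (for example `redis.Redis(host=..., port=6379)` with redis-py). Module-level state survives between invocations that reuse the same execution environment, which is what makes the reuse work.

```python
from typing import Any, Callable, Dict, Optional

_client: Optional[Any] = None  # lives as long as the execution environment
connections_opened = 0


def connect() -> Dict[str, int]:
    """Hypothetical factory standing in for creating a Redis client."""
    global connections_opened
    connections_opened += 1
    return {"connection_id": connections_opened}


def get_client(factory: Callable[[], Any] = connect) -> Any:
    """Create the client on first use, then reuse it."""
    global _client
    if _client is None:
        _client = factory()
    return _client


def handler(event: Dict[str, Any], context: Any) -> int:
    client = get_client()  # no new connection on warm invocations
    return client["connection_id"]


# Two invocations in the same environment share one connection.
print(handler({}, None), handler({}, None), connections_opened)
```

Each concurrent execution environment still holds its own connection, so the cache's connection count grows with Lambda concurrency and is worth watching via CurrConnections.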

Mistake 3: Using Redis as Primary Storage

Redis is a cache, not a database. If your application cannot function when Redis is empty (cold start, failover, eviction), you have a cache dependency, not a caching strategy. Every cached item must be retrievable from the primary data store.

Mistake 4: Caching Too Much

Not all data benefits from caching. Data accessed once (unique search results, one-time API calls) wastes cache memory. Focus caching on frequently accessed, expensive-to-compute, or slowly changing data.

Getting Started

ElastiCache Redis fills the performance gap between your application and your database. For read-heavy serverless applications, high-traffic APIs, and latency-sensitive workloads, a well-implemented caching layer is often the single largest performance improvement available.

For caching architecture design, ElastiCache configuration, and performance optimization as part of our architecture review or managed services, talk to our team.
