Distributed Locking, Redlock, and Consistent Hashing on AWS
Quick summary: Redlock debates matter because ElastiCache is not a consensus system. This guide covers consistent hashing for sharding workers and ALB target stickiness, with DynamoDB conditional writes as the boring alternative to Redis locks.
Key Takeaways
- Consistent hashing for sharding workers and ALB target stickiness—with DynamoDB conditional writes as the boring alternative
- Martin Kleppmann’s critique (still valid June 2026): Redis-based Redlock does not provide the same guarantees as Raft—clock skew and TTL expiry create false confidence
- Hypothetical benchmark: Redlock false-positive releases at 0.02% under a 200 ms network partition vs 0% with a DynamoDB lease table
- Replace cron leader with DynamoDB conditional put
Martin Kleppmann’s critique (still valid June 2026): Redis-based Redlock does not provide the same guarantees as Raft—clock skew and TTL expiry create false confidence.
Symptom → mechanism → AWS control
| Production symptom | Mechanism | AWS control |
|---|---|---|
| Double booking under partition | Redlock lacks fencing tokens | DynamoDB conditional PutItem with lease TTL |
| Hot key on lock shard | Consistent hashing skew | ElastiCache Serverless auto-sharding, key-space partitioning |
| Lock holder crashes without release | TTL-based lease expiry | SQS visibility timeout pattern as lock alternative |
Opinionated take: Skip Redlock for production inventory—use DynamoDB conditional leases or SQS single-consumer patterns unless you have fencing tokens wired end-to-end.
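A conditional lease of this kind could be sketched as below. This is a minimal illustration, not a definitive implementation: the attribute names `holder`, `expires_at`, and `fence`, the table name, and the helper `build_acquire_lease_request` are all assumptions made for the example. It uses a conditional UpdateItem rather than PutItem so that an `ADD` clause can bump a counter that doubles as a fencing token. The function only builds the request, so it can be inspected without AWS credentials.

```python
import time
from typing import Any, Dict, Optional


def build_acquire_lease_request(
    table_name: str,
    lock_id: str,
    holder: str,
    lease_seconds: int,
    now: Optional[int] = None,
) -> Dict[str, Any]:
    """Build kwargs for a DynamoDB UpdateItem call that acquires a lease.

    The write succeeds only if the item has no lease yet or the lease has
    expired. The ADD clause increments a counter used as a fencing token.
    Attribute names here are hypothetical, chosen for this sketch.
    """
    # Expiry is compared against the caller's clock, so clock skew still
    # matters; the fencing token is what protects downstream writes.
    now = int(time.time()) if now is None else now
    return {
        "TableName": table_name,
        "Key": {"LockID": {"S": lock_id}},
        "UpdateExpression": (
            "SET #holder = :holder, #expires = :expires ADD #fence :one"
        ),
        "ConditionExpression": (
            "attribute_not_exists(#expires) OR #expires < :now"
        ),
        "ExpressionAttributeNames": {
            "#holder": "holder",
            "#expires": "expires_at",
            "#fence": "fence",
        },
        "ExpressionAttributeValues": {
            ":holder": {"S": holder},
            ":expires": {"N": str(now + lease_seconds)},
            ":now": {"N": str(now)},
            ":one": {"N": "1"},
        },
        # Returns the new fence value so the holder can pass it downstream.
        "ReturnValues": "UPDATED_NEW",
    }


request = build_acquire_lease_request(
    "locks", "cron-leader", "worker-1", lease_seconds=30, now=1000
)
```

With boto3, the request would be sent as `boto3.client("dynamodb").update_item(**request)`. A caller that loses the race gets a `ClientError` whose error code is `ConditionalCheckFailedException` and should treat that as "lock not acquired".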
Benchmark pattern (hypothetical workload): an inventory reservation service comparing DynamoDB conditional locks with Redlock on a 3-node ElastiCache cluster at 5K lock acquisitions/sec. In this scenario, Redlock shows false-positive releases at 0.02% under a 200 ms network partition, versus 0% with a DynamoDB lease table.
AWS-aligned coordination
| Need | Prefer | Avoid |
|---|---|---|
| Mutex | DynamoDB conditional update with lease attribute | Redlock across ElastiCache primary/replica |
| Leader election | ECS/EKS lease + DynamoDB lock item | Infinite TTL without fencing token |
| Work routing | Consistent hash ring keyed on hash(user_id), not plain hash(user_id) % N (which remaps most keys when N changes) | Random assignment that breaks cache locality |
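Consistent hashing minimizes remapping when nodes change: adding or removing one of N nodes moves roughly 1/N of the keys instead of nearly all of them. Use it for worker pools. The same key-affinity idea applies to Kafka partition keys and ALB sticky sessions (with failover caveats).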
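To make the remapping claim concrete, here is a small sketch of a hash ring with virtual nodes, compared against plain modulo assignment when one of five workers is removed. The class `HashRing`, the helper `moved_fraction`, the worker names, and the choice of 100 virtual nodes are all assumptions for illustration, not a specific library's API.

```python
import bisect
import hashlib
from typing import Callable, Dict, List


def _hash(value: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() is salted per process.
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Minimal consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self._ring: Dict[int, str] = {}
        for node in nodes:
            for i in range(vnodes):
                self._ring[_hash(f"{node}#{i}")] = node
        self._points = sorted(self._ring)

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[self._points[idx]]


def moved_fraction(
    keys: List[str],
    before: Callable[[str], str],
    after: Callable[[str], str],
) -> float:
    """Fraction of keys whose assigned worker differs between two mappings."""
    return sum(1 for k in keys if before(k) != after(k)) / len(keys)


keys = [f"user-{i}" for i in range(10_000)]
workers = [f"worker-{i}" for i in range(5)]

# Scale-in event: worker-4 leaves the pool.
ring_before, ring_after = HashRing(workers), HashRing(workers[:-1])
ring_moved = moved_fraction(keys, ring_before.node_for, ring_after.node_for)

modulo_moved = moved_fraction(
    keys,
    lambda k: workers[_hash(k) % 5],
    lambda k: workers[_hash(k) % 4],
)
print(f"ring: {ring_moved:.0%} moved, modulo: {modulo_moved:.0%} moved")
```

On the ring, only keys that belonged to the removed worker change owner, so the moved share stays close to one fifth. With modulo assignment, most keys change owner, which is what breaks cache locality on scale events.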
ElastiCache locking
If you must use Redis locks: keep TTLs short, pass fencing tokens to downstream systems (for example, a DB version check), and monitor WAIT latency.
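The downstream check is the part that actually prevents a double write. The sketch below simulates it in memory: `FencedStore` stands in for a database that does a version check, and the token values 33 and 34 are made up for the example. How tokens are issued is also an assumption here; any monotonically increasing counter handed out at lock acquisition would fit.

```python
from dataclasses import dataclass, field
from typing import Dict


class StaleFencingToken(Exception):
    """Raised when a write carries a token older than one already seen."""


@dataclass
class FencedStore:
    """Stand-in for a database that checks a fencing token on every write."""

    highest_token: int = 0
    data: Dict[str, str] = field(default_factory=dict)

    def write(self, key: str, value: str, token: int) -> None:
        # Reject any writer whose lease was superseded by a newer token.
        if token < self.highest_token:
            raise StaleFencingToken(f"token {token} < {self.highest_token}")
        self.highest_token = token
        self.data[key] = value


store = FencedStore()

# Client A acquires the lock with token 33, then stalls (GC pause, partition).
token_a = 33

# A's TTL expires; client B acquires the lock with token 34 and writes.
token_b = 34
store.write("seat-12A", "booked-by-B", token_b)

# A wakes up still believing it holds the lock; the store rejects its write.
try:
    store.write("seat-12A", "booked-by-A", token_a)
    rejected = False
except StaleFencingToken:
    rejected = True
```

In a real database the same check would usually be a conditional update on a stored token or version column, so that the comparison and the write happen atomically.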
AWS services map
| Need | Service | Skip when |
|---|---|---|
| Fencing-token-safe distributed lock | DynamoDB conditional writes | Single-instance PostgreSQL advisory locks |
| Session affinity routing | ElastiCache consistent hashing | Stateless Lambda with no shared state |
| Leader election | SQS FIFO + single consumer | Kubernetes Lease API on EKS suffices |
What to do this week
- Replace the SETNX cron leader with a DynamoDB LockID conditional put.
- Document hash ring behavior on worker scale events.
- Load test lock holder crash during write.
More in This Track
Part of the Engineering Guides library (June 2026).
What this guide doesn’t cover
Paxos/Raft theory, which the next guide in this track covers.
AWS Cloud Architect & AI Expert
AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.