Kubernetes Pod Disruption Budgets on EKS: Zero-Downtime Upgrades

DevOps & CI/CD Palaniappan P 2 min read

Quick summary: Cluster upgrades and Karpenter consolidation can look healthy in the console while PDB-blocked evictions stall a node drain for 45 minutes. This guide explains how minAvailable, maxUnavailable, and EKS managed node group behavior fit together.


As of June 2026, EKS control plane upgrades are managed, but worker node disruption is yours to handle. PodDisruptionBudget (PDB) is the API that tells kubectl drain and Karpenter how many pods may be unavailable during voluntary evictions.

Symptom → mechanism → AWS control

| Production symptom | Mechanism | AWS control |
| --- | --- | --- |
| 503s during node drain | Voluntary disruption evicts all pods at once | PodDisruptionBudget with minAvailable or maxUnavailable |
| PDB blocks node upgrade indefinitely | Too-strict minAvailable (100%) | minAvailable: 80% with HPA headroom |
| Single-replica Deployment gets no benefit from a PDB | With one replica, a PDB either blocks the drain (minAvailable: 1) or protects nothing (maxUnavailable: 1) | HPA minReplicas: 2 for production tiers |

Opinionated take: Every production Deployment needs a PDB and minReplicas≥2—EKS managed node upgrades will evict your pods whether you’re ready or not.

Benchmark pattern (hypothetical workload): an EKS node group upgrade without a PDB produced an 18s API outage. With a PDB of minAvailable: 80% on a 10-replica Deployment, there were 0 failed requests during a 15-minute rolling node drain. Cluster Autoscaler also honors PDBs when it evicts pods during scale-down.

Field note: a FinTech API on EKS (5 replicas, minAvailable: 5 copied from a prod HA doc) had its node group upgrade stall for 52 minutes, waiting on an eviction the PDB could never allow. Changing to maxUnavailable: 1 allowed a rolling drain, and the error rate stayed below 0.01%. Pair with the blue-green decision guide.
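The arithmetic behind the field note can be sketched in a few lines of Python. This is an illustrative model, not the disruption controller's actual code: the function names scaled and disruptions_allowed are made up for this example, and it assumes percentages round up, as the Kubernetes documentation describes.

```python
def scaled(value, expected_pods):
    """Resolve an int or a percentage string such as "80%" against the pod count.

    Percentages are rounded up (assumption based on the Kubernetes docs).
    """
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        # Integer ceiling division avoids floating-point surprises.
        return (expected_pods * percent + 99) // 100
    return int(value)


def disruptions_allowed(expected_pods, healthy_pods,
                        min_available=None, max_unavailable=None):
    """Return how many pods could be evicted right now under a simplified PDB model."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = scaled(min_available, expected_pods)
    else:
        desired_healthy = max(0, expected_pods - scaled(max_unavailable, expected_pods))
    return max(0, healthy_pods - desired_healthy)


# The field note: 5 replicas with minAvailable: 5 can never lose a pod.
print(disruptions_allowed(5, 5, min_available=5))        # 0
# The fix: maxUnavailable: 1 lets the drain proceed one pod at a time.
print(disruptions_allowed(5, 5, max_unavailable=1))      # 1
# The benchmark pattern: minAvailable: 80% on 10 replicas.
print(disruptions_allowed(10, 10, min_available="80%"))  # 2
```

A result of 0 while every pod is healthy is the signature of a budget that will stall a drain no matter how long the upgrade waits.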

PDB mechanics

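  • Voluntary disruptions: node drain, Karpenter consolidation, and other evictions made through the Eviction API. A PDB gates these. A direct kubectl delete pod is also voluntary, but it bypasses the PDB.
  • Involuntary disruptions: hardware failure, Spot interruption. A PDB cannot block these, though the lost pods still count against the budget.

A minimal PDB for pods labeled app: api:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  # Allow at most one matching pod to be unavailable at a time.
  maxUnavailable: 1
  selector:
    matchLabels:
      # Must match the pod template labels of the workload being protected.
      app: api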

EKS-specific coupling

| Event | PDB interaction |
| --- | --- |
| Managed node group rolling update | Sequential node replacement; PDB gates pod eviction |
| Karpenter drift / consolidation | Evictions must satisfy PDB |
| Fargate | No DaemonSets; PDB still applies to Fargate pods |
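How a PDB gates a drain over time can be illustrated with a toy simulation. Everything here is an assumption made for illustration: simulate_drain is a hypothetical helper, each round performs either one eviction or one replacement becoming ready, and replacements always schedule successfully on another node. Real drains also depend on scheduling, readiness probes, and timeouts that this model ignores.

```python
def simulate_drain(replicas, pods_on_node, min_healthy, max_rounds=20):
    """Toy model of draining one node under a PDB.

    Each round, if healthy pods exceed min_healthy, one pod on the node is
    evicted; otherwise one previously evicted pod becomes ready elsewhere.
    Returns the number of rounds needed, or None if the drain cannot finish.
    """
    healthy = replicas
    remaining = pods_on_node
    pending = 0  # evicted pods whose replacements are not ready yet
    for round_no in range(1, max_rounds + 1):
        if healthy > min_healthy:
            # The budget allows one more eviction.
            healthy -= 1
            remaining -= 1
            pending += 1
            if remaining == 0:
                return round_no
        elif pending > 0:
            # Eviction refused; wait for a replacement to become ready.
            pending -= 1
            healthy += 1
        else:
            # Eviction refused and nothing to wait for: the drain is stuck.
            return None
    return None


# 5 replicas, 2 of them on the draining node, budget keeps 4 healthy.
print(simulate_drain(5, 2, min_healthy=4))  # 3 (evict, wait, evict)
# Same workload, but the budget demands all 5 healthy.
print(simulate_drain(5, 2, min_healthy=5))  # None
```

The second call returns None on the first round, which mirrors a drain that stalls because no eviction is ever permitted.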

AWS services map

| Need | Service | Skip when |
| --- | --- | --- |
| Managed K8s upgrades | EKS managed node groups | Self-managed ASG with manual drain |
| Disruption control | PDB + EKS Pod Identity | Single-replica dev namespaces |
| Surge during deploy | Deployment maxSurge=25% | StatefulSet with strict ordering |

When this advice breaks

  • Single-replica Deployments — PDB cannot invent HA; fix replica count first.
  • Jobs/CronJobs — PDB usually irrelevant.

What to do this week

  1. Audit PDBs with kubectl get pdb -A and compare each one against its workload's replica count.
  2. Replace minAvailable: 100% with maxUnavailable: 1 for rolling services.
  3. Run a controlled drain on one node during low traffic and watch kube_pod_status_ready (a kube-state-metrics metric).
  4. Align cluster upgrade window with Karpenter how-to.
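The audit in step 1 could be partly automated. The sketch below assumes you have saved the JSON that kubectl get pdb -A -o json prints; the SAMPLE data and the blocking_pdbs helper are hypothetical, while the status fields (disruptionsAllowed, expectedPods) are the ones a PodDisruptionBudget reports.

```python
import json

# Hypothetical sample shaped like the output of `kubectl get pdb -A -o json`.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"namespace": "payments", "name": "api-pdb"},
     "spec": {"minAvailable": 5},
     "status": {"expectedPods": 5, "currentHealthy": 5,
                "desiredHealthy": 5, "disruptionsAllowed": 0}},
    {"metadata": {"namespace": "web", "name": "frontend-pdb"},
     "spec": {"maxUnavailable": 1},
     "status": {"expectedPods": 4, "currentHealthy": 4,
                "desiredHealthy": 3, "disruptionsAllowed": 1}}
  ]
}
""")


def blocking_pdbs(pdb_list):
    """Return (namespace, name, expected_pods) for PDBs that allow no evictions right now."""
    blocked = []
    for item in pdb_list.get("items", []):
        status = item.get("status", {})
        if status.get("disruptionsAllowed", 0) == 0:
            meta = item["metadata"]
            blocked.append((meta["namespace"], meta["name"],
                            status.get("expectedPods", 0)))
    return blocked


for namespace, name, expected in blocking_pdbs(SAMPLE):
    print(f"{namespace}/{name}: 0 disruptions allowed across {expected} expected pods")
```

A PDB that shows 0 allowed disruptions while all of its pods are healthy is a candidate for step 2. One that shows 0 only because pods are currently unhealthy may recover on its own.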

More in This Track

Part of the Engineering Guides library (June 2026).

What this guide doesn’t cover

Service mesh traffic shifting—part 3 of this track.

Palaniappan P

AWS Cloud Architect & AI Expert

AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.
