DevOps & CI/CD

DevOps & CI/CD Part 4

Jun 12, 2026 palaniappan p 1 min

Container Runtime Security: seccomp, AppArmor, and EKS Pod Security

Docker's default seccomp profile is not enough for regulated workloads. This guide covers EKS Pod Security Standards, seccomp profiles, and Fargate platform version constraints.
engineering-guide
security
kubernetes
eks
aws
Read article
DevOps & CI/CD Part 2

Jun 12, 2026 palaniappan p 2 min

Kubernetes Pod Disruption Budgets on EKS: Zero-Downtime Upgrades

Cluster upgrades and Karpenter consolidation look healthy in the console while PDB-blocked evictions freeze your node drain for 45 minutes. This guide wires minAvailable, maxUnavailable, and EKS managed node group semantics.
engineering-guide
kubernetes
eks
devops
aws
Read article
DevOps & CI/CD Part 3

Jun 12, 2026 palaniappan p 2 min

Log Aggregation and Intelligent Sampling with CloudWatch and OpenTelemetry

Ingesting every debug log into CloudWatch is how observability becomes a FinOps incident. This guide covers tail sampling with ADOT, Logs Insights, and Firehose to S3 for the long tail.
engineering-guide
observability
cloudwatch
opentelemetry
aws
Read article
DevOps & CI/CD Part 2

Jun 12, 2026 palaniappan p 2 min

Prometheus Cardinality Explosion on AWS: AMP, EMF, and Cost-Aware Metrics

That `user_id` label on every HTTP metric turns Amazon Managed Service for Prometheus (AMP) into a five-figure line item. This guide explains cardinality mechanics, EMF vs remote write, and Application Signals defaults worth disabling.
engineering-guide
observability
prometheus
cloudwatch
aws
finops
Read article
DevOps & CI/CD Part 3

Jun 12, 2026 palaniappan p 2 min

Service Mesh Traffic Shifting: VPC Lattice, Istio on EKS, and App Mesh EOL

App Mesh is a legacy path: new meshes should start with VPC Lattice for AWS-native east-west traffic, or Istio on EKS when you need full L7 policy. Shift traffic without duplicating load balancers per service.
engineering-guide
kubernetes
eks
service-mesh
aws
Read article
DevOps & CI/CD

Jun 11, 2026 palaniappan p 5 min

AWS DevOps & Platform Maturity Model (2026): A 4-Level Scorecard Anchored to Real Services

Generic DevOps maturity models score you on culture slides — this one maps L1–L4 to AWS gates you can verify: IaC in Git, GitOps or gated CD, ADOT on EKS, FIS with stop conditions, and cost-aware CI. A composite 85-engineer SaaS moved from L2 to L3 in one quarter by fixing the CI/GitOps boundary alone, cutting deploy-related incidents from ~6/month to 2.
aws
devops
platform-engineering
cicd
gitops
observability
chaos-engineering
Read article
DevOps & CI/CD Part 6

Jun 10, 2026 palaniappan p 5 min

From One FIS Experiment to a Resilience Program (2026): AWS Fault Injection Service, Stop Conditions, and GameDays That Actually Change Behavior

Running one AWS FIS experiment in a demo account is not chaos engineering — it is a screenshot. A program ties experiments to SLOs, scopes blast radius with tags, halts on CloudWatch alarm stop conditions, schedules via EventBridge, and closes the loop by re-testing the fix. FIS now ships AZ Power Interruption and cross-Region connectivity scenarios in its Scenario Library. Here is the L0→L3 maturity matrix, a GameDay runbook, and a stop-condition-wired experiment skeleton.
aws
chaos-engineering
resilience
aws-fis
reliability
engineering-guide
Read article
DevOps & CI/CD

Jun 10, 2026 palaniappan p 5 min

GitOps on Amazon EKS (2026): Argo CD vs Flux, App-of-Apps, and the Decisions That Actually Bite

AWS Prescriptive Guidance says Argo CD and Flux both handle most GitOps scenarios capably — so picking one is a fit decision, not a winner. The decisions that actually cause incidents are the ones underneath: plaintext secrets in the GitOps repo, CI running kubectl apply and reintroducing drift, no App-of-Apps so onboarding is click-ops, and repo topology you can't change later. Here is the Argo CD vs Flux matrix, an App-of-Apps example, and the five traps independent of tool.
aws
eks
gitops
kubernetes
argocd
Read article
DevOps & CI/CD Part 1

Jun 9, 2026 palaniappan p 7 min

Observability Beyond CloudWatch (2026): When to Add Application Signals, ADOT, Managed Prometheus, and Grafana — and When Not To

The reflex to bolt Amazon Managed Service for Prometheus (AMP) and Grafana onto every workload is how observability bills quietly double. CloudWatch Application Signals now gives you an auto-discovered service map, SLOs, and traces with near-zero setup; AMP only earns its keep when you are PromQL-native or drowning in high-cardinality metrics, where ingestion (not retention) is the cost driver. Here is the decision matrix, an ADOT dual-export config, and the three levers that actually cut the AMP bill.
aws
observability
opentelemetry
devops
cost-optimization
engineering-guide
Read article
DevOps & CI/CD Part 1

May 29, 2026 palaniappan p 5 min

Blue/Green vs Canary on AWS (2026): ECS, Lambda, and When Rolling Is Enough

ECS CodeDeploy and Lambda aliases support both instant cutover and gradual shifts, but picking the wrong strategy costs you double Fargate spend or a 21-day MTTR on muted alarms. This decision guide scores blue/green, canary, and rolling deployments in a matrix and names replacements for App Mesh (EOL Sept 30, 2026).
codedeploy
ecs
lambda
deployments
devops
aws
engineering-guide
Read article
DevOps & CI/CD

May 1, 2026 palaniappan p 15 min

The Terraform Command Cheat Sheet for AWS Engineers (2026 Edition)

Every Terraform command you actually need on AWS — modernized for Terraform 1.10+, with deprecated commands flagged and AWS-specific gotchas for state, workspaces, providers, and the new import/removed/ephemeral primitives.
terraform
aws
iac
devops
cheat-sheet
command-reference
Read article
DevOps & CI/CD

Apr 30, 2026 palaniappan p 21 min

Terraform + Claude Skills on AWS: A Production Walkthrough (and 5 Things It Still Won't Do for You)

Anton Babenko's Terraform Claude Skill is the biggest jump in AI-assisted IaC since Copilot. We tested it on a real AWS stack — VPC, EKS, S3 + KMS, IAM — and documented exactly what it fixes, what it misses, and what AWS teams should layer on top.
terraform
claude-code
claude-skills
infrastructure-as-code
aws-devops-services
devops
ai-coding-agents
gitops
well-architected
Read article
