The 2026 DevOps Tool Stack That Actually Ships
After deploying production infrastructure for years, here's the boring, working stack we've converged on. Nothing trendy. Nothing experimental. Every choice has earned its place by surviving incidents.
Compute
DigitalOcean Droplets for managed-friendly workloads. Hetzner for cost-sensitive ones. AWS only when compliance forces it. The headline argument for AWS, 'feature breadth', only matters at scales most teams will never hit. The hidden tax of AWS (NAT Gateway, EBS, egress, EKS control plane) is real and lands on every monthly bill.
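A quick back-of-envelope calculation shows how those four line items add up. The sketch below is only an illustration: every rate in `PLACEHOLDER_RATES` is a made-up placeholder, not a real AWS price, and the `MonthlyUsage` fields are hypothetical names chosen for this example. Substitute the figures from your own bill.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    nat_gateways: int         # NAT Gateways running all month
    nat_gb_processed: float   # GB pushed through those gateways
    ebs_gb: float             # provisioned EBS storage, in GB
    egress_gb: float          # GB sent out to the internet
    eks_clusters: int         # EKS control planes


# PLACEHOLDER rates for illustration only. These are NOT real AWS prices.
PLACEHOLDER_RATES = {
    "nat_gateway_month": 30.0,
    "nat_per_gb": 0.05,
    "ebs_per_gb": 0.10,
    "egress_per_gb": 0.10,
    "eks_cluster_month": 70.0,
}


def hidden_tax(usage, rates):
    """Return a per-item breakdown of the monthly 'hidden' charges."""
    return {
        "nat": usage.nat_gateways * rates["nat_gateway_month"]
        + usage.nat_gb_processed * rates["nat_per_gb"],
        "ebs": usage.ebs_gb * rates["ebs_per_gb"],
        "egress": usage.egress_gb * rates["egress_per_gb"],
        "eks": usage.eks_clusters * rates["eks_cluster_month"],
    }


if __name__ == "__main__":
    usage = MonthlyUsage(
        nat_gateways=2, nat_gb_processed=1000, ebs_gb=500,
        egress_gb=2000, eks_clusters=1,
    )
    breakdown = hidden_tax(usage, PLACEHOLDER_RATES)
    for item, cost in breakdown.items():
        print(f"{item:>7}: ${cost:,.2f}")
    print(f"  total: ${sum(breakdown.values()):,.2f}")
```

Keeping the breakdown per item rather than returning one total makes it easier to see which charge dominates before deciding whether a move off AWS is worth it.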
Orchestration
Nomad if the team is small. Kubernetes if you're committed. Most teams using Kubernetes adopted it prematurely: they have 5 services and a service mesh. Nomad runs the same containers with, in our experience, about a tenth of the operational load. The reason most teams pick K8s is the talent pool argument: hiring K8s engineers is easier than hiring Nomad engineers. Fair.
CI/CD
GitHub Actions, with self-hosted runners on Hetzner for high-volume teams. The draw is the marketplace, the integration with the platform you already live in, and the OIDC-to-AWS pattern that eliminates long-lived secrets. Not perfect, but boring in the right way.
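The OIDC-to-AWS pattern comes down to an IAM role whose trust policy accepts tokens issued by GitHub for one specific repository and branch. Here is a minimal sketch of building that trust policy document. The account ID, repository name, and the helper name `github_oidc_trust_policy` are hypothetical; the sketch also assumes the GitHub OIDC identity provider has already been registered in the AWS account.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id, repo, branch="main"):
    """Build an IAM trust policy that lets one repo/branch assume a role.

    account_id and repo are caller-supplied; nothing here calls AWS.
    """
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Token must be minted for AWS STS...
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        # ...and come from this exact repo and branch.
                        f"{GITHUB_OIDC_HOST}:sub": (
                            f"repo:{repo}:ref:refs/heads/{branch}"
                        ),
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical account and repository, for illustration only.
    policy = github_oidc_trust_policy("123456789012", "example-org/example-repo")
    print(json.dumps(policy, indent=2))
```

Pinning the `sub` claim to a single repository and branch is the important part: a trust policy that only checks `aud` would let any GitHub repository assume the role.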
Observability
Prometheus + Grafana + Loki + Tempo on a $20/mo VPS. If your Datadog bill is over $1,000/mo, you can likely run this stack for $20-100/mo and have a better experience. The catch is cardinality discipline: if you don't enforce it, the metric store will eat your VPS. We have a separate piece on the cardinality runbook.
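Cardinality discipline is easier to enforce with a number attached. Each unique combination of label values is a separate time series, so the product of the distinct values per label gives a worst-case series count for a metric. The sketch below applies that arithmetic; the metric names, label counts, and the budget of 10,000 series are hypothetical, and the function names are ours rather than part of any Prometheus API.

```python
from math import prod


def estimated_series(label_cardinalities):
    """Worst-case series count: product of distinct values per label.

    A metric with no labels is a single series (prod of nothing is 1).
    """
    return prod(label_cardinalities.values())


def over_budget(metrics, budget):
    """Return {metric_name: estimated_series} for metrics above the budget."""
    flagged = {}
    for name, labels in metrics.items():
        count = estimated_series(labels)
        if count > budget:
            flagged[name] = count
    return flagged


if __name__ == "__main__":
    # Hypothetical metrics: label name -> number of distinct values.
    metrics = {
        "http_requests_total": {"method": 5, "status": 10, "route": 40},
        "http_requests_by_user": {"user_id": 50_000, "route": 40},
    }
    for name, count in over_budget(metrics, budget=10_000).items():
        print(f"{name}: up to {count:,} series")
```

The estimate is an upper bound, since not every label combination occurs in practice, but it is usually enough to catch an unbounded label such as a user ID before it reaches production.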
IaC
OpenTofu, with state in S3-compatible storage. Terraform's BSL relicense was a forcing function. Two years on, OpenTofu has shipped real features Terraform hasn't, client-side state encryption among them. Migration is mechanical for most projects.
Secrets
SOPS + age for small teams. AWS Secrets Manager / GCP Secret Manager for mid-size. Vault when the compliance pressure justifies it. Most teams never need Vault. The honest answer is the cloud provider's secret manager, with a clean rotation cadence.
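A rotation cadence only stays clean if something checks it. As a rough sketch, the function below takes an inventory of last-rotation dates and reports which secrets are overdue. The inventory, the secret names, and the 90-day default are all hypothetical; in practice the dates would come from your secret manager's metadata rather than a hand-written dict.

```python
from datetime import date


def overdue_secrets(last_rotated, today, max_age_days=90):
    """Return names of secrets rotated more than max_age_days ago, oldest first.

    last_rotated maps secret name -> date of last rotation.
    """
    overdue = [
        (rotated, name)
        for name, rotated in last_rotated.items()
        if (today - rotated).days > max_age_days
    ]
    return [name for _rotated, name in sorted(overdue)]


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = {
        "db-password": date(2026, 1, 10),
        "stripe-key": date(2025, 9, 1),
        "jwt-signing": date(2025, 6, 15),
    }
    for name in overdue_secrets(inventory, today=date(2026, 2, 1)):
        print(f"rotate: {name}")
```

Passing `today` explicitly instead of reading the clock inside the function keeps the check deterministic and easy to run in CI.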
DNS / CDN / WAF
Cloudflare. Free tier is the right answer for most teams. The bottom of the CDN/DNS/WAF market got eaten and it's been good for everyone. Workers for edge logic, Tunnels to replace VPNs, Pages for static sites — Cloudflare's surface area in 2026 is enormous and most of it is free.
Database
Postgres. Always. Greenfield. Existing project that's outgrowing its NoSQL choice. Edge-distributed system. The answer is Postgres. Built-in types and extensions cover the cases that used to send teams elsewhere: jsonb where the answer was 'but I need MongoDB', pg_trgm + GIN indexes where it was 'but I need Elasticsearch' (for most use cases). Vector DB? pgvector. Geospatial? PostGIS.
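To make three of those cases concrete, here is a sketch of the SQL involved, held in a small Python module so the statements can be fed to whatever migration tool you use. The `docs` table, its columns, the index names, and the 3-dimensional vector are hypothetical. The Python only assembles the statements; running them needs a Postgres server with the pg_trgm and pgvector extensions available. PostGIS is left out for brevity.

```python
# Schema statements, in order. No statement contains its own semicolon;
# migration_script() adds them.
SCHEMA = [
    "CREATE EXTENSION IF NOT EXISTS pg_trgm",
    "CREATE EXTENSION IF NOT EXISTS vector",
    """
    CREATE TABLE IF NOT EXISTS docs (
        id        bigserial PRIMARY KEY,
        title     text  NOT NULL,
        body      jsonb NOT NULL,
        embedding vector(3)
    )
    """,
    # Document-store lookups: GIN index for jsonb containment (@>).
    "CREATE INDEX IF NOT EXISTS docs_body_gin ON docs USING gin (body jsonb_path_ops)",
    # Fuzzy text search: trigram GIN index, used by LIKE / ILIKE '%...%'.
    "CREATE INDEX IF NOT EXISTS docs_title_trgm ON docs USING gin (title gin_trgm_ops)",
]

# One example query per use case.
QUERIES = {
    "document": "SELECT id FROM docs WHERE body @> '{\"status\": \"active\"}'",
    "search": "SELECT id FROM docs WHERE title ILIKE '%postgres%'",
    # <-> is pgvector's Euclidean (L2) distance operator.
    "vector": "SELECT id FROM docs ORDER BY embedding <-> '[0.1, 0.2, 0.3]' LIMIT 5",
}


def migration_script(statements=SCHEMA):
    """Join the schema statements into one semicolon-terminated script."""
    return ";\n".join(" ".join(s.split()) for s in statements) + ";\n"


if __name__ == "__main__":
    print(migration_script())
    for use_case, sql in QUERIES.items():
        print(f"-- {use_case}\n{sql};")
```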
Cache / Queue
Redis. Always. Cache, sessions, rate limiters, queues, leaderboards, pub/sub. Memcached can be faster for pure caching, but the operational simplicity of one tool that does five jobs wins.
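As one example of the "one tool, many jobs" point, a fixed-window rate limiter needs only two Redis commands, INCR and EXPIRE. The sketch below is written against any client object exposing `incr(key)` and `expire(key, seconds)`, which redis-py's `Redis` client does. The `InMemoryClient` class is a stand-in we made up so the example runs without a server, and the key format is our own choice.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` calls per identity per fixed time window."""

    def __init__(self, client, limit, window_seconds):
        self.client = client
        self.limit = limit
        self.window_seconds = window_seconds

    def allow(self, identity, now=None):
        now = time.time() if now is None else now
        # All calls in the same window share one counter key.
        window = int(now // self.window_seconds)
        key = f"ratelimit:{identity}:{window}"
        count = self.client.incr(key)
        if count == 1:
            # First hit in this window: let the key clean itself up.
            self.client.expire(key, self.window_seconds)
        return count <= self.limit


class InMemoryClient:
    """Stand-in for a Redis client, for local runs only. Not thread-safe."""

    def __init__(self):
        self.store = {}

    def incr(self, key):
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]

    def expire(self, key, seconds):
        # Expiry is not simulated; window-scoped keys make that harmless here.
        return True


if __name__ == "__main__":
    limiter = FixedWindowLimiter(InMemoryClient(), limit=2, window_seconds=60)
    for t in (0, 1, 2, 61):
        print(t, limiter.allow("user-42", now=t))
```

Against a real server the INCR and EXPIRE calls are two round trips and not atomic together, so a crash between them can leave a counter without a TTL. Because the window number is part of the key, such a stray key never blocks later windows; it only lingers until evicted.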
The Pattern
Notice what's missing: no service mesh, no event-streaming platform, no graph database, no dedicated time-series database beyond the one built into Prometheus, no message broker beyond Redis. That's not because those tools are bad; it's because they're additive complexity that most teams don't need. The 2026 stack that ships is shorter than the 2022 stack. The maturity of the boring tools means you carry less.
What We Got Wrong in 2022
- Started on Kubernetes too early. Service count was 5; we should have stayed on Docker Compose for 2 more years.
- Chose Datadog because everyone did. Bill hit $4k/mo before we audited it. Replaced it with the Prometheus stack in two days.
- Used MongoDB because it was 'easier to start with.' Migrated to Postgres when the second engineer joined. Should have started there.
What We're Watching in 2026
- Kamal 2 (formerly MRSK) for teams that want bare-metal Docker deploys without K8s.
- Cloudflare Containers and Workers as a serverless contender that doesn't lock you into AWS.
- vLLM multi-LoRA for self-hosted LLM inference at scale.
- OpenTofu's roadmap — if the velocity holds, the Terraform-vs-OpenTofu question is closed by end of year.
The Honest Conclusion
The DevOps stack that ships in 2026 looks more like the stack that shipped in 2018 than the stack that shipped in 2022. The hype cycle on service meshes, event-driven everything, and microservices-by-default has corrected. The boring tools won. The ones that ship are the ones that don't fight you at 3am.
This is part of the DevOps Ninja cornerstone series. Honest critique welcome.