Container Orchestration: Docker Compose vs Kubernetes vs Nomad vs Swarm
Container orchestration is the most over-engineered decision in modern software. Teams running three containers reach for Kubernetes. Startups with one server set up Helm charts. The right tool depends on your scale, and most projects need far less than they think. This guide covers the spectrum from Docker Compose to Kubernetes, with honest advice about where each tool makes sense.
The Orchestration Spectrum
| Tool | Complexity | Best Scale | Learning Curve | Operational Overhead |
|---|---|---|---|---|
| Docker Compose | Low | 1 server, 1-20 containers | Hours | Minimal |
| Docker Swarm | Low-Medium | 2-10 servers | Days | Low |
| HashiCorp Nomad | Medium | 5-500 servers | Weeks | Medium |
| Kubernetes (K8s) | High | 10-10,000+ servers | Months | High |
The key insight: you can go very far with Docker Compose. Most startups and small-to-medium applications don't need anything more. Orchestration tools solve problems of scale -- if you don't have scale problems, the tool adds complexity without giving you anything in return.
Docker Compose: The Practical Default
Docker Compose manages multi-container applications on a single host. It's the right choice for most development environments and many production deployments.
Production-Ready Docker Compose
# docker-compose.yml
services:
app:
build:
context: .
dockerfile: Dockerfile
target: production
ports:
- "3000:3000"
environment:
- DATABASE_URL=postgresql://app:secret@db:5432/myapp
- REDIS_URL=redis://redis:6379
- NODE_ENV=production
depends_on:
db:
condition: service_healthy
redis:
condition: service_healthy
restart: unless-stopped
deploy:
resources:
limits:
cpus: "2.0"
memory: 1G
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
worker:
build:
context: .
dockerfile: Dockerfile
target: production
command: ["bun", "run", "worker.ts"]
environment:
- DATABASE_URL=postgresql://app:secret@db:5432/myapp
- REDIS_URL=redis://redis:6379
depends_on:
db:
condition: service_healthy
redis:
condition: service_healthy
restart: unless-stopped
deploy:
replicas: 2
resources:
limits:
cpus: "1.0"
memory: 512M
db:
image: postgres:16-alpine
volumes:
- postgres_data:/var/lib/postgresql/data
- ./init.sql:/docker-entrypoint-initdb.d/init.sql
environment:
- POSTGRES_USER=app
- POSTGRES_PASSWORD=secret
- POSTGRES_DB=myapp
restart: unless-stopped
healthcheck:
test: ["CMD-SHELL", "pg_isready -U app -d myapp"]
interval: 10s
timeout: 5s
retries: 5
deploy:
resources:
limits:
cpus: "2.0"
memory: 2G
redis:
image: redis:7-alpine
volumes:
- redis_data:/data
command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
restart: unless-stopped
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 5
caddy:
image: caddy:2-alpine
ports:
- "80:80"
- "443:443"
volumes:
- ./Caddyfile:/etc/caddy/Caddyfile
- caddy_data:/data
- caddy_config:/config
depends_on:
- app
restart: unless-stopped
volumes:
postgres_data:
redis_data:
caddy_data:
caddy_config:
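One caveat: the credentials above are hardcoded for readability. In practice you'd move them into a git-ignored .env file, which Compose interpolates into the compose file automatically; a minimal sketch:
# .env -- lives next to docker-compose.yml, kept out of version control
POSTGRES_PASSWORD=change-me
# then reference it in docker-compose.yml instead of the literal value:
#   - DATABASE_URL=postgresql://app:${POSTGRES_PASSWORD}@db:5432/myapp
#   - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}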
Multi-Stage Dockerfile
# Dockerfile
FROM oven/bun:1 AS base
WORKDIR /app
# Install dependencies
FROM base AS deps
COPY package.json bun.lockb ./
RUN bun install --frozen-lockfile --production
# Build
FROM base AS build
COPY package.json bun.lockb ./
RUN bun install --frozen-lockfile
COPY . .
RUN bun run build
# Production
FROM base AS production
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json ./
USER bun
EXPOSE 3000
CMD ["bun", "run", "dist/server.js"]
Zero-Downtime Deploys with Docker Compose
#!/bin/bash
# deploy.sh -- near-zero-downtime deploy with Docker Compose
# Assumes the app service has a healthcheck and exactly one replica running.
set -euo pipefail
echo "Building new image..."
docker compose build app
echo "Starting a second app container on the new image..."
OLD_CONTAINER=$(docker compose ps -q app | head -n 1)
docker compose up -d --no-deps --no-recreate --scale app=2 app
echo "Waiting for the new container to report healthy..."
NEW_CONTAINER=$(docker compose ps -q app | grep -v "$OLD_CONTAINER")
until [ "$(docker inspect -f '{{.State.Health.Status}}' "$NEW_CONTAINER")" = "healthy" ]; do
  sleep 2
done
echo "Removing the old container..."
docker stop "$OLD_CONTAINER" && docker rm "$OLD_CONTAINER"
echo "Deploy complete."
When Docker Compose Is Enough
- Single server with 1-20 containers
- Startups and small teams (under 10 engineers)
- Applications with < 10,000 concurrent users (depending on workload)
- Development and staging environments for any size project
- Side projects and internal tools
When You've Outgrown Docker Compose
- You need to run across multiple servers for high availability
- A single server can't handle your traffic
- You need automatic failover when a server dies
- You need to scale specific services independently across machines
Docker Swarm: Multi-Host Without the Complexity
Docker Swarm is Docker's built-in orchestration. It extends Docker Compose syntax to work across multiple servers. If you've outgrown a single server but aren't ready for Kubernetes, Swarm is the gentlest next step.
Setting Up a Swarm
# On the manager node
docker swarm init --advertise-addr 10.0.1.1
# On worker nodes (using the token from the init output)
docker swarm join --token SWMTKN-1-xxx 10.0.1.1:2377
# Check cluster status
docker node ls
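Two day-two commands worth knowing once nodes have joined (the node name is illustrative):
# Drain a node before maintenance -- Swarm reschedules its tasks elsewhere
docker node update --availability drain worker-1
# Return it to service afterwards
docker node update --availability active worker-1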
Deploying a Stack
# stack.yml (Swarm-compatible Compose file)
services:
app:
image: registry.example.com/myapp:latest
deploy:
replicas: 3
update_config:
parallelism: 1
delay: 30s
order: start-first # New container starts before old one stops
failure_action: rollback
rollback_config:
parallelism: 1
restart_policy:
condition: on-failure
max_attempts: 3
placement:
constraints:
- node.role == worker
resources:
limits:
cpus: "1.0"
memory: 512M
ports:
- "3000:3000"
networks:
- app-network
db:
image: postgres:16-alpine
deploy:
replicas: 1
placement:
constraints:
- node.labels.db == true # Pin to specific node
volumes:
- postgres_data:/var/lib/postgresql/data
networks:
- app-network
networks:
app-network:
driver: overlay
volumes:
postgres_data:
# Deploy the stack
docker stack deploy -c stack.yml myapp
# Check service status
docker service ls
docker service ps myapp_app
# Scale a service
docker service scale myapp_app=5
# Rolling update
docker service update --image registry.example.com/myapp:v2 myapp_app
# Rollback
docker service update --rollback myapp_app
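Note that the db service above is pinned with node.labels.db == true, and that label has to be set by hand before the stack will schedule it; a sketch (node name is illustrative):
# Label the node that should host Postgres
docker node update --label-add db=true worker-1
# Verify the label took
docker node inspect worker-1 --format '{{ .Spec.Labels }}'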
Swarm Strengths
- Uses Docker Compose syntax -- minimal learning curve
- Built into Docker -- no additional software to install
- Built-in load balancing via routing mesh
- Rolling updates and rollbacks out of the box
- Service discovery via DNS
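The routing mesh and DNS discovery above are easy to see in practice; a quick sketch (node IPs match the earlier swarm init example and are illustrative):
# The routing mesh publishes port 3000 on every node in the swarm, so any
# node's IP reaches a healthy replica -- even nodes not running the task
curl http://10.0.1.1:3000/health
curl http://10.0.1.2:3000/health
# On the overlay network, services resolve each other by name via Swarm's
# built-in DNS, so the app reaches Postgres at just "db:5432"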
Swarm Limitations
- Limited ecosystem -- fewer tools and integrations than Kubernetes
- No autoscaling -- must scale manually or script it
- Declining community -- Docker has shifted focus to Desktop and Hub
- Simpler scheduling -- no pod affinity, taints, or tolerations
- No native ingress -- need to add Traefik or similar
When to Choose Swarm
- You know Docker Compose and need multi-host deployment
- 2-10 servers is your target scale
- You want the simplest possible multi-host orchestration
- Your team doesn't have Kubernetes expertise and doesn't want to invest in it
HashiCorp Nomad: The Middle Ground
Nomad occupies the space between Docker Swarm and Kubernetes. It's simpler than K8s but more capable than Swarm. It's also not Docker-specific -- it can orchestrate containers, VMs, Java apps, and raw binaries.
Nomad Job Specification
# web-app.nomad.hcl
job "web-app" {
datacenters = ["dc1"]
type = "service"
group "app" {
count = 3
network {
port "http" {
to = 3000
}
}
service {
name = "web-app"
port = "http"
tags = ["urlprefix-/"]
check {
type = "http"
path = "/health"
interval = "10s"
timeout = "5s"
}
}
task "server" {
driver = "docker"
config {
image = "registry.example.com/myapp:latest"
ports = ["http"]
}
resources {
cpu = 500 # MHz
memory = 512 # MB
}
env {
NODE_ENV = "production"
DATABASE_URL = "postgresql://app:[email protected]:5432/myapp"
}
template {
# Pull secrets from Vault
data = <<EOF
{{ with secret "secret/data/myapp" }}
JWT_SECRET={{ .Data.data.jwt_secret }}
STRIPE_KEY={{ .Data.data.stripe_key }}
{{ end }}
EOF
destination = "secrets/env"
env = true
}
}
update {
max_parallel = 1
min_healthy_time = "30s"
healthy_deadline = "5m"
auto_revert = true
canary = 1
}
scaling {
enabled = true
min = 2
max = 10
policy {
# Auto-scale based on CPU
check "cpu" {
source = "nomad-apm"
query = "avg_cpu"
strategy "target-value" {
target = 70
}
}
}
}
}
}
Running Nomad
# Deploy a job
nomad job run web-app.nomad.hcl
# Check status
nomad job status web-app
# View allocations (running instances)
nomad alloc status <alloc-id>
# View logs
nomad alloc logs <alloc-id>
# Scale manually
nomad job scale web-app app 5
# Rolling update (just run with updated image)
nomad job run web-app.nomad.hcl
# Plan (dry run)
nomad job plan web-app.nomad.hcl
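One gap in the commands above: because the update block sets canary = 1, a new version pauses after the canary starts until you promote it. A sketch (the deployment ID is illustrative):
# Find the deployment that is waiting on the canary
nomad deployment list
# Promote the canary so the rolling update continues
nomad deployment promote 4a2c6bcd
# If the canary never becomes healthy, auto_revert rolls back instead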
Nomad Strengths
- Simpler than Kubernetes -- one binary, less configuration
- Multi-workload -- containers, VMs, Java JARs, raw binaries
- HashiCorp ecosystem -- integrates with Consul (service mesh) and Vault (secrets)
- Autoscaling -- scaling policies are first-class in the job spec (horizontal scaling is enforced by the official Nomad Autoscaler agent)
- Canary deployments -- first-class support
- Single binary -- easy to deploy, no etcd or API server dependencies
Nomad Limitations
- Smaller ecosystem than Kubernetes (fewer operators, tools, integrations)
- Less community support -- fewer Stack Overflow answers, fewer blog posts
- No built-in ingress controller -- need Consul Connect, Traefik, or similar
- HashiCorp licensing -- moved to the Business Source License (BSL) in 2023, so it's no longer open-source by the OSI definition
When to Choose Nomad
- 5-100+ servers, and you want simpler operations than K8s
- You run mixed workloads (not just containers)
- You're already in the HashiCorp ecosystem (Vault, Consul, Terraform)
- You want canary deployments and autoscaling without K8s complexity
Kubernetes: The Industry Standard
Kubernetes is the most powerful container orchestration system. It handles any scale, has a massive ecosystem, and every major cloud provider offers a managed version. The trade-off is significant complexity -- in configuration, operations, and mental overhead.
Basic Kubernetes Resources
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-app
labels:
app: web-app
spec:
replicas: 3
selector:
matchLabels:
app: web-app
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
template:
metadata:
labels:
app: web-app
spec:
containers:
- name: app
image: registry.example.com/myapp:v1.2.3
ports:
- containerPort: 3000
env:
- name: NODE_ENV
value: "production"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: app-secrets
key: database-url
resources:
requests:
cpu: 250m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
readinessProbe:
httpGet:
path: /health
port: 3000
initialDelaySeconds: 10
periodSeconds: 5
livenessProbe:
httpGet:
path: /health
port: 3000
initialDelaySeconds: 30
periodSeconds: 10
# service.yaml
apiVersion: v1
kind: Service
metadata:
name: web-app
spec:
selector:
app: web-app
ports:
- port: 80
targetPort: 3000
type: ClusterIP
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: web-app
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
ingressClassName: nginx
tls:
- hosts:
- app.example.com
secretName: app-tls
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: web-app
port:
number: 80
# hpa.yaml -- Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: web-app
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
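The Deployment above pulls DATABASE_URL from a Secret named app-secrets, which must exist before the pods can start; a sketch of creating it (the value is a placeholder):
# Create the Secret the Deployment references
kubectl create secret generic app-secrets \
  --from-literal=database-url='postgresql://app:secret@db:5432/myapp'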
Essential kubectl Commands
# View resources
kubectl get pods
kubectl get services
kubectl get deployments
kubectl get ingress
# Describe a resource (detailed info + events)
kubectl describe pod web-app-7d8f6c9b4-x2k4l
# View logs
kubectl logs web-app-7d8f6c9b4-x2k4l
kubectl logs -f web-app-7d8f6c9b4-x2k4l # Follow
kubectl logs --previous web-app-7d8f6c9b4-x2k4l # Previous crash
# Execute a command in a pod
kubectl exec -it web-app-7d8f6c9b4-x2k4l -- /bin/sh
# Apply configuration
kubectl apply -f deployment.yaml
# Scale
kubectl scale deployment web-app --replicas=5
# Rollback
kubectl rollout undo deployment web-app
kubectl rollout status deployment web-app
kubectl rollout history deployment web-app
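One routine operation missing above is rolling out a new image version; a sketch (the tag is illustrative):
# Point the deployment at a new image and watch the rolling update
kubectl set image deployment/web-app app=registry.example.com/myapp:v1.2.4
kubectl rollout status deployment web-app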
Managed Kubernetes Services
| Service | Provider | Strengths |
|---|---|---|
| EKS | AWS | Deep AWS integration, most popular |
| GKE | Google Cloud | Best managed K8s experience, Autopilot mode |
| AKS | Azure | Good for Microsoft-heavy shops |
| DigitalOcean K8s | DigitalOcean | Simplest managed K8s, lower cost |
| Linode K8s | Akamai | Simple, affordable |
Strong recommendation: Use managed Kubernetes. Running your own control plane is a full-time job. GKE Autopilot or EKS with Fargate removes even more operational burden by managing the worker nodes.
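For a sense of how little setup managed Kubernetes takes, creating a GKE Autopilot cluster is roughly two commands (cluster name and region are placeholders):
# Create an Autopilot cluster -- Google manages control plane and nodes
gcloud container clusters create-auto my-cluster --region=us-central1
# Wire up kubectl
gcloud container clusters get-credentials my-cluster --region=us-central1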
Kubernetes Strengths
- Ecosystem: Thousands of operators, tools, and integrations
- Cloud-native standard: Every cloud provider, every CI/CD tool, every monitoring tool supports K8s
- Self-healing: Automatically restarts crashed containers, reschedules on node failure
- Advanced scheduling: Affinity, anti-affinity, taints, tolerations, topology spread
- Extensibility: Custom Resource Definitions (CRDs) extend the API for anything
Kubernetes Weaknesses
- Complexity: YAML verbosity, multiple resource types, networking model
- Learning curve: Months to become proficient, years to master
- Operational overhead: Even managed K8s requires significant expertise
- Resource consumption: The control plane and system pods consume resources
- Overkill for small workloads: Running 3 pods on a 3-node cluster is wasteful
Decision Framework
How many servers do you need?
1 server:
→ Docker Compose (done)
2-5 servers:
→ Docker Swarm (if you want simplicity)
→ Nomad (if you want more features)
5-50 servers:
→ Nomad (if you want simpler ops)
→ Managed Kubernetes (if you want ecosystem/hiring)
50+ servers:
→ Kubernetes (managed, preferably)
Special cases:
Mixed workloads (not just containers): → Nomad
Already in HashiCorp ecosystem: → Nomad
Team has K8s experience: → Kubernetes at any scale
Startup with 3 engineers: → Docker Compose until it hurts
The "Until It Hurts" Philosophy
Start with the simplest tool that works:
- Start with Docker Compose on a single server
- When you need high availability or more capacity, move to Docker Swarm or Nomad
- When you need advanced scheduling, ecosystem tooling, or your team grows, move to Kubernetes
Each migration is a step function in complexity. Don't jump to step 3 because "we might need it someday." The cost of running Kubernetes when you don't need it is real: more YAML, more debugging, more specialized knowledge required, and more things that can break.
Lightweight Kubernetes Distributions
If you do need Kubernetes but want lower overhead, these distributions are simpler to run:
| Distribution | Best For | Notable Feature |
|---|---|---|
| k3s | Edge, small clusters | Single binary under 100 MB |
| k0s | Production-ready minimal K8s | Zero friction install |
| MicroK8s | Developer workstations | Snap-based, easy add-ons |
| kind | CI/CD testing | Runs K8s in Docker containers |
| minikube | Local development | Multiple driver options |
# Install k3s (production-ready K8s in 30 seconds)
curl -sfL https://get.k3s.io | sh -
# Check it's running
kubectl get nodes
# k3s includes:
# - Traefik (ingress)
# - CoreDNS (service discovery)
# - Flannel (networking)
# - Local storage provisioner
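If you'd rather bring your own ingress or other components, the bundled ones can be switched off at install time; a sketch:
# Install k3s without the bundled Traefik ingress
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable=traefik" sh -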
Summary
Container orchestration exists on a spectrum, and the right tool depends on your scale, not your ambition. Docker Compose handles far more than most teams realize -- a single well-provisioned server with Compose can serve thousands of concurrent users. Docker Swarm and Nomad occupy the middle ground for teams that need multi-host without Kubernetes complexity. Kubernetes is the right choice at genuine scale or when you need its ecosystem, but it's the wrong choice if you're three engineers running five containers. Start simple, and add complexity only when the pain of the current tool exceeds the cost of migration.