Deployment Strategies
Recreate (Big Bang)
Stop everything. Deploy new version. Start everything.
v1 running
↓ stop all
↓ deploy v2
↓ start all
v2 running
Downtime: yes — gap between stop and start. Risk: high — if v2 is broken, service is down until rollback. Use case: dev environments, internal tools where downtime is acceptable.
Rolling Deployment
Replace instances one at a time (or in small batches). At any moment, old and new versions run simultaneously.
[v1][v1][v1][v1]
[v2][v1][v1][v1] ← first batch updated
[v2][v2][v1][v1] ← second batch
[v2][v2][v2][v2] ← complete
Downtime: none. Risk: medium — brief period with mixed versions in production. Requirement: API must be backwards compatible during transition.
# Kubernetes rolling update
spec:
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0 # never reduce capacity
Blue-Green Deployment
Two identical environments: blue (current) and green (new). Switch traffic from blue to green atomically.
Traffic → [Blue v1] (live) [Green v2] (idle)
↓ deploy to green
Traffic → [Blue v1] (idle) [Green v2] (new)
↓ switch load balancer
Traffic → [Green v2] (live)
Downtime: none — switch is instantaneous. Rollback: instant — redirect traffic back to blue. Cost: requires double infrastructure capacity. Use case: production deploys requiring zero-downtime and instant rollback.
Canary Deployment
Route a small percentage of traffic to the new version. Gradually increase.
100% → v1
95% → v1, 5% → v2 (canary phase)
80% → v1, 20% → v2 (expanding)
0% → v1, 100% → v2 (complete)
Risk: only a fraction of users hit the new version. Impact of bugs is limited. Requirement: good monitoring and automated rollback trigger on error rate spike.
# Kubernetes with Argo Rollouts
spec:
strategy:
canary:
steps:
- setWeight: 5
- pause: {duration: 5m}
- setWeight: 20
- pause: {duration: 10m}
- setWeight: 100
analysis:
templates:
- templateName: error-rate
args:
- name: max-error-rate
value: "1"
Feature Flags
Decouple deployment from feature release. Code ships to production with the feature disabled. Enable via flag — no deploy.
Deploy: feature code merged, flag=OFF → production (feature invisible to users)
Release: flag=ON → feature live (no deploy needed)
from feature_flags import flags
def create_order(user_id: str, items: list) -> dict:
if flags.is_enabled("new_checkout_flow", user_id=user_id):
return new_checkout_service.create(user_id, items)
return legacy_checkout_service.create(user_id, items)
Benefits: - Instant rollback: flip flag off, feature disabled in milliseconds - Gradual rollout: enable for 5% → 20% → 100% of users - A/B testing: measure metrics between flag-on and flag-off cohorts - Kill switches: disable expensive features under load
Deployment Strategy Selection Guide
| Situation | Strategy |
|---|---|
| Dev / internal tools | Recreate |
| Standard production service | Rolling |
| Zero-downtime required, instant rollback | Blue-Green |
| High risk change, need gradual exposure | Canary |
| Decouple feature release from deploy | Feature Flags |
| All of the above at scale | Canary + Feature Flags |