Deployment Strategies: Blue-Green, Canary & Rolling

The deployment strategy determines how much risk you take every time you ship code. A recreate deployment takes everything down and then brings up the new version: fine for dev, but it means downtime on every production release. A canary deployment routes a small share of traffic (for example, 1-5%) to the new version while you monitor its metrics: safer, but slower to complete. The right choice depends on your traffic volume, risk tolerance, and rollback requirements.

Strategy Comparison

| Strategy | Downtime | Risk | Rollback Speed | Cost | Complexity |
|---|---|---|---|---|---|
| Recreate | Yes (seconds to minutes) | High | Slow (redeploy) | Lowest | Lowest |
| Rolling update | Zero | Medium | Medium (roll back) | Low | Low |
| Blue-Green | Zero | Low | Instant (switch) | 2x infrastructure | Medium |
| Canary | Zero | Lowest | Instant (route back) | Low overhead | Medium-High |
| A/B Testing | Zero | Lowest | Instant | Low overhead | High |

Blue-Green Deployment

BEFORE:
Load Balancer ────────▶ Blue (v1.0) ← ALL traffic
                        Green (v1.1) ← NO traffic (ready)

SWITCH:
Load Balancer ────────▶ Green (v1.1) ← ALL traffic
                        Blue (v1.0) ← NO traffic (standby)

ROLLBACK (if needed):
Load Balancer ────────▶ Blue (v1.0) ← ALL traffic (instant!)
                        Green (v1.1) ← NO traffic
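
The switch and rollback shown above can be sketched in a few lines of Python. This is a toy model, not a real load balancer: the `BlueGreenRouter` class, its method names, and the `smoke_test` callback are all hypothetical names introduced for illustration. It shows that cutover and rollback only change a pointer, and that the idle environment should pass a check before it receives traffic.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class BlueGreenRouter:
    """Toy load balancer that sends all traffic to one environment at a time."""

    versions: Dict[str, str]   # environment name -> deployed version
    live: str = "blue"         # environment currently receiving traffic

    @property
    def idle(self) -> str:
        """The environment that is deployed but receives no traffic."""
        return "green" if self.live == "blue" else "blue"

    def switch(self, smoke_test: Callable[[str], bool]) -> bool:
        """Cut over to the idle environment only if its smoke test passes."""
        candidate = self.idle
        if not smoke_test(candidate):
            return False       # keep serving from the current environment
        self.live = candidate
        return True

    def rollback(self) -> None:
        """Point traffic back at the previous environment; nothing is redeployed."""
        self.live = self.idle

    def route(self) -> str:
        """Return the version that serves the next request."""
        return self.versions[self.live]


router = BlueGreenRouter(versions={"blue": "v1.0", "green": "v1.1"})
router.switch(smoke_test=lambda env: True)  # assume green passes its smoke test
print(router.route())                       # v1.1
router.rollback()
print(router.route())                       # v1.0
```

Because both environments stay deployed, rollback costs one pointer change. That is also why the comparison table lists 2x infrastructure as the price.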

Canary Deployment

Step 1:  [v1.0 ████████████████████] 100%
         [v1.1 ]                       0%

Step 2:  [v1.0 ██████████████████ ]  95%
         [v1.1 █]                      5%  ← Monitor metrics

Step 3:  [v1.0 ██████████████    ]   75%
         [v1.1 █████]                 25%  ← Still healthy

Step 4:  [v1.0 ██████████        ]   50%
         [v1.1 ██████████]           50%  ← Metrics stable

Step 5:  [v1.0 ]                      0%
         [v1.1 ████████████████████] 100%  ← Full rollout
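
One way to picture the percentage split is a router that hashes a stable key into one of 100 buckets and sends the lowest buckets to the canary. The sketch below assumes sticky, hash-based assignment keyed on a user ID. Real load balancers and service meshes often weight each request randomly instead, so treat `bucket` and `pick_version` as hypothetical helpers rather than any product's API.

```python
import hashlib


def bucket(request_key: str) -> int:
    """Map a key (for example a user or session ID) to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def pick_version(request_key: str, canary_weight: int) -> str:
    """Send roughly `canary_weight` percent of keys to the canary (v1.1)."""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    return "v1.1" if bucket(request_key) < canary_weight else "v1.0"


# Walk through the weights from the steps above and report the observed split.
keys = [f"user-{i}" for i in range(10_000)]
for weight in (0, 5, 25, 50, 100):
    share = sum(pick_version(k, weight) == "v1.1" for k in keys) / len(keys)
    print(f"weight={weight:>3}%  observed canary share={share:.1%}")
```

A side effect of bucketing is that a key assigned to the canary at 5% stays on the canary at 25% and 50%, so individual users do not flip between versions as the weight grows.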

Kubernetes Canary with Argo Rollouts

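# Fragment: metadata, selector, and the pod template are omitted here.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  strategy:
    canary:
      # Shift traffic in stages, pausing 10 minutes at each weight.
      steps:
        - setWeight: 5
        - pause: {duration: 10m}
        - setWeight: 25
        - pause: {duration: 10m}
        - setWeight: 50
        - pause: {duration: 10m}
        - setWeight: 100
      # Background analysis runs alongside the steps; a failed run aborts the rollout.
      analysis:
        templates:
          - templateName: success-rate   # AnalysisTemplate defined separately
        startingStep: 1                  # zero-based step index: begin at the first pause
        args:
          - name: service-name
            value: order-service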
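
To make the gating behavior concrete, here is a small Python simulation of stepping through the same weights with a pass/fail check at each one. It is not how Argo Rollouts is implemented. The 99% threshold, the `sample` callback, and the function names are assumptions added for illustration, and the sketch checks once per weight rather than continuously in the background.

```python
from typing import Callable, List, Tuple

STEPS = [5, 25, 50, 100]   # canary weights taken from the Rollout above
MIN_SUCCESS_RATE = 0.99    # assumed threshold; the source does not specify one


def success_rate(ok: int, total: int) -> float:
    """Fraction of successful requests; no traffic counts as a pass (assumption)."""
    return 1.0 if total == 0 else ok / total


def run_canary(sample: Callable[[int], Tuple[int, int]]) -> Tuple[str, List[int]]:
    """Advance through STEPS, aborting to 0% the first time the gate fails.

    `sample(weight)` is a hypothetical hook returning (successful, total)
    request counts observed for the canary at that weight.
    """
    history: List[int] = []
    for weight in STEPS:
        history.append(weight)
        ok, total = sample(weight)
        if success_rate(ok, total) < MIN_SUCCESS_RATE:
            history.append(0)  # route all traffic back to the stable version
            return "aborted", history
    return "promoted", history


# A healthy release is promoted; one that degrades at 25% is rolled back.
print(run_canary(lambda w: (999, 1000)))
print(run_canary(lambda w: (900, 1000) if w >= 25 else (1000, 1000)))
```

The point of the gate is that a failing check ends the rollout automatically, which is the difference between a canary and simply deploying to 5% and hoping.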

Decision Framework

How critical is zero-downtime?
├── Not critical (internal tools) → Recreate
└── Critical → How fast do you need rollback?
    ├── Instant → Blue-Green (if budget allows 2x infra)
    └── Fast (< 5 min) → How granular is your risk management?
        ├── Per-user targeting → A/B Testing
        ├── Percentage-based → Canary
        └── Instance-based → Rolling Update
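
The tree above translates directly into a function. The parameter names below are hypothetical. One branch needs an assumption: the tree does not say what to do when instant rollback is wanted but the budget cannot cover 2x infrastructure, so this sketch falls through to the granularity question in that case.

```python
from typing import Literal

Rollback = Literal["instant", "fast"]
Granularity = Literal["per-user", "percentage", "instance"]


def choose_strategy(
    zero_downtime_critical: bool,
    rollback: Rollback = "fast",
    budget_allows_2x_infra: bool = False,
    granularity: Granularity = "instance",
) -> str:
    """Walk the decision tree and return the name of a deployment strategy."""
    if not zero_downtime_critical:
        return "Recreate"
    if rollback == "instant" and budget_allows_2x_infra:
        return "Blue-Green"
    # Assumption: without budget for a second environment, fall back to
    # choosing by how finely risk needs to be controlled.
    return {
        "per-user": "A/B Testing",
        "percentage": "Canary",
        "instance": "Rolling Update",
    }[granularity]


print(choose_strategy(zero_downtime_critical=False))
print(choose_strategy(True, rollback="instant", budget_allows_2x_infra=True))
print(choose_strategy(True, rollback="fast", granularity="percentage"))
```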

Anti-Patterns

| Anti-Pattern | Problem | Fix |
|---|---|---|
| Recreate in production | Downtime on every deploy | Rolling, blue-green, or canary |
| Canary without metrics | Deploying to 5% but not checking if it's healthy | Automated analysis gates |
| Blue-green without testing green | Switch to untested environment | Smoke tests on green before switching |
| No rollback plan | If deploy fails, manually fix forward | Pre-defined rollback trigger and process |
| Manual deployment scripts | Human error, inconsistent | CI/CD pipeline with automated strategy |

Checklist

Deployment strategy selected based on risk tolerance
Zero-downtime deployment for all production services
Automated rollback: trigger conditions defined
Health checks: readiness probe gates deployment
Metrics monitoring during rollout (error rate, latency)
Canary analysis: automated pass/fail gates
Database migrations: backward compatible (no breaking changes)
Deployment frequency: minimum weekly, target daily
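
Two of the items above, readiness-gated deployment and automated rollback, can be illustrated together with a toy rolling update. This is a simplified model under stated assumptions: `rolling_update` and the `is_ready` callback are hypothetical names, instances are plain version strings, and rollback restores the whole fleet at once rather than batch by batch.

```python
from typing import Callable, List


def rolling_update(
    instances: List[str],
    new_version: str,
    is_ready: Callable[[int, str], bool],
    batch_size: int = 1,
) -> List[str]:
    """Replace instances batch by batch, gated by a readiness check.

    Returns the updated fleet, or the original fleet if any batch
    fails its readiness check (the rollback trigger).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    original = list(instances)
    fleet = list(instances)
    for start in range(0, len(fleet), batch_size):
        batch = range(start, min(start + batch_size, len(fleet)))
        for i in batch:
            fleet[i] = new_version
        # Do not touch the next batch until this one reports ready.
        if not all(is_ready(i, new_version) for i in batch):
            return original  # roll back: restore every instance
    return fleet


fleet = ["v1.0"] * 4
print(rolling_update(fleet, "v1.1", is_ready=lambda i, v: True, batch_size=2))
print(rolling_update(fleet, "v1.1", is_ready=lambda i, v: i != 2, batch_size=2))
```

In the second call, instance 2 never becomes ready, so the function returns the untouched v1.0 fleet instead of a half-updated one.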

:::note[Source] This guide is derived from operational intelligence at Garnet Grid Consulting. For deployment strategy consulting, visit garnetgrid.com. :::
