Feature Flags at Scale: Decoupling Deployment from Release
Implement feature flags that let you ship code to production without exposing it to users until you are ready. Covers flag lifecycle, targeting strategies, percentage rollouts, kill switches, technical debt management, and the operational practices that prevent flag sprawl.
The most powerful deployment pattern is also the simplest concept: deploy your code to production, but hide it behind a flag. If the feature works, turn the flag on. If it breaks, turn the flag off. No rollback, no hotfix, no 2 AM deployment. Just flip a switch.
Feature flags decouple deployment (putting code on servers) from release (making features visible to users). This distinction changes how your team thinks about risk: every deployment becomes boring because the risk is managed by the flag, not the deployment.
Flag Types
| Type | Lifespan | Purpose | Example |
|---|---|---|---|
| Release flag | Days to weeks | Ship incomplete features safely | new_checkout_flow |
| Experiment flag | Weeks | A/B testing | experiment_pricing_page_v2 |
| Ops flag | Permanent | Kill switches, circuit breakers | enable_recommendation_engine |
| Permission flag | Permanent | Feature entitlements, tiers | premium_analytics_dashboard |
Flag Lifecycle
Created → Developing → Testing → Rolling Out → Fully On → Removed

| Stage | What happens |
|---|---|
| Created → Developing | Code sits behind the flag, off by default |
| Testing | QA uses the flag to test both states |
| Rolling Out | 1% → 10% → 50% → 100% of users |
| Fully On | Flag removed from code, flag deleted from system |
| Removed | Code lives without the flag |
The most important phase is “Removed.” Every flag that stays in the code after it is fully rolled out is technical debt: a branch that must still be read and tested even though only one side of it ever runs. If you launch 50 features a year and never remove flags, you accumulate 50 dead flags a year, each adding branching complexity throughout your codebase.
Implementation Patterns
Basic Flag Check
# Simple boolean flag
def get_recommendations(user_id: str):
    if feature_flags.is_enabled("ml_recommendations", user_id=user_id):
        return recommendation_engine.get_ml_recommendations(user_id)
    else:
        return recommendation_engine.get_rule_based_recommendations(user_id)
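The snippet above assumes a `feature_flags` client exists. As a minimal sketch of what such a client might look like, here is an in-memory version. The class name `InMemoryFeatureFlags` and its `overrides` field are hypothetical, not the API of any particular flag service; a real system would back this with a database, a cache, or a vendor SDK.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InMemoryFeatureFlags:
    """Hypothetical flag client: global on/off state plus per-user overrides."""
    flags: dict = field(default_factory=dict)      # {"flag_name": True/False}
    overrides: dict = field(default_factory=dict)  # {("flag_name", "user_id"): True/False}

    def is_enabled(self, name: str, user_id: Optional[str] = None) -> bool:
        # A per-user override wins over the global state.
        if user_id is not None and (name, user_id) in self.overrides:
            return self.overrides[(name, user_id)]
        # Unknown flags default to off, so a missing flag never exposes a feature.
        return self.flags.get(name, False)

    def enable(self, name: str) -> None:
        self.flags[name] = True

    def disable(self, name: str) -> None:
        self.flags[name] = False


feature_flags = InMemoryFeatureFlags(flags={"ml_recommendations": True})
feature_flags.overrides[("ml_recommendations", "user-2")] = False
```

Defaulting unknown flags to off is a deliberate choice in this sketch: a typo in a flag name hides the feature instead of releasing it by accident.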
Percentage Rollout
# Gradual rollout: 10% of users → 50% → 100%
flag_config = {
    "name": "new_checkout_flow",
    "rollout_percentage": 10,  # 10% of users see new flow
    "targeting": {
        "include_users": ["internal-team@company.com"],  # Always on for team
        "exclude_users": ["vip-customer-123"],           # Never for VIP during testing
    },
    "kill_switch": True,  # Can be turned off instantly
}
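A percentage rollout only works if each user gets the same answer on every request, and if users who saw the feature at 10% still see it at 50%. One common way to get both properties is to hash the flag name and user ID into a stable bucket. The sketch below assumes the config shape shown above; the `bucket` and `is_enabled` helpers are illustrative names, and checking exclusions before inclusions is an assumption about precedence.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 100


def is_enabled(config: dict, user_id: str) -> bool:
    targeting = config.get("targeting", {})
    # Assumption: an explicit exclusion beats everything else.
    if user_id in targeting.get("exclude_users", []):
        return False
    if user_id in targeting.get("include_users", []):
        return True
    # Users whose bucket falls below the percentage are in the rollout.
    return bucket(config["name"], user_id) < config["rollout_percentage"]


flag_config = {
    "name": "new_checkout_flow",
    "rollout_percentage": 10,
    "targeting": {
        "include_users": ["internal-team@company.com"],
        "exclude_users": ["vip-customer-123"],
    },
}

print(is_enabled(flag_config, "internal-team@company.com"))  # always included
print(is_enabled(flag_config, "vip-customer-123"))           # always excluded
```

Including the flag name in the hash means the same user lands in different buckets for different flags, so one unlucky group of users does not become the test audience for every rollout.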
Targeting Rules
# Complex targeting: different behavior for different segments
flag_rules = {
    "name": "premium_analytics",
    "rules": [
        {
            "description": "Internal team always gets the feature",
            "conditions": [{"attribute": "email", "operator": "ends_with", "value": "@company.com"}],
            "serve": True
        },
        {
            "description": "Enterprise tier customers",
            "conditions": [{"attribute": "plan", "operator": "equals", "value": "enterprise"}],
            "serve": True
        },
        {
            "description": "10% of pro users for testing",
            "conditions": [{"attribute": "plan", "operator": "equals", "value": "pro"}],
            "rollout_percentage": 10,
            "serve": True
        }
    ],
    "default": False
}
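Rules like these need an evaluator. The following is a rough sketch of one possible evaluation order, not the behavior of any specific tool: rules are checked top to bottom, the first rule whose conditions match and whose rollout includes the user wins, and a user who matches a segment but falls outside its rollout continues to the next rule. The `evaluate` function, the `OPERATORS` table, and the `id` user attribute are assumptions made for illustration.

```python
import hashlib

# Supported operators; only the two used in the rules above are sketched here.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "ends_with": lambda actual, expected: isinstance(actual, str) and actual.endswith(expected),
}


def _bucket(flag_name: str, user_id: str) -> int:
    """Stable bucket in 0..99 for percentage rollouts."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 100


def evaluate(flag: dict, user: dict):
    """Return the value of the first applicable rule, else the flag default."""
    for rule in flag["rules"]:
        matches = all(
            OPERATORS[c["operator"]](user.get(c["attribute"]), c["value"])
            for c in rule["conditions"]
        )
        if not matches:
            continue
        # Rules without a rollout_percentage apply to the whole segment.
        if _bucket(flag["name"], user["id"]) < rule.get("rollout_percentage", 100):
            return rule["serve"]
    return flag["default"]


flag_rules = {
    "name": "premium_analytics",
    "rules": [
        {"conditions": [{"attribute": "email", "operator": "ends_with", "value": "@company.com"}],
         "serve": True},
        {"conditions": [{"attribute": "plan", "operator": "equals", "value": "enterprise"}],
         "serve": True},
        {"conditions": [{"attribute": "plan", "operator": "equals", "value": "pro"}],
         "rollout_percentage": 10, "serve": True},
    ],
    "default": False,
}

print(evaluate(flag_rules, {"id": "u1", "email": "dev@company.com", "plan": "free"}))
```

Because rule order decides the outcome, it helps to put the most specific rules (internal team, explicit entitlements) first and percentage rollouts last.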
Operational Practices
Flag Hygiene
| Practice | Frequency | Purpose |
|---|---|---|
| Flag audit | Monthly | Identify stale flags (> 30 days old, 100% rolled out) |
| Flag owner | Always | Every flag has a named owner who is responsible for removal |
| Expiration dates | On creation | Set expected removal date when creating the flag |
| Flag count limit | Always | Alert when total active flags > 50 (team) or 200 (org) |
Flag status dashboard:
Active flags: 45
├─ Release flags: 12 (3 past expected removal date ⚠️)
├─ Experiment flags: 8 (2 experiments concluded, flags remain ⚠️)
├─ Ops flags: 15 (permanent, reviewed quarterly)
└─ Permission flags: 10 (permanent, tied to billing tiers)
Flags at 100% rollout (should be removed): 5 ❌
Flags older than 90 days: 8 (3 have valid reasons, 5 need cleanup)
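The monthly audit can be partly automated. Here is a small sketch that flags temporary flags which are past their expected removal date, or fully rolled out and older than a threshold. The `FlagRecord` fields and the `audit` function are hypothetical; in practice the records would come from your flag system's API or database.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class FlagRecord:
    name: str
    flag_type: str             # "release", "experiment", "ops", or "permission"
    owner: str
    created: date
    rollout_percentage: int
    expires: Optional[date] = None  # expected removal date, set at creation


def audit(flags, today: date, max_age_days: int = 30):
    """Return (flag name, owner, reason) for every flag that needs cleanup."""
    findings = []
    for f in flags:
        # Ops and permission flags are permanent and reviewed separately.
        if f.flag_type in ("ops", "permission"):
            continue
        if f.expires is not None and f.expires < today:
            findings.append((f.name, f.owner, "past expected removal date"))
        elif f.rollout_percentage == 100 and today - f.created > timedelta(days=max_age_days):
            findings.append((f.name, f.owner, "fully rolled out and stale"))
    return findings


flags = [
    FlagRecord("new_checkout_flow", "release", "alice", date(2024, 3, 1), 100),
    FlagRecord("enable_recommendation_engine", "ops", "carol", date(2020, 1, 1), 100),
]
for name, owner, reason in audit(flags, today=date(2024, 6, 1)):
    print(f"{name} (owner: {owner}): {reason}")
```

Returning the owner with each finding matters more than the detection logic: a stale flag with a named owner gets removed, while an anonymous one tends to stay.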
Kill Switch Pattern
# Every feature that calls an external service should have a kill switch
class RecommendationService:
    def get_recommendations(self, user_id: str) -> list:
        # Kill switch: if the ML service is causing issues, turn it off
        if not feature_flags.is_enabled("enable_ml_recommendations"):
            return self.get_fallback_recommendations(user_id)
        try:
            return self.ml_client.recommend(user_id, timeout=2.0)
        except (Timeout, ServiceError):
            # Circuit breaker: auto-disable after repeated failures
            self.record_failure()
            if self.failure_count > self.threshold:
                feature_flags.disable("enable_ml_recommendations")
                self.alert_team("ML recommendations auto-disabled after failures")
            return self.get_fallback_recommendations(user_id)
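The pattern above leaves the flag client, the ML client, and the failure bookkeeping undefined. To see the control flow run end to end, here is a self-contained sketch with stand-ins for each piece. `Flags`, `FlakyMLClient`, and `ServiceError` are hypothetical stubs, and resetting the counter after a success is an added assumption so that only consecutive failures trip the breaker.

```python
class ServiceError(Exception):
    """Stand-in for the ML client's error type (hypothetical)."""


class Flags:
    """Minimal in-memory flag store (hypothetical)."""
    def __init__(self):
        self.state = {"enable_ml_recommendations": True}

    def is_enabled(self, name):
        return self.state.get(name, False)

    def disable(self, name):
        self.state[name] = False


class FlakyMLClient:
    """Stub client that always fails, to exercise the breaker."""
    def __init__(self):
        self.calls = 0

    def recommend(self, user_id, timeout):
        self.calls += 1
        raise ServiceError("ML service unavailable")


class RecommendationService:
    FLAG = "enable_ml_recommendations"

    def __init__(self, flags, ml_client, threshold=3):
        self.flags = flags
        self.ml_client = ml_client
        self.threshold = threshold
        self.failure_count = 0
        self.alerts = []  # stands in for paging or chat alerts

    def get_fallback_recommendations(self, user_id):
        return ["bestseller-1", "bestseller-2"]

    def get_recommendations(self, user_id):
        # Kill switch: skip the ML service entirely when the flag is off.
        if not self.flags.is_enabled(self.FLAG):
            return self.get_fallback_recommendations(user_id)
        try:
            result = self.ml_client.recommend(user_id, timeout=2.0)
        except (TimeoutError, ServiceError):
            self.failure_count += 1
            # Circuit breaker: flip the flag off once failures exceed the threshold.
            if self.failure_count > self.threshold:
                self.flags.disable(self.FLAG)
                self.alerts.append("ML recommendations auto-disabled after failures")
            return self.get_fallback_recommendations(user_id)
        self.failure_count = 0  # assumption: a success resets the count
        return result


service = RecommendationService(Flags(), FlakyMLClient(), threshold=3)
for _ in range(5):
    service.get_recommendations("user-1")
print(service.flags.is_enabled("enable_ml_recommendations"), service.ml_client.calls)
```

In a multi-instance deployment the failure count would need to live in shared storage, or each instance would trip its breaker independently; this sketch keeps it in memory for brevity.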
Tools
| Tool | Type | Best For |
|---|---|---|
| LaunchDarkly | SaaS | Enterprise, rich targeting, experiments |
| Unleash | Open source (self-hosted) | Full control, no vendor lock-in |
| Flagsmith | Open source + SaaS | Feature flags + remote config |
| Split | SaaS | Feature flags + experimentation |
| Custom (database + cache) | Custom | Simple needs, small teams |
Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Flag rot | Hundreds of dead flags in code | Monthly audits, expiration dates, flag count alerts |
| Nested flags | `if flagA && flagB && !flagC` | Limit flag nesting to 1 level |
| Testing only one state | Tests only run with flag on | Test both states in CI |
| No fallback | Flag off = feature crashes | Every flag has a working fallback state |
| Long-lived release flags | “We’ll remove it later” (never happens) | Block PR merges for flags past expiration |
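For the "testing only one state" anti-pattern, one lightweight approach is a helper that forces a flag on or off for the duration of a test and then restores it. The sketch below uses a module-level `FLAGS` dict and a toy `checkout_total` function as stand-ins; both are hypothetical, as are the `flag_override` and `run_in_both_states` helpers.

```python
from contextlib import contextmanager

FLAGS = {"new_checkout_flow": False}  # stand-in for a real flag store
_MISSING = object()


def is_enabled(name):
    return FLAGS.get(name, False)


@contextmanager
def flag_override(name, value):
    """Temporarily force a flag state, restoring the previous one afterwards."""
    previous = FLAGS.get(name, _MISSING)
    FLAGS[name] = value
    try:
        yield
    finally:
        if previous is _MISSING:
            FLAGS.pop(name, None)
        else:
            FLAGS[name] = previous


def checkout_total(prices):
    if is_enabled("new_checkout_flow"):
        return round(sum(prices), 2)  # new code path
    total = 0.0
    for price in prices:              # old code path
        total += price
    return round(total, 2)


def run_in_both_states(name, check):
    """Run the same check with the flag off and then on."""
    for state in (False, True):
        with flag_override(name, state):
            check(state)


def check_total(state):
    assert checkout_total([1.5, 2.25]) == 3.75, f"failed with flag={state}"


run_in_both_states("new_checkout_flow", check_total)
```

With a test framework such as pytest, the same idea is usually expressed by parametrizing the test over `[False, True]` and applying the override in a fixture.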
Implementation Checklist
- Choose a feature flag system (LaunchDarkly, Unleash, or custom)
- Define flag types: release, experiment, ops, permission
- Set expiration dates on all release and experiment flags at creation time
- Assign an owner to every flag (who will remove it)
- Implement kill switches for all external service integrations
- Test both flag states (on and off) in CI
- Run monthly flag audits: identify and remove stale flags
- Alert when total flags exceed 50 per team
- Track flags at 100% rollout — these should be removed within 2 weeks
- Document flag removal process: remove code, remove flag, verify tests pass
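The first step of flag removal, finding every place the flag is referenced, is easy to script. As a rough sketch, the helper below searches source text for a quoted flag name. `find_flag_references` is a hypothetical name, and it takes a dict of path to file contents so it stays easy to test; a real script would read the files from the repository.

```python
import re


def find_flag_references(flag_name, sources):
    """Return (path, line number) for each line that mentions the quoted flag name."""
    quote = "[\"']"
    # Requiring quotes on both sides avoids matching longer names such as "<flag>_v2".
    pattern = re.compile(quote + re.escape(flag_name) + quote)
    hits = []
    for path, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path, lineno))
    return hits


sources = {
    "checkout.py": 'if feature_flags.is_enabled("new_checkout_flow"):\n    use_new_flow()\n',
    "billing.py": "TAX_RATE = 0.2\n",
}
for path, lineno in find_flag_references("new_checkout_flow", sources):
    print(f"{path}:{lineno}")
```

An empty result after the cleanup PR is a reasonable signal that the flag can be deleted from the flag system itself.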