# Load Testing: Knowing Your Breaking Point Before Production Does
Design and execute load tests that reveal performance bottlenecks, capacity limits, and failure modes before your users discover them. Covers test design, tooling, realistic workload modeling, result interpretation, and continuous load testing in CI/CD.
Load testing answers one question: what happens to your system under pressure? Not what you think happens, not what the architecture diagram says should happen, but what actually happens when 10,000 users hit your checkout flow simultaneously at 2 PM on Black Friday.
Most teams discover their breaking point in production. Load testing discovers it in a controlled environment where the consequences are a report — not an outage.
## Types of Load Tests

### Smoke Test

Minimal load to verify the test setup works:

- Users: 1-5
- Duration: 1-2 minutes
- Purpose: Validate test scripts, endpoints, authentication

### Load Test

Expected peak traffic to verify performance meets SLAs:

- Users: expected peak concurrent users
- Duration: 30-60 minutes
- Purpose: Verify latency, throughput, error rate under normal peak
- Success: P99 < 500ms, error rate < 0.1%

### Stress Test

Beyond expected peak to find the breaking point:

- Users: 2x-5x expected peak, ramped gradually
- Duration: until failure or 60 minutes
- Purpose: Find where the system degrades and how it fails

### Soak Test

Sustained load over hours to find memory leaks and resource exhaustion:

- Users: 70% of peak
- Duration: 4-12 hours
- Purpose: Memory leaks, connection exhaustion, log rotation, disk fill

### Spike Test

Sudden traffic burst to test auto-scaling and circuit breakers:

- Users: 0 → 10x peak → 0 in 60 seconds
- Purpose: Auto-scaling response time, queue overflow, connection storms
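A spike profile translates directly into k6 `stages`. The sketch below is illustrative: it assumes an expected peak of 1,000 concurrent users, so 10x peak is 10,000 VUs.

```javascript
// Hypothetical k6 spike profile: 0 → 10x peak → 0 in 60 seconds.
// Assumes an expected peak of 1,000 VUs, so 10x peak = 10,000.
export const options = {
  stages: [
    { duration: '30s', target: 10000 }, // burst to 10x peak
    { duration: '30s', target: 0 },     // and straight back to zero
  ],
};
```

Watch how long auto-scaling takes to react during the burst; by the time new capacity is warm, a 60-second spike is usually already over, which is exactly what this test is meant to expose.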
## Designing Realistic Workloads
The most common load testing mistake is testing the wrong thing. A load test that hammers /api/health at 50,000 RPS tells you nothing about your checkout flow.
### Traffic Analysis

Start with production traffic patterns:

```text
Real traffic distribution:

GET  /api/products        35%   (browse catalog)
GET  /api/products/:id    25%   (view product)
POST /api/cart/items      15%   (add to cart)
GET  /api/cart            10%   (view cart)
POST /api/orders           8%   (checkout)
POST /api/auth/login       5%   (login)
Other                      2%
```
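A distribution like this can drive a weighted pick per test iteration. This is a sketch; the paths mirror the table above, and the `/api/other` bucket is a hypothetical stand-in for the long tail.

```javascript
// Production traffic mix (weights from the distribution above).
// '/api/other' is a hypothetical bucket for everything else.
const TRAFFIC_MIX = [
  { weight: 35, method: 'GET',  path: '/api/products' },
  { weight: 25, method: 'GET',  path: '/api/products/:id' },
  { weight: 15, method: 'POST', path: '/api/cart/items' },
  { weight: 10, method: 'GET',  path: '/api/cart' },
  { weight: 8,  method: 'POST', path: '/api/orders' },
  { weight: 5,  method: 'POST', path: '/api/auth/login' },
  { weight: 2,  method: 'GET',  path: '/api/other' },
];

// Roll once per iteration and walk the cumulative weights.
function pickEndpoint(rand = Math.random()) {
  const total = TRAFFIC_MIX.reduce((sum, e) => sum + e.weight, 0);
  let roll = rand * total;
  for (const entry of TRAFFIC_MIX) {
    roll -= entry.weight;
    if (roll < 0) return entry;
  }
  return TRAFFIC_MIX[TRAFFIC_MIX.length - 1];
}
```

Calling `pickEndpoint()` in a load-test iteration reproduces the production mix over many iterations, instead of hammering one endpoint.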
### User Journeys
Model complete user flows, not individual endpoints:
```javascript
// k6 user scenario: browse → view → add to cart → checkout
import http from 'k6/http';
import { sleep } from 'k6';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.2.0/index.js';

const BASE = __ENV.BASE_URL || 'https://api.example.com';

export default function () {
  // 1. Browse products (think time: 3-5s)
  const products = http.get(`${BASE}/api/products`);
  sleep(randomIntBetween(3, 5));

  // 2. View product detail
  const productId = products.json('data.0.id');
  http.get(`${BASE}/api/products/${productId}`);
  sleep(randomIntBetween(2, 4));

  // 3. Add to cart
  http.post(`${BASE}/api/cart/items`, JSON.stringify({
    productId: productId,
    quantity: 1,
  }), { headers: { 'Content-Type': 'application/json' } });
  sleep(randomIntBetween(1, 2));

  // 4. Checkout (20% of users who add to cart)
  if (Math.random() < 0.2) {
    http.post(`${BASE}/api/orders`, JSON.stringify({
      paymentMethod: 'card',
    }), { headers: { 'Content-Type': 'application/json' } });
  }
}
```
### Think Time
Real users pause between actions. Without think time, your test generates 10x more traffic per virtual user than reality, which distorts results.
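The effect is easy to quantify: each virtual user completes one request per response-plus-think-time cycle. A sketch, with illustrative numbers:

```javascript
// Requests per second generated by N virtual users, given the
// average response time and think time (both in seconds).
// Each VU completes one request every (response + think) seconds.
function requestRate(vus, avgResponseSec, thinkTimeSec) {
  return vus / (avgResponseSec + thinkTimeSec);
}

// 1,000 VUs, 200ms responses, 4s think time → ~238 req/s.
// The same 1,000 VUs with zero think time → 5,000 req/s.
```

With these assumed numbers, dropping think time inflates the load per VU roughly 20x, so a "passing" test without think time may model a user population far larger than intended.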
## Tool Selection
| Tool | Language | Strengths | Scale |
|---|---|---|---|
| k6 | JavaScript | Developer-friendly, CI integration, cloud option | 100K+ VUs |
| Locust | Python | Pythonic, distributed, real-time UI | 50K+ VUs |
| Gatling | Scala/Java | JVM performance, detailed reports | 100K+ VUs |
| Artillery | YAML/JS | Simple config, serverless option | 10K+ VUs |
| JMeter | GUI/XML | Feature-rich, legacy standard | 10K+ VUs |
### k6 Example
```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 }, // Ramp up
    { duration: '5m', target: 100 }, // Steady state
    { duration: '2m', target: 500 }, // Stress
    { duration: '5m', target: 500 }, // Sustained stress
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(99)<1000'], // 99% of requests under 1s
    http_req_failed: ['rate<0.01'],    // Less than 1% failure rate
  },
};

export default function () {
  const res = http.get('https://api.example.com/orders');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}
```
## Interpreting Results

### Key Metrics

```text
Throughput:   2,847 req/s   (target: 2,000)     ✅
P50 latency:  45ms                              ✅
P95 latency:  180ms                             ✅
P99 latency:  890ms         (target: <1000ms)   ✅ (barely)
Error rate:   0.02%                             ✅
```
### Warning Signs

- Latency grows with load: linear growth is normal up to saturation; exponential growth past that point indicates a bottleneck
- P99 >> P50: high tail latency suggests queuing or contention
- Error rate spikes at a specific load: a capacity limit was hit (connection pool, thread pool, database connections)
- Throughput plateaus while latency climbs: the system is saturated, and adding more load only makes it worse
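Percentiles like P50 and P99 come straight from the raw latency samples. A minimal nearest-rank sketch (nearest-rank is one of several common definitions; real load tools typically use streaming estimators such as HDR histograms):

```javascript
// Nearest-rank percentile over recorded latency samples.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// A healthy distribution keeps P99 close to P50; a long tail
// (P99 >> P50) points at queuing or contention somewhere.
```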
### Finding Bottlenecks

During the load test, monitor:

- CPU utilization: per service, per database
- Memory usage: steady growth suggests a leak
- Connection pools: database, HTTP client, Redis
- Queue depths: message queues, thread pools
- Disk I/O: database write-ahead log, log files
- Network bandwidth: cross-AZ traffic, NAT gateway
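One useful cross-check among these signals is Little's Law: the number of requests in flight equals throughput times average latency. A sketch, reusing the sample metrics from earlier and treating average latency as roughly the P50 (an assumption; means and medians can diverge):

```javascript
// Little's Law: L = λ × W
// in-flight requests = throughput (req/s) × average latency (s)
function inFlightRequests(throughputRps, avgLatencySec) {
  return throughputRps * avgLatencySec;
}

// At 2,847 req/s with ~45ms average latency, ~128 requests are
// in flight at once. If a connection or thread pool caps out
// below that number, it is the bottleneck.
```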
## Continuous Load Testing

### In CI/CD Pipeline
Run load tests on every PR that changes performance-critical code:
```yaml
# GitHub Actions
- name: Load Test
  run: |
    k6 run --out json=results.json tests/load/checkout.js

- name: Check Thresholds
  run: |
    # Fails if P99 > 500ms or error rate > 0.5%
    python scripts/check_load_results.py results.json
```
### Baseline Comparison
Compare every test run against a baseline:
```text
Baseline (v2.3.0): P99 = 340ms, throughput = 2,847 req/s
Current  (v2.4.0): P99 = 890ms, throughput = 2,102 req/s
Regression:        P99 +162%, throughput -26%   ← FAIL
```
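The gate itself can be a small function. This sketch compares two runs and fails on P99 growth or throughput loss; the tolerance values are illustrative, not from the text:

```javascript
// Flag a regression when P99 grows or throughput drops beyond a
// tolerance. Default thresholds (10% / 5%) are illustrative.
function checkRegression(baseline, current,
    { maxP99Growth = 0.10, maxThroughputDrop = 0.05 } = {}) {
  const p99Growth = (current.p99Ms - baseline.p99Ms) / baseline.p99Ms;
  const throughputDrop = (baseline.rps - current.rps) / baseline.rps;
  return {
    p99Growth,
    throughputDrop,
    pass: p99Growth <= maxP99Growth && throughputDrop <= maxThroughputDrop,
  };
}
```

Fed the numbers above (baseline P99 340ms at 2,847 req/s, current P99 890ms at 2,102 req/s), the check fails, which is exactly what should block the merge.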
## Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Testing only happy paths | Failures under load are undiscovered | Include error scenarios, auth failures, timeout paths |
| No think time | Unrealistically high request rate per user | Add realistic pauses between actions |
| Testing against shared staging | Results vary based on other activity | Dedicated load test environment |
| Running from a single location | Tests your test machine, not your system | Distribute load generators across regions |
| Load testing once before launch | Performance degrades silently | Continuous load testing in CI |
Load testing is not a phase — it is a practice. Systems that are never load tested will surprise you. Systems that are continuously load tested will not.