Load Testing Architecture
Design and execute load tests that accurately predict production behavior. Covers load testing tools, test scenario design, realistic traffic patterns, performance baselines, bottleneck identification, and the patterns that prevent performance surprises at scale.
A system that passes functional tests can still collapse under load. Load testing reveals the breaking point: where latency spikes, where errors start, where the database locks, where the queue backs up. Without load testing, your first real load test is production traffic on launch day.
Load Testing Types
Smoke Test:
  Purpose: Verify the system handles minimal load
  Load: 1-5 virtual users (VUs)
  Duration: 1-2 minutes
  When: After every deployment

Load Test:
  Purpose: Validate performance at expected traffic
  Load: Expected peak traffic level
  Duration: 15-60 minutes
  When: Before major releases

Stress Test:
  Purpose: Find the breaking point
  Load: Gradually increase beyond expected capacity
  Duration: Until the system degrades
  When: Quarterly capacity planning

Spike Test:
  Purpose: Test sudden traffic bursts
  Load: Sudden 10x traffic spike
  Duration: Spike for 5 minutes, then return to normal
  When: Before marketing campaigns and product launches

Soak Test (Endurance):
  Purpose: Find memory leaks and connection exhaustion
  Load: Normal traffic level
  Duration: 4-24 hours
  When: After major architectural changes
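
Each of these test types can be expressed as a different ramp profile. The sketch below is one possible mapping, not a prescribed one: `stagesFor`, `normalVUs`, and `peakVUs` are hypothetical names, and every duration and multiplier is an illustrative assumption drawn from the ranges listed above. It returns arrays in the `{ duration, target }` shape that k6 accepts for `stages`.

```javascript
// Hypothetical helper: builds a k6-style `stages` array for each test type.
// `normalVUs` and `peakVUs` are assumed inputs; all durations are illustrative.
function stagesFor(testType, { normalVUs, peakVUs }) {
  switch (testType) {
    case 'smoke':
      // Minimal load for a short time.
      return [{ duration: '1m', target: 3 }];
    case 'load':
      return [
        { duration: '5m', target: peakVUs },  // ramp up to expected peak
        { duration: '30m', target: peakVUs }, // hold at expected peak
        { duration: '5m', target: 0 },        // ramp down
      ];
    case 'stress':
      // Step past the expected peak: ramp to each level, then hold it.
      return [1, 1.5, 2, 3].flatMap((multiplier) => [
        { duration: '2m', target: Math.round(peakVUs * multiplier) },
        { duration: '5m', target: Math.round(peakVUs * multiplier) },
      ]);
    case 'spike':
      return [
        { duration: '2m', target: normalVUs },       // settle at normal load
        { duration: '10s', target: normalVUs * 10 }, // sudden 10x burst
        { duration: '5m', target: normalVUs * 10 },  // hold the spike
        { duration: '10s', target: normalVUs },      // drop back to normal
        { duration: '2m', target: normalVUs },       // observe recovery
      ];
    case 'soak':
      return [
        { duration: '5m', target: normalVUs }, // ramp up
        { duration: '8h', target: normalVUs }, // long hold at normal load
        { duration: '5m', target: 0 },         // ramp down
      ];
    default:
      throw new Error(`Unknown test type: ${testType}`);
  }
}

console.log(stagesFor('spike', { normalVUs: 50, peakVUs: 200 }));
```

In a k6 script, the result would be assigned to the `stages` field of the exported `options` object. A stress profile built this way has no automatic stop condition; in practice you would watch the metrics and abort once the system degrades.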
Implementation with k6
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

// Custom metrics (both are recorded for order requests only)
const errorRate = new Rate('errors');
const orderLatency = new Trend('order_latency', true); // true = values are times in ms

// Test configuration
export const options = {
  stages: [
    { duration: '2m', target: 50 },  // Ramp up to normal load
    { duration: '5m', target: 50 },  // Steady state
    { duration: '2m', target: 200 }, // Ramp up to peak traffic
    { duration: '5m', target: 200 }, // Sustained peak
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'], // Latency SLOs in ms
    errors: ['rate<0.01'],                          // Failed orders < 1%
    http_req_failed: ['rate<0.01'],                 // HTTP failures < 1%
  },
};

// Realistic user scenario
export default function () {
  // 1. Browse homepage
  const home = http.get('https://api.example.com/');
  check(home, { 'homepage 200': (r) => r.status === 200 });
  sleep(Math.random() * 3 + 1); // Think time: 1-4 seconds

  // 2. Search products
  const search = http.get('https://api.example.com/products?q=widget');
  check(search, { 'search 200': (r) => r.status === 200 });
  sleep(Math.random() * 2 + 1); // Think time: 1-3 seconds

  // 3. View product detail
  const product = http.get('https://api.example.com/products/123');
  check(product, { 'product 200': (r) => r.status === 200 });
  sleep(Math.random() * 2 + 1); // Think time: 1-3 seconds

  // 4. Place order (about 10% of iterations)
  if (Math.random() < 0.1) {
    const order = http.post(
      'https://api.example.com/orders',
      JSON.stringify({ product_id: '123', quantity: 1 }),
      { headers: { 'Content-Type': 'application/json' } }
    );
    // Use the request time k6 measured (send + wait + receive)
    // rather than wall-clock time, which includes script overhead.
    orderLatency.add(order.timings.duration);
    errorRate.add(order.status !== 201);
    check(order, { 'order created': (r) => r.status === 201 });
  }
}
Results Analysis
Key metrics to monitor during a load test:

Application Layer:
  ☐ Request latency (P50, P95, P99)
  ☐ Error rate (4xx, 5xx)
  ☐ Throughput (requests per second)
  ☐ Apdex score

Infrastructure Layer:
  ☐ CPU utilization per service
  ☐ Memory usage and GC pressure
  ☐ Network I/O and connection count
  ☐ Disk I/O (for databases)

Database Layer:
  ☐ Query latency (P95)
  ☐ Connection pool utilization
  ☐ Lock contention
  ☐ Replication lag
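
To make the application-layer metrics concrete, here is a minimal sketch of how latency percentiles and an Apdex score can be computed from raw samples. The function names and the sample data are made up for illustration, and the percentile uses the nearest-rank method; load testing and monitoring tools may interpolate differently, so their numbers can differ slightly. Apdex follows the standard definition: requests at or below a threshold T count as satisfied, requests between T and 4T count half, and anything slower counts as zero.

```javascript
// Nearest-rank percentile over latency samples (milliseconds).
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Apdex = (satisfied + tolerating / 2) / total
//   satisfied:  latency <= T
//   tolerating: T < latency <= 4T
function apdex(samples, thresholdMs) {
  if (samples.length === 0) throw new Error('no samples');
  let satisfied = 0;
  let tolerating = 0;
  for (const ms of samples) {
    if (ms <= thresholdMs) satisfied += 1;
    else if (ms <= 4 * thresholdMs) tolerating += 1;
  }
  return (satisfied + tolerating / 2) / samples.length;
}

// Illustrative (made-up) latency samples in milliseconds.
const latenciesMs = [120, 95, 180, 240, 310, 150, 900, 130, 2600, 160];

console.log(percentile(latenciesMs, 50)); // 160
console.log(percentile(latenciesMs, 95)); // 2600
console.log(apdex(latenciesMs, 500));     // 0.85
```

The sample shows why P50 alone is misleading: the median is 160 ms while the tail reaches 2600 ms, which is exactly the gap that P95 and P99 thresholds are meant to catch.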
Finding bottlenecks:
  1. If CPU is maxed out → Add compute or optimize code
  2. If memory is maxed out → Look for a memory leak or move to larger instances
  3. If DB connections are maxed out → Add connection pooling or read replicas
  4. If the network is saturated → Use a CDN, compression, or a bandwidth upgrade
  5. If disk I/O is saturated → Upgrade to SSDs, optimize queries, or add caching
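
The checklist above can be turned into a simple triage step that runs over metrics collected during a test. This is only a sketch: `findBottlenecks`, the resource keys, and the 0.9 saturation cutoff are all hypothetical choices, and real thresholds depend on the system. Inputs are assumed to be utilization ratios between 0 and 1.

```javascript
// Hypothetical triage helper mirroring the bottleneck checklist.
// `utilization` maps resource names to ratios in [0, 1];
// the 0.9 default cutoff is an assumption, not a standard.
function findBottlenecks(utilization, saturationThreshold = 0.9) {
  const suggestions = {
    cpu: 'Add compute or optimize code',
    memory: 'Look for a memory leak or move to larger instances',
    dbConnections: 'Add connection pooling or read replicas',
    network: 'Use a CDN, compression, or a bandwidth upgrade',
    diskIo: 'Upgrade to SSDs, optimize queries, or add caching',
  };
  return Object.entries(utilization)
    .filter(([resource, value]) => resource in suggestions && value >= saturationThreshold)
    .sort(([, a], [, b]) => b - a) // most saturated resource first
    .map(([resource, value]) => ({ resource, value, suggestion: suggestions[resource] }));
}

// Illustrative (made-up) peak utilization observed during a test run.
const observed = { cpu: 0.55, memory: 0.62, dbConnections: 0.98, network: 0.3, diskIo: 0.91 };
console.log(findBottlenecks(observed));
```

With these made-up numbers the database connection pool is reported first, followed by disk I/O. Fixing the most saturated resource and rerunning the test usually reveals the next bottleneck behind it.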
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Test from same data center | Network latency not realistic | Test from multiple geographic regions |
| No think time between requests | VUs fire requests back-to-back, so 100 VUs can generate the request rate of thousands of real users | Add realistic delays between actions |
| Single endpoint only | Miss interaction effects | Full user journey scenarios |
| Test against production database | Load test corrupts real data | Isolated test environment with production-like data |
| Run once, ship | Performance degrades over time | Load tests in CI/CD on every release |
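
The think-time row can be quantified with a simple closed-loop model: each virtual user sends a request, waits for the response, thinks, and repeats, so its request rate is roughly 1 / (response time + think time). The sketch below uses hypothetical function names and assumed timings (0.25 s response time, 2.25 s think time) and ignores queueing effects as latency rises under load.

```javascript
// Closed-loop model: throughput = users / (response time + think time).
function requestsPerSecond(vus, responseTimeSec, thinkTimeSec) {
  return vus / (responseTimeSec + thinkTimeSec);
}

// Number of users with think time that would produce the same request
// rate as `vus` virtual users running with zero think time.
function equivalentUsers(vus, responseTimeSec, thinkTimeSec) {
  return (vus * (responseTimeSec + thinkTimeSec)) / responseTimeSec;
}

// Assumed timings for illustration only.
const responseTimeSec = 0.25;
const thinkTimeSec = 2.25;

console.log(requestsPerSecond(100, responseTimeSec, 0));            // 400
console.log(requestsPerSecond(100, responseTimeSec, thinkTimeSec)); // 40
console.log(equivalentUsers(100, responseTimeSec, thinkTimeSec));   // 1000
```

Under these assumptions, 100 VUs without think time generate the request rate of 1,000 users who pause between actions. The multiplier grows as response time shrinks relative to think time, which is how a small VU count can stand in for a far larger user population without anyone intending it.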
Load testing is not about finding a single maximum number. It is about understanding the relationship between load, latency, and errors, so you can plan capacity, set SLOs, and sleep soundly before launch day.