Scalability Testing and Load Modeling
Test how systems behave under increasing load and model capacity boundaries. Covers load profile design, stress testing, soak testing, spike testing, bottleneck identification, and the patterns that reveal scalability limits before users do.
Scalability testing answers: “At what point does this system break, and why?” Performance testing tells you if your system is fast enough today. Scalability testing tells you if it will survive tomorrow’s growth. The goal is to find and fix bottlenecks before they find you — in production, during a traffic spike, at 2 AM.
Types of Scalability Tests
Load Test:
What: Expected production traffic
Duration: 30-60 minutes
Purpose: Validate normal performance
Example: 1,000 concurrent users, steady state
Stress Test:
What: Beyond expected capacity
Duration: Until failure
Purpose: Find breaking point
Example: Ramp from 1,000 to 10,000 users, find where latency spikes
Soak Test (Endurance):
What: Expected load for extended period
Duration: 4-24 hours
Purpose: Find memory leaks, connection exhaustion, disk fill
Example: 1,000 concurrent users for 12 hours straight
Spike Test:
What: Sudden, extreme traffic burst
Duration: Minutes
Purpose: Test auto-scaling, queuing, graceful degradation
Example: 100 → 5,000 users in 30 seconds, then back to 100
Breakpoint Test:
What: Incremental increase until failure
Duration: Variable
Purpose: Find exact capacity limits
Example: Add 100 users every minute, record when errors start
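The breakpoint recipe above ("add 100 users every minute, record when errors start") can be sketched as two small helpers: one generates the ramp schedule, the other reports the first load step whose error rate crosses a threshold. All names and the 1% threshold here are illustrative assumptions, not part of any tool's API.

```javascript
// Sketch of a breakpoint test schedule: add a fixed number of users per step
// until an error-rate probe trips. Names and thresholds are illustrative.
function breakpointSchedule(startUsers, stepUsers, maxUsers, stepSeconds) {
  const stages = [];
  for (let u = startUsers; u <= maxUsers; u += stepUsers) {
    stages.push({ users: u, holdSeconds: stepSeconds });
  }
  return stages;
}

// The first stage whose measured error rate exceeds the threshold marks capacity.
function findBreakpoint(stages, errorRateByUsers, threshold = 0.01) {
  for (const stage of stages) {
    if ((errorRateByUsers[stage.users] ?? 0) > threshold) {
      return stage.users;
    }
  }
  return null; // no breakpoint observed within the tested range
}
```

For example, ramping 100 → 500 users in steps of 100, with errors first appearing at 400 users, `findBreakpoint` reports 400 as the capacity limit.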
Load Profile Design
```javascript
// k6 load test: realistic user journey
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

const errorRate = new Rate('errors');
const responseTime = new Trend('response_time');

export const options = {
  scenarios: {
    // Breakpoint test: find scalability limits
    breakpoint: {
      executor: 'ramping-arrival-rate',
      startRate: 10, // 10 requests/second
      timeUnit: '1s',
      preAllocatedVUs: 500,
      maxVUs: 5000,
      stages: [
        { target: 50, duration: '2m' },   // Ramp to 50 rps
        { target: 100, duration: '2m' },  // Ramp to 100 rps
        { target: 200, duration: '2m' },  // Ramp to 200 rps
        { target: 500, duration: '2m' },  // Ramp to 500 rps
        { target: 1000, duration: '2m' }, // Ramp to 1000 rps
        { target: 2000, duration: '2m' }, // Ramp to 2000 rps
      ],
    },
  },
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    errors: ['rate<0.01'],
  },
};

export default function () {
  // Realistic user journey (not just hammering one endpoint)

  // 1. Homepage
  let res = http.get('https://app.example.com/');
  check(res, { 'homepage 200': (r) => r.status === 200 });
  responseTime.add(res.timings.duration);
  sleep(Math.random() * 3 + 1); // Realistic think time

  // 2. Login
  res = http.post(
    'https://app.example.com/api/auth/login',
    JSON.stringify({ email: 'test@example.com', password: 'test' }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  check(res, { 'login 200': (r) => r.status === 200 });
  sleep(Math.random() * 2 + 1);

  // 3. Dashboard (typically the heaviest page)
  res = http.get('https://app.example.com/api/dashboard');
  check(res, { 'dashboard 200': (r) => r.status === 200 });
  errorRate.add(res.status !== 200);
}
```
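When sizing the VU pool for an arrival-rate scenario like the one above, Little's Law is a useful back-of-the-envelope check: required concurrency ≈ arrival rate × (response time + think time). A minimal sketch, with an illustrative function name:

```javascript
// Little's Law sketch: estimate concurrent users (VUs) needed to sustain a
// target arrival rate, given average response time and think time per request.
// requiredVUs is an illustrative name, not a k6 API.
function requiredVUs(arrivalRatePerSec, avgResponseSec, avgThinkSec) {
  return Math.ceil(arrivalRatePerSec * (avgResponseSec + avgThinkSec));
}

// e.g. 200 rps with 0.3s responses and 2.5s of think time needs ~560 VUs,
// which suggests a preAllocatedVUs well above 500 for that stage.
```

This is an average-case estimate; real pools should include headroom for latency spikes, which is why the script above sets `maxVUs` well above `preAllocatedVUs`.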
Bottleneck Analysis
Common bottlenecks by layer:
Application Layer:
☐ Thread pool exhaustion
☐ Memory leaks (soak test reveals)
☐ Synchronous blocking calls
☐ N+1 query patterns
☐ Missing connection pooling
Database Layer:
☐ Connection pool exhaustion
☐ Lock contention
☐ Missing indexes
☐ Full table scans
☐ Slow queries under concurrent load
Infrastructure Layer:
☐ CPU saturation
☐ Network bandwidth limits
☐ Disk I/O bottleneck
☐ Load balancer connection limits
☐ Auto-scaling too slow
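Whichever layer the bottleneck lives in, it usually shows up in ramped-load results as a "knee": the load step where latency stops growing gradually and jumps. A sketch of detecting that knee from step results, assuming a simple `{ rps, p95Ms }` sample shape and an illustrative 2x growth factor:

```javascript
// Sketch: locate the "knee" in ramped-load results, i.e. the first load step
// where p95 latency grows disproportionately versus the previous step.
// The sample shape and 2x growth factor are illustrative assumptions.
function findLatencyKnee(samples, growthFactor = 2) {
  // samples: [{ rps, p95Ms }], sorted by ascending rps
  for (let i = 1; i < samples.length; i++) {
    if (samples[i].p95Ms > samples[i - 1].p95Ms * growthFactor) {
      return samples[i].rps; // load level where latency spiked
    }
  }
  return null; // latency scaled roughly smoothly across the tested range
}
```

For example, p95 results of 120 ms at 50 rps, 140 ms at 100 rps, 160 ms at 200 rps, and 900 ms at 500 rps put the knee at 500 rps; that is the load level to profile layer by layer.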
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Test only single endpoint | Miss realistic multi-step bottlenecks | Full user journey load profiles |
| No think time between requests | Unrealistic, overstates capacity | Randomized delays like real users |
| Test against production | Impacts real users | Dedicated performance environment |
| Skip soak testing | Memory leaks ship to production | 8-12 hour soak tests quarterly |
| Load test once before launch | System behavior changes over time | Monthly load testing in CI/CD |
The best time to find a scalability bottleneck is in a test. The worst time is during Black Friday, with 10x normal traffic and your entire team scrambling on Slack.