Scalability Testing and Load Modeling
Test how systems behave under increasing load and model capacity boundaries. Covers load profile design, stress testing, soak testing, spike testing, bottleneck identification, and the patterns that reveal scalability limits before users do.
Scalability testing answers: “At what point does this system break, and why?” Performance testing tells you if your system is fast enough today. Scalability testing tells you if it will survive tomorrow’s growth. The goal is to find and fix bottlenecks before they find you — in production, during a traffic spike, at 2 AM.
Types of Scalability Tests
Load Test:
What: Expected production traffic
Duration: 30-60 minutes
Purpose: Validate normal performance
Example: 1,000 concurrent users, steady state
Stress Test:
What: Beyond expected capacity
Duration: Until failure
Purpose: Find breaking point
Example: Ramp from 1,000 to 10,000 users, find where latency spikes
Soak Test (Endurance):
What: Expected load for extended period
Duration: 4-24 hours
Purpose: Find memory leaks, connection exhaustion, disk fill
Example: 1,000 concurrent users for 12 hours straight
Spike Test:
What: Sudden, extreme traffic burst
Duration: Minutes
Purpose: Test auto-scaling, queuing, graceful degradation
Example: 100 → 5,000 users in 30 seconds, then back to 100
Breakpoint Test:
What: Incremental increase until failure
Duration: Variable
Purpose: Find exact capacity limits
Example: Add 100 users every minute, record when errors start
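The breakpoint recipe above ("add 100 users every minute, record when errors start") can be sketched as two small helpers: one generates the ramp schedule, the other reports the first load step whose error rate crosses a threshold. All names and the 1% threshold here are illustrative assumptions, not part of any tool's API.

```javascript
// Sketch of a breakpoint test schedule: add a fixed number of users per step
// until an error-rate probe trips. Names and thresholds are illustrative.
function breakpointSchedule(startUsers, stepUsers, maxUsers, stepSeconds) {
  const stages = [];
  for (let u = startUsers; u <= maxUsers; u += stepUsers) {
    stages.push({ users: u, holdSeconds: stepSeconds });
  }
  return stages;
}

// The first stage whose measured error rate exceeds the threshold marks capacity.
function findBreakpoint(stages, errorRateByUsers, threshold = 0.01) {
  for (const stage of stages) {
    if ((errorRateByUsers[stage.users] ?? 0) > threshold) {
      return stage.users;
    }
  }
  return null; // no breakpoint observed within the tested range
}
```

For example, ramping 100 → 500 users in steps of 100, with errors first appearing at 400 users, `findBreakpoint` reports 400 as the capacity limit.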
Load Profile Design
```javascript
// k6 load test: realistic user journey
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

const errorRate = new Rate('errors');
const responseTime = new Trend('response_time');

export const options = {
  scenarios: {
    // Breakpoint test: find scalability limits
    breakpoint: {
      executor: 'ramping-arrival-rate',
      startRate: 10, // 10 requests/second
      timeUnit: '1s',
      preAllocatedVUs: 500,
      maxVUs: 5000,
      stages: [
        { target: 50, duration: '2m' },   // Ramp to 50 rps
        { target: 100, duration: '2m' },  // Ramp to 100 rps
        { target: 200, duration: '2m' },  // Ramp to 200 rps
        { target: 500, duration: '2m' },  // Ramp to 500 rps
        { target: 1000, duration: '2m' }, // Ramp to 1000 rps
        { target: 2000, duration: '2m' }, // Ramp to 2000 rps
      ],
    },
  },
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    errors: ['rate<0.01'],
  },
};

export default function () {
  // Realistic user journey (not just hammering one endpoint)

  // 1. Homepage
  let res = http.get('https://app.example.com/');
  check(res, { 'homepage 200': (r) => r.status === 200 });
  responseTime.add(res.timings.duration);
  sleep(Math.random() * 3 + 1); // Realistic think time

  // 2. Login
  res = http.post(
    'https://app.example.com/api/auth/login',
    JSON.stringify({ email: 'test@example.com', password: 'test' }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  check(res, { 'login 200': (r) => r.status === 200 });
  sleep(Math.random() * 2 + 1);

  // 3. Dashboard (typically the heaviest page)
  res = http.get('https://app.example.com/api/dashboard');
  check(res, { 'dashboard 200': (r) => r.status === 200 });
  errorRate.add(res.status !== 200);
}
```
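When sizing the VU pool for an arrival-rate scenario like the one above, Little's Law is a useful back-of-the-envelope check: required concurrency ≈ arrival rate × (response time + think time). A minimal sketch, with an illustrative function name:

```javascript
// Little's Law sketch: estimate concurrent users (VUs) needed to sustain a
// target arrival rate, given average response time and think time per request.
// requiredVUs is an illustrative name, not a k6 API.
function requiredVUs(arrivalRatePerSec, avgResponseSec, avgThinkSec) {
  return Math.ceil(arrivalRatePerSec * (avgResponseSec + avgThinkSec));
}

// e.g. 200 rps with 0.3s responses and 2.5s of think time needs ~560 VUs,
// which suggests a preAllocatedVUs well above 500 for that stage.
```

This is an average-case estimate; real pools should include headroom for latency spikes, which is why the script above sets `maxVUs` well above `preAllocatedVUs`.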
Bottleneck Analysis
Common bottlenecks by layer:
Application Layer:
☐ Thread pool exhaustion
☐ Memory leaks (soak test reveals)
☐ Synchronous blocking calls
☐ N+1 query patterns
☐ Missing connection pooling
Database Layer:
☐ Connection pool exhaustion
☐ Lock contention
☐ Missing indexes
☐ Full table scans
☐ Slow queries under concurrent load
Infrastructure Layer:
☐ CPU saturation
☐ Network bandwidth limits
☐ Disk I/O bottleneck
☐ Load balancer connection limits
☐ Auto-scaling too slow
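Whichever layer the bottleneck lives in, it usually shows up in ramped-load results as a "knee": the load step where latency stops growing gradually and jumps. A sketch of detecting that knee from step results, assuming a simple `{ rps, p95Ms }` sample shape and an illustrative 2x growth factor:

```javascript
// Sketch: locate the "knee" in ramped-load results, i.e. the first load step
// where p95 latency grows disproportionately versus the previous step.
// The sample shape and 2x growth factor are illustrative assumptions.
function findLatencyKnee(samples, growthFactor = 2) {
  // samples: [{ rps, p95Ms }], sorted by ascending rps
  for (let i = 1; i < samples.length; i++) {
    if (samples[i].p95Ms > samples[i - 1].p95Ms * growthFactor) {
      return samples[i].rps; // load level where latency spiked
    }
  }
  return null; // latency scaled roughly smoothly across the tested range
}
```

For example, p95 results of 120 ms at 50 rps, 140 ms at 100 rps, 160 ms at 200 rps, and 900 ms at 500 rps put the knee at 500 rps; that is the load level to profile layer by layer.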
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Test only single endpoint | Miss realistic multi-step bottlenecks | Full user journey load profiles |
| No think time between requests | Unrealistic, overstates capacity | Randomized delays like real users |
| Test against production | Impacts real users | Dedicated performance environment |
| Skip soak testing | Memory leaks ship to production | 8-12 hour soak tests quarterly |
| Load test once before launch | System behavior changes over time | Monthly load testing in CI/CD |
The best time to find a scalability bottleneck is in a test. The worst time is during Black Friday, with 10x normal traffic and your entire team scrambling on Slack.