
Scalability Testing and Load Modeling

Test how systems behave under increasing load and model capacity boundaries. Covers load profile design, stress testing, soak testing, spike testing, bottleneck identification, and the patterns that reveal scalability limits before users do.

Scalability testing answers: “At what point does this system break, and why?” Performance testing tells you if your system is fast enough today. Scalability testing tells you if it will survive tomorrow’s growth. The goal is to find and fix bottlenecks before they find you — in production, during a traffic spike, at 2 AM.


Types of Scalability Tests

Load Test:
  What: Expected production traffic
  Duration: 30-60 minutes
  Purpose: Validate normal performance
  Example: 1,000 concurrent users, steady state
  
Stress Test:
  What: Beyond expected capacity
  Duration: Until failure
  Purpose: Find breaking point
  Example: Ramp from 1,000 to 10,000 users, find where latency spikes
  
Soak Test (Endurance):
  What: Expected load for extended period
  Duration: 4-24 hours
  Purpose: Find memory leaks, connection exhaustion, disk fill
  Example: 1,000 concurrent users for 12 hours straight
  
Spike Test:
  What: Sudden, extreme traffic burst
  Duration: Minutes
  Purpose: Test auto-scaling, queuing, graceful degradation
  Example: 100 → 5,000 users in 30 seconds, then back to 100
  
Breakpoint Test:
  What: Incremental increase until failure
  Duration: Variable
  Purpose: Find exact capacity limits
  Example: Add 100 users every minute, record when errors start
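Before running any of these, you need to size the virtual-user pool. Little's Law (concurrency = arrival rate × time each request occupies a user) gives a quick estimate. The sketch below is illustrative; the numbers are not from any real system, and real scripts should add headroom for latency spikes near the breaking point.

```javascript
// Little's Law sketch: each virtual user is "occupied" for the response
// time plus its think time, so the VUs needed to sustain a target rate is
// rate × (response + think), rounded up. Numbers below are illustrative.
function requiredVus(targetRps, avgResponseSec, avgThinkSec) {
  return Math.ceil(targetRps * (avgResponseSec + avgThinkSec));
}

// e.g. sustaining 2,000 rps with 0.5 s responses and 4 s average think time:
console.log(requiredVus(2000, 0.5, 4)); // 9000 VUs
```

This is why arrival-rate executors need a generous maxVUs: as latency degrades under load, each user is occupied longer, and the same request rate demands more of them.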

Load Profile Design

// k6 load test: Realistic user journey
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

const errorRate = new Rate('errors');
const responseTime = new Trend('response_time');

export const options = {
  scenarios: {
    // Breakpoint test: Find scalability limits
    breakpoint: {
      executor: 'ramping-arrival-rate',
      startRate: 10,           // 10 requests/second
      timeUnit: '1s',
      preAllocatedVUs: 500,
      maxVUs: 5000,
      stages: [
        { target: 50, duration: '2m' },    // Ramp to 50 rps
        { target: 100, duration: '2m' },   // Ramp to 100 rps
        { target: 200, duration: '2m' },   // Ramp to 200 rps
        { target: 500, duration: '2m' },   // Ramp to 500 rps
        { target: 1000, duration: '2m' },  // Ramp to 1000 rps
        { target: 2000, duration: '2m' },  // Ramp to 2000 rps
      ],
    },
  },
  thresholds: {
    'http_req_duration': ['p(95)<500', 'p(99)<1000'],
    'errors': ['rate<0.01'],
  },
};

export default function () {
  // Realistic user journey (not just hammering one endpoint)
  
  // 1. Homepage
  let res = http.get('https://app.example.com/');
  check(res, { 'homepage 200': (r) => r.status === 200 });
  errorRate.add(res.status !== 200); // feed the 'errors' threshold at every step
  responseTime.add(res.timings.duration);
  sleep(Math.random() * 3 + 1); // Realistic think time
  
  // 2. Login
  res = http.post('https://app.example.com/api/auth/login', 
    JSON.stringify({ email: 'test@example.com', password: 'test' }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  check(res, { 'login 200': (r) => r.status === 200 });
  errorRate.add(res.status !== 200);
  sleep(Math.random() * 2 + 1);
  
  // 3. Dashboard (typically the heaviest page)
  res = http.get('https://app.example.com/api/dashboard');
  check(res, { 'dashboard 200': (r) => r.status === 200 });
  errorRate.add(res.status !== 200);
}
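The same journey can be reused for a soak run by swapping the scenario. The sketch below uses k6's `constant-vus` executor (a fixed VU count with no ramping); the VU count and duration are illustrative, not a recommendation for any particular system.

```javascript
// Soak variant of the same script: hold steady load for hours instead of
// ramping. Swap this in as `options` in place of the breakpoint scenario.
// Targets below are illustrative only.
const soakOptions = {
  scenarios: {
    soak: {
      executor: 'constant-vus', // fixed VU count, no ramping
      vus: 1000,
      duration: '12h',
    },
  },
  thresholds: {
    'http_req_duration': ['p(95)<500'],
    'errors': ['rate<0.01'],
  },
};
```

Long soak runs are where slow leaks show up: a heap that grows 1% per hour is invisible in a 30-minute load test but fatal over a weekend.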

Bottleneck Analysis

Common bottlenecks by layer:

Application Layer:
  ☐ Thread pool exhaustion
  ☐ Memory leaks (soak test reveals)
  ☐ Synchronous blocking calls
  ☐ N+1 query patterns
  ☐ Missing connection pooling

Database Layer:
  ☐ Connection pool exhaustion
  ☐ Lock contention
  ☐ Missing indexes
  ☐ Full table scans
  ☐ Slow queries under concurrent load

Infrastructure Layer:
  ☐ CPU saturation
  ☐ Network bandwidth limits
  ☐ Disk I/O bottleneck
  ☐ Load balancer connection limits
  ☐ Auto-scaling too slow
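Whatever the layer, a breakpoint run produces the same raw signal: per-stage throughput and latency pairs. The knee can be found mechanically by flagging the first load level where latency grows disproportionately versus the previous one. The data below is fabricated for illustration; the "more than doubles" heuristic is one simple choice, not a standard.

```javascript
// Find the "knee" in breakpoint-test results: the first load level whose
// p95 latency is more than `factor` times the previous level's.
// Sample data is fabricated for illustration.
const samples = [
  { rps: 50,   p95Ms: 120 },
  { rps: 100,  p95Ms: 130 },
  { rps: 200,  p95Ms: 145 },
  { rps: 500,  p95Ms: 180 },
  { rps: 1000, p95Ms: 900 },  // latency spikes here
  { rps: 2000, p95Ms: 4500 },
];

function findKnee(data, factor = 2) {
  for (let i = 1; i < data.length; i++) {
    if (data[i].p95Ms > factor * data[i - 1].p95Ms) return data[i].rps;
  }
  return null; // no knee within the tested range
}

console.log(findKnee(samples)); // 1000
```

Once the knee is located, the bottleneck checklists above tell you where to look first: correlate the failing load level with CPU, connection-pool, and lock metrics captured during that stage.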

Anti-Patterns

Anti-Pattern                      Consequence                            Fix
Test only a single endpoint       Misses multi-step bottlenecks          Full user-journey load profiles
No think time between requests    Unrealistic; overstates capacity       Randomized delays, like real users
Test against production           Impacts real users                     Dedicated performance environment
Skip soak testing                 Memory leaks ship to production        8-12 hour soak tests quarterly
Load test once before launch      System behavior changes over time      Monthly load testing in CI/CD

The best time to find a scalability bottleneck is in a test. The worst time is during Black Friday, with 10x normal traffic and your entire team scrambling on Slack.

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
