# Performance Testing at Scale
Design and execute performance tests that reveal bottlenecks before production load does. Covers load testing, stress testing, spike testing, soak testing, breakpoint testing, performance baselines, and the tools and patterns for testing at production scale.
Performance testing answers the question: “Can our system handle the load we expect — and what happens when it cannot?” The alternative to performance testing is production incidents that answer the same question more painfully.
## Types of Performance Testing
- **Load Testing:** expected production load → sustained for a duration. "Can we handle 1,000 concurrent users for 1 hour?"
- **Stress Testing:** gradually increase load beyond capacity → find the breaking point. "At what point do we start failing?"
- **Spike Testing:** sudden surge of traffic → measure recovery. "What happens during a flash sale?"
- **Soak Testing:** expected load → sustained for an extended period (24-72 hours). "Do we have memory leaks or resource exhaustion?"
- **Breakpoint Testing:** systematically increase load → find the exact threshold. "How many concurrent users until p95 latency > 500ms?"
## k6 Load Testing
```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';
import { htmlReport } from 'https://raw.githubusercontent.com/benc-uk/k6-reporter/main/dist/bundle.js';

export const options = {
  stages: [
    { duration: '2m', target: 100 }, // Ramp up
    { duration: '5m', target: 100 }, // Steady state
    { duration: '2m', target: 500 }, // Stress
    { duration: '5m', target: 500 }, // Sustained stress
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'], // Latency
    http_req_failed: ['rate<0.01'],                 // Error rate < 1%
    http_reqs: ['rate>100'],                        // Throughput > 100 rps
  },
};

export default function () {
  // Simulate a realistic user flow: log in → browse → order
  const loginRes = http.post(
    'https://api.example.com/login',
    JSON.stringify({
      email: `user${__VU}@example.com`, // unique user per virtual user
      password: 'test123',
    }),
    { headers: { 'Content-Type': 'application/json' } },
  );
  check(loginRes, {
    'login status 200': (r) => r.status === 200,
    'login latency < 500ms': (r) => r.timings.duration < 500,
  });

  const token = loginRes.json().token;
  const authHeaders = {
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json',
    },
  };

  // Browse orders
  const ordersRes = http.get('https://api.example.com/orders', authHeaders);
  check(ordersRes, {
    'orders status 200': (r) => r.status === 200,
    'orders latency < 300ms': (r) => r.timings.duration < 300,
  });

  sleep(1); // Think time between actions

  // Create an order
  const orderRes = http.post(
    'https://api.example.com/orders',
    JSON.stringify({ items: [{ product_id: 'prod_1', quantity: 1 }] }),
    authHeaders,
  );
  check(orderRes, {
    'order created': (r) => r.status === 201,
    'order latency < 500ms': (r) => r.timings.duration < 500,
  });

  sleep(Math.random() * 3); // Random think time
}

export function handleSummary(data) {
  return { 'report.html': htmlReport(data) };
}
```
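Run it with `k6 run performance/load-test.js`. Two properties make this script CI-friendly: the remote `htmlReport` import is fetched by k6 at run time, so nothing needs installing beyond k6 itself, and k6 exits non-zero whenever a threshold fails, so the same thresholds double as a pass/fail gate.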
## Performance Baselines
```yaml
baseline_metrics:
  api_latency:
    p50: 50ms   # Median response time
    p95: 200ms  # 95th percentile
    p99: 500ms  # 99th percentile
  throughput:
    steady_state: 500 rps
    peak: 2000 rps
  error_rate:
    target: "< 0.1%"
    alert: "> 1%"
  resource_utilization:
    cpu: "< 70% at steady state"
    memory: "< 80% at steady state"
    database_connections: "< 70% of pool"

degradation_thresholds:
  acceptable: "p95 increases < 50% under 2x load"
  concerning: "p95 increases > 100% under 2x load"
  critical: "errors appear under normal load"
```
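A baseline is most useful when it is executable. One way (a minimal sketch; the numbers mirror the YAML above and the endpoint is an assumption) is to encode it directly as k6 thresholds, so any run that violates the baseline fails:

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

// baseline_metrics encoded as executable k6 thresholds.
// Regenerate these numbers whenever the agreed baseline changes.
export const options = {
  thresholds: {
    http_req_duration: ['p(50)<50', 'p(95)<200', 'p(99)<500'],
    // An error rate past the 0.1% target aborts the run early rather
    // than burning the full duration on a known failure.
    http_req_failed: [{ threshold: 'rate<0.001', abortOnFail: true }],
    http_reqs: ['rate>500'], // steady-state throughput floor
  },
};

export default function () {
  http.get('https://api.example.com/orders'); // assumed representative endpoint
  sleep(1);
}
```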
## CI/CD Integration
```yaml
# Performance gate in the deployment pipeline
performance_test:
  stage: pre-production
  script:
    - k6 run --out json=results.json performance/load-test.js
  thresholds:
    - "p95 latency < baseline * 1.1"    # No more than 10% latency regression
    - "error rate < 0.1%"
    - "throughput >= baseline * 0.95"   # No more than 5% throughput loss
  on_failure: block_deployment

schedule:
  - on_merge: smoke test (1 minute, 10 users)
  - nightly: full load test (30 minutes, production load)
  - weekly: soak test (24 hours, production load)
```
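The `baseline * 1.1` gate needs something concrete to compare against. A minimal sketch, assuming the k6 script also writes its end-of-test summary from `handleSummary` (e.g. `'summary.json': JSON.stringify(data)`) and that a `baseline.json` with the same shape is committed alongside the tests; `compare-baseline.js` is a hypothetical helper, not part of k6:

```javascript
// compare-baseline.js (hypothetical helper): fails the pipeline when the
// current run regresses more than 10% against the stored baseline.
const fs = require('fs');

const current = JSON.parse(fs.readFileSync('summary.json', 'utf8'));
const baseline = JSON.parse(fs.readFileSync('baseline.json', 'utf8'));

// k6's summary data exposes trend percentiles under metrics.<name>.values
const p95 = current.metrics.http_req_duration.values['p(95)'];
const baseP95 = baseline.metrics.http_req_duration.values['p(95)'];
const errorRate = current.metrics.http_req_failed.values.rate;

const failures = [];
if (p95 > baseP95 * 1.1) {
  failures.push(`p95 ${p95.toFixed(1)}ms exceeds baseline ${baseP95.toFixed(1)}ms by >10%`);
}
if (errorRate > 0.001) {
  failures.push(`error rate ${(errorRate * 100).toFixed(2)}% exceeds 0.1%`);
}

if (failures.length > 0) {
  failures.forEach((f) => console.error(f));
  process.exit(1); // non-zero exit blocks the deployment
}
console.log('Performance gate passed');
```

A passing nightly run can then overwrite `baseline.json`, so the baseline tracks intentional improvements rather than drifting stale.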
## Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Testing against dev environment | Results not representative | Test against production-like infrastructure |
| No think time between requests | Unrealistic thundering herd | Add realistic think time and user flows |
| Testing with a single user type | Slow paths go untested | Test all critical user flows |
| No baseline comparison | Cannot detect regressions | Establish and track baselines |
| Performance testing only before launch | Regressions go undetected | Continuous performance testing in CI |
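Two of these fixes (realistic think time, multiple user types) combine naturally in k6 scenarios, which run several user populations concurrently from one script. A minimal sketch, with executor choices, the traffic split, and endpoints as illustrative assumptions:

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

// Two user populations hitting the same system at the same time.
export const options = {
  scenarios: {
    browsers: {
      executor: 'ramping-vus',
      exec: 'browseFlow',
      startVUs: 0,
      stages: [
        { duration: '2m', target: 90 },  // ramp up the readers
        { duration: '10m', target: 90 }, // steady browsing load
      ],
    },
    buyers: {
      executor: 'constant-vus',
      exec: 'checkoutFlow',
      vus: 10, // ~10% of users actually buy
      duration: '12m',
    },
  },
};

export function browseFlow() {
  http.get('https://api.example.com/orders');
  sleep(1 + Math.random() * 2); // realistic think time, not a thundering herd
}

export function checkoutFlow() {
  http.post(
    'https://api.example.com/orders',
    JSON.stringify({ items: [{ product_id: 'prod_1', quantity: 1 }] }),
    { headers: { 'Content-Type': 'application/json' } },
  );
  sleep(2 + Math.random() * 3); // buyers pause longer between actions
}
```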
Performance testing is not a one-time activity. It is a continuous practice that catches regressions before they reach production and validates that your system can handle the load it was designed for.