
Load Testing: Knowing Your Breaking Point Before Production Does

Design and execute load tests that reveal performance bottlenecks, capacity limits, and failure modes before your users discover them. Covers test design, tooling, realistic workload modeling, result interpretation, and continuous load testing in CI/CD.

Load testing answers one question: what happens to your system under pressure? Not what you think happens, not what the architecture diagram says should happen, but what actually happens when 10,000 users hit your checkout flow simultaneously at 2 PM on Black Friday.

Most teams discover their breaking point in production. Load testing discovers it in a controlled environment where the consequences are a report — not an outage.


Types of Load Tests

Smoke Test

Minimal load to verify the test setup works:

Users: 1-5
Duration: 1-2 minutes
Purpose: Validate test scripts, endpoints, authentication

Load Test

Expected peak traffic to verify performance meets SLAs:

Users: Expected peak concurrent users
Duration: 30-60 minutes
Purpose: Verify latency, throughput, error rate under normal peak
Success: P99 < 500ms, error rate < 0.1%

Stress Test

Beyond expected peak to find the breaking point:

Users: 2x-5x expected peak, ramped gradually
Duration: Until failure or 60 minutes
Purpose: Find where the system degrades and how it fails

Soak Test

Sustained load over hours to find memory leaks and resource exhaustion:

Users: 70% of peak
Duration: 4-12 hours
Purpose: Memory leaks, connection exhaustion, log rotation, disk fill

Spike Test

Sudden traffic burst to test auto-scaling and circuit breakers:

Users: 0 → 10x peak → 0 in 60 seconds
Purpose: Auto-scaling response time, queue overflow, connection storms
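
The five test types differ mainly in their load profile over time. As a sketch, here is how each shape might be expressed as a k6-style `stages` array (targets are hypothetical; substitute your own expected peak), plus a small helper to compute each profile's total duration:

```javascript
// Hypothetical k6-style stage profiles for each test type (shape only;
// the target numbers assume an expected peak of ~1,000 concurrent users).
const profiles = {
  smoke:  [{ duration: '2m', target: 5 }],
  load:   [{ duration: '5m', target: 1000 }, { duration: '45m', target: 1000 },
           { duration: '5m', target: 0 }],
  stress: [{ duration: '10m', target: 1000 }, { duration: '10m', target: 3000 },
           { duration: '10m', target: 5000 }],
  soak:   [{ duration: '10m', target: 700 }, { duration: '8h', target: 700 }],
  spike:  [{ duration: '30s', target: 10000 }, { duration: '30s', target: 0 }],
};

// Parse a k6-style duration string ('30s', '5m', '8h') into seconds.
function toSeconds(d) {
  const units = { s: 1, m: 60, h: 3600 };
  return parseInt(d, 10) * units[d.slice(-1)];
}

// Total wall-clock time of a profile, in seconds.
function totalSeconds(stages) {
  return stages.reduce((sum, s) => sum + toSeconds(s.duration), 0);
}

console.log(totalSeconds(profiles.spike)); // 60 — matches the 60-second spike above
```

Thinking in profiles like this makes it easy to see that a "test type" is mostly a choice of ramp shape, not a different tool or script.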

Designing Realistic Workloads

The most common load testing mistake is testing the wrong thing. A load test that hammers /api/health at 50,000 RPS tells you nothing about your checkout flow.

Traffic Analysis

Start with production traffic patterns:

Real traffic distribution:
  GET  /api/products          35%   (browse catalog)
  GET  /api/products/:id      25%   (view product)
  POST /api/cart/items        15%   (add to cart)
  GET  /api/cart              10%   (view cart)
  POST /api/orders             8%   (checkout)
  POST /api/auth/login         5%   (login)
  Other                        2%
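
A load test can reproduce this mix with weighted random sampling. A minimal sketch, using the percentages from the table above as weights:

```javascript
// Weighted endpoint sampler mirroring the production traffic mix above.
const mix = [
  { endpoint: 'GET /api/products',     weight: 35 },
  { endpoint: 'GET /api/products/:id', weight: 25 },
  { endpoint: 'POST /api/cart/items',  weight: 15 },
  { endpoint: 'GET /api/cart',         weight: 10 },
  { endpoint: 'POST /api/orders',      weight: 8 },
  { endpoint: 'POST /api/auth/login',  weight: 5 },
  { endpoint: 'other',                 weight: 2 },
];

// Pick one endpoint at random, proportionally to its weight.
// `rand` is injectable for testing; defaults to Math.random().
function pickEndpoint(mix, rand = Math.random()) {
  const total = mix.reduce((sum, e) => sum + e.weight, 0);
  let r = rand * total;
  for (const e of mix) {
    r -= e.weight;
    if (r < 0) return e.endpoint;
  }
  return mix[mix.length - 1].endpoint; // guard against floating-point edge case
}
```

Each virtual user calls `pickEndpoint(mix)` per iteration, so over many iterations the generated traffic converges on the production distribution.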

User Journeys

Model complete user flows, not individual endpoints:

// k6 user scenario
import http from 'k6/http';
import { sleep } from 'k6';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.2.0/index.js';

// Base URL supplied at runtime, e.g. k6 run -e BASE_URL=https://staging.example.com
const BASE = __ENV.BASE_URL;

export default function () {
  // 1. Browse products (think time: 3-5s)
  const products = http.get(`${BASE}/api/products`);
  sleep(randomIntBetween(3, 5));

  // 2. View product detail
  const productId = products.json('data.0.id');
  http.get(`${BASE}/api/products/${productId}`);
  sleep(randomIntBetween(2, 4));

  // 3. Add to cart
  http.post(`${BASE}/api/cart/items`, JSON.stringify({
    productId: productId,
    quantity: 1,
  }), { headers: { 'Content-Type': 'application/json' } });
  sleep(randomIntBetween(1, 2));

  // 4. Checkout (20% of users who add to cart)
  if (Math.random() < 0.2) {
    http.post(`${BASE}/api/orders`, JSON.stringify({
      paymentMethod: 'card',
    }), { headers: { 'Content-Type': 'application/json' } });
  }
}

Think Time

Real users pause between actions. Without think time, your test generates 10x more traffic per virtual user than reality, which distorts results.


Tool Selection

Tool       Language     Strengths                                          Scale
k6         JavaScript   Developer-friendly, CI integration, cloud option   100K+ VUs
Locust     Python       Pythonic, distributed, real-time UI                50K+ VUs
Gatling    Scala/Java   JVM performance, detailed reports                  100K+ VUs
Artillery  YAML/JS      Simple config, serverless option                   10K+ VUs
JMeter     GUI/XML      Feature-rich, legacy standard                      10K+ VUs

k6 Example

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 },   // Ramp up
    { duration: '5m', target: 100 },   // Steady state
    { duration: '2m', target: 500 },   // Stress
    { duration: '5m', target: 500 },   // Sustained stress
    { duration: '2m', target: 0 },     // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(99)<1000'],  // 99% of requests under 1s
    http_req_failed: ['rate<0.01'],     // Less than 1% failure rate
  },
};

export default function() {
  const res = http.get('https://api.example.com/orders');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}

Interpreting Results

Key Metrics

Throughput:    2,847 req/s (target: 2,000)     ✅
P50 latency:  45ms                             ✅
P95 latency:  180ms                            ✅
P99 latency:  890ms (target: <1000ms)          ✅ (barely)
Error rate:   0.02%                            ✅
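
Percentiles like P50 and P99 come from sorting the raw latency samples and taking the value at the given rank. A minimal nearest-rank sketch (tools like k6 compute this for you; the sample values below are illustrative):

```javascript
// Nearest-rank percentile over a list of latency samples (milliseconds).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Ten illustrative samples: mostly fast, with one slow outlier.
const latencies = [45, 52, 48, 180, 44, 47, 890, 46, 51, 49];

console.log(percentile(latencies, 50)); // 48  — the typical request
console.log(percentile(latencies, 99)); // 890 — the tail the median hides
```

A single 890ms outlier barely moves the median but defines the P99, which is why the P99 >> P50 gap in the warning signs below is such a useful signal.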

Warning Signs

  • Latency increases linearly with load: Normal until saturation, then exponential growth indicates a bottleneck
  • P99 >> P50: High tail latency suggests queuing or contention
  • Error rate spikes at specific load: Capacity limit hit (connection pool, thread pool, database connections)
  • Throughput plateaus while latency climbs: System is saturated — adding more load makes it worse

Finding Bottlenecks

During load test, monitor:
  CPU utilization     → per service, per database
  Memory usage        → growing = possible leak
  Connection pools    → database, HTTP client, Redis
  Queue depths        → message queues, thread pools
  Disk I/O            → database write-ahead log, log files
  Network bandwidth   → cross-AZ traffic, NAT gateway

Continuous Load Testing

In CI/CD Pipeline

Run load tests on every PR that changes performance-critical code:

# GitHub Actions
- name: Load Test
  run: |
    k6 run --out json=results.json tests/load/checkout.js
    
- name: Check Thresholds
  run: |
    python scripts/check_load_results.py results.json
    # Fails if P99 > 500ms or error rate > 0.5%
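
The pipeline above calls a Python script, but the gating logic is language-agnostic. A hypothetical sketch of the same check in JavaScript (the `summary` shape and limit values are assumptions, matching the thresholds in the comment above):

```javascript
// Hypothetical threshold gate: return the list of breaches so the CI job
// can fail the build when the list is non-empty.
function checkThresholds(summary, limits = { p99Ms: 500, errorRate: 0.005 }) {
  const failures = [];
  if (summary.p99Ms > limits.p99Ms) {
    failures.push(`P99 ${summary.p99Ms}ms exceeds ${limits.p99Ms}ms`);
  }
  if (summary.errorRate > limits.errorRate) {
    failures.push(`error rate ${summary.errorRate} exceeds ${limits.errorRate}`);
  }
  return failures; // empty = pass; non-empty = fail the build
}

checkThresholds({ p99Ms: 340, errorRate: 0.0002 }); // [] — passes
checkThresholds({ p99Ms: 890, errorRate: 0.02 });   // two breaches — fails
```

Returning the breaches rather than a boolean means the CI log shows exactly which limit was violated and by how much.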

Baseline Comparison

Compare every test run against a baseline:

Baseline (v2.3.0):  P99 = 340ms,  throughput = 2,847 req/s
Current  (v2.4.0):  P99 = 890ms,  throughput = 2,102 req/s
Regression:         P99 +161%,    throughput -26%    ← FAIL
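
The regression percentages above are plain relative change against the baseline. A small sketch reproducing them (truncating toward zero, consistent with the figures shown):

```javascript
// Relative change of current vs. baseline, as a whole percentage.
// Positive is worse for latency; negative is worse for throughput.
function pctChange(baseline, current) {
  return Math.trunc(((current - baseline) / baseline) * 100);
}

console.log(pctChange(340, 890));   // 161  — P99 regression from the table above
console.log(pctChange(2847, 2102)); // -26  — throughput drop
```

Keeping this comparison in the pipeline, with the previous release tagged as the baseline, turns a silent regression into a failing build.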

Anti-Patterns

Anti-Pattern                      Consequence                                  Fix
Testing only happy paths          Failures under load are undiscovered         Include error scenarios, auth failures, timeout paths
No think time                     Unrealistically high request rate per user   Add realistic pauses between actions
Testing against shared staging    Results vary based on other activity        Dedicated load test environment
Running from a single location    Tests your test machine, not your system     Distribute load generators across regions
Load testing once before launch   Performance degrades silently                Continuous load testing in CI

Load testing is not a phase — it is a practice. Systems that are never load tested will surprise you. Systems that are continuously load tested will not.

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
