
Load Testing Architecture

Design and execute load tests that accurately predict production behavior. Covers load testing tools, test scenario design, realistic traffic patterns, performance baselines, bottleneck identification, and the patterns that prevent performance surprises at scale.

A system that passes functional tests can still collapse under load. Load testing reveals the breaking point: where latency spikes, where errors start, where the database locks, where the queue backs up. Without load testing, your first real load test is production traffic on launch day.


Load Testing Types

Smoke Test:
  Purpose: Verify system handles minimal load
  Load: 1-5 virtual users (VUs)
  Duration: 1-2 minutes
  When: After every deployment

Load Test:
  Purpose: Validate performance at expected traffic
  Load: Expected peak traffic level
  Duration: 15-60 minutes
  When: Before major releases

Stress Test:
  Purpose: Find the breaking point
  Load: Gradually increase beyond capacity
  Duration: Until system degrades
  When: Quarterly capacity planning

Spike Test:
  Purpose: Test sudden traffic bursts
  Load: Sudden 10x traffic spike
  Duration: Spike for 5 minutes then return to normal
  When: Before marketing campaigns, product launches

Soak Test (Endurance):
  Purpose: Find memory leaks, connection exhaustion
  Load: Normal traffic level
  Duration: 4-24 hours
  When: After major architectural changes
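
The five profiles above map naturally onto k6 `stages` arrays. The following is a minimal sketch, assuming a hypothetical helper named `buildStages`. The ramp times, the 3x stress ceiling, and the 8-hour soak are illustrative picks from the ranges listed, not values prescribed by k6.

```javascript
// Hypothetical helper: returns a k6-style `stages` array for a test type.
// `normalVUs` and `peakVUs` are assumed inputs (typical and expected-peak load).
function buildStages(type, { normalVUs, peakVUs }) {
  switch (type) {
    case 'smoke':
      // Minimal load for a couple of minutes
      return [
        { duration: '30s', target: 3 },
        { duration: '1m', target: 3 },
      ];
    case 'load':
      // Ramp to expected peak, hold, ramp down
      return [
        { duration: '5m', target: peakVUs },
        { duration: '30m', target: peakVUs },
        { duration: '5m', target: 0 },
      ];
    case 'stress': {
      // Step up in 50% increments to 3x expected peak (assumed ceiling)
      const stages = [];
      for (let factor = 0.5; factor <= 3; factor += 0.5) {
        stages.push({ duration: '5m', target: Math.round(peakVUs * factor) });
      }
      stages.push({ duration: '5m', target: 0 });
      return stages;
    }
    case 'spike':
      // Jump to 10x normal within seconds, hold 5 minutes, return to normal
      return [
        { duration: '2m', target: normalVUs },
        { duration: '10s', target: normalVUs * 10 },
        { duration: '5m', target: normalVUs * 10 },
        { duration: '10s', target: normalVUs },
        { duration: '2m', target: normalVUs },
        { duration: '1m', target: 0 },
      ];
    case 'soak':
      // Normal traffic held for hours to surface leaks
      return [
        { duration: '5m', target: normalVUs },
        { duration: '8h', target: normalVUs },
        { duration: '5m', target: 0 },
      ];
    default:
      throw new Error(`unknown test type: ${type}`);
  }
}
```

In a k6 script the result could be passed as `stages` in the exported `options` object, for example `buildStages('spike', { normalVUs: 20, peakVUs: 200 })`.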

Implementation with k6

import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

// Custom metrics
const errorRate = new Rate('errors');
const orderLatency = new Trend('order_latency');

// Test configuration
export const options = {
  stages: [
    { duration: '2m', target: 50 },   // Ramp up
    { duration: '5m', target: 50 },   // Steady state
    { duration: '2m', target: 200 },  // Ramp to peak traffic
    { duration: '5m', target: 200 },  // Sustained peak
    { duration: '2m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],  // Latency SLOs
    errors: ['rate<0.01'],                             // Failed orders < 1% (custom metric)
    http_req_failed: ['rate<0.01'],                    // HTTP failures < 1%
  },
};

// Realistic user scenario
export default function () {
  // 1. Browse homepage
  const home = http.get('https://api.example.com/');
  check(home, { 'homepage 200': (r) => r.status === 200 });
  
  sleep(Math.random() * 3 + 1); // Think time: 1-4 seconds

  // 2. Search products
  const search = http.get('https://api.example.com/products?q=widget');
  check(search, { 'search 200': (r) => r.status === 200 });
  
  sleep(Math.random() * 2 + 1);

  // 3. View product detail
  const product = http.get('https://api.example.com/products/123');
  check(product, { 'product 200': (r) => r.status === 200 });
  
  sleep(Math.random() * 2 + 1);

  // 4. Place order (roughly 10% of iterations)
  if (Math.random() < 0.1) {
    const order = http.post('https://api.example.com/orders',
      JSON.stringify({ product_id: '123', quantity: 1 }),
      { headers: { 'Content-Type': 'application/json' } }
    );

    // Record k6's measured request duration (ms) instead of a wall-clock delta
    orderLatency.add(order.timings.duration);
    // Feed the custom 'errors' rate: true counts as a failure
    errorRate.add(order.status !== 201);

    check(order, { 'order created': (r) => r.status === 201 });
  }
}

Results Analysis

Key metrics to monitor during a load test:

Application Layer:
  ☐ Request latency (P50, P95, P99)
  ☐ Error rate (4xx, 5xx)
  ☐ Throughput (requests per second)
  ☐ Apdex score

Infrastructure Layer:
  ☐ CPU utilization per service
  ☐ Memory usage and GC pressure
  ☐ Network I/O and connection count
  ☐ Disk I/O (for databases)

Database Layer:
  ☐ Query latency (P95)
  ☐ Connection pool utilization
  ☐ Lock contention
  ☐ Replication lag
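
To make the application-layer metrics concrete, here is a minimal sketch of how percentiles and an Apdex score can be computed from raw latency samples. The function names, the sample data, and the 200 ms Apdex threshold are assumptions for illustration. The percentile uses the nearest-rank method, so load testing tools that interpolate may report slightly different values.

```javascript
// Nearest-rank percentile over an array of latencies in milliseconds.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Apdex = (satisfied + tolerating / 2) / total, where
// satisfied: latency <= T, tolerating: T < latency <= 4T.
function apdex(samples, thresholdMs) {
  let satisfied = 0;
  let tolerating = 0;
  for (const ms of samples) {
    if (ms <= thresholdMs) satisfied++;
    else if (ms <= 4 * thresholdMs) tolerating++;
  }
  return (satisfied + tolerating / 2) / samples.length;
}

// Made-up sample: mostly fast requests with two slow outliers
const latencies = [120, 95, 110, 480, 130, 105, 2300, 98, 140, 115];

console.log(percentile(latencies, 50)); // 115
console.log(percentile(latencies, 95)); // 2300
console.log(apdex(latencies, 200));     // 0.85
```

The gap between P50 and P95 in this sample shows why averages alone hide the slow tail that users actually notice.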

Finding bottlenecks:
  1. If CPU is maxed → add compute or optimize hot code paths
  2. If memory is maxed → look for a memory leak or move to larger instances
  3. If DB connections are maxed → add connection pooling or read replicas
  4. If the network is saturated → add a CDN, enable compression, or upgrade bandwidth
  5. If disk I/O is saturated → upgrade to SSDs, optimize queries, or add caching
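
The checklist above can be sketched as a small triage function run against a metrics snapshot taken at peak load. The `findBottlenecks` name, the snapshot field names, and the 0.9 saturation threshold are all assumptions; real thresholds depend on the system and on how each metric is collected.

```javascript
// Assumed saturation threshold: utilization at or above 90% counts as maxed.
const SATURATED = 0.9;

// `m` holds utilization ratios in the range 0..1 (hypothetical field names).
function findBottlenecks(m) {
  const findings = [];
  if (m.cpu >= SATURATED) {
    findings.push('cpu: add compute or optimize hot code paths');
  }
  if (m.memory >= SATURATED) {
    findings.push('memory: look for a leak or move to larger instances');
  }
  if (m.dbConnections >= SATURATED) {
    findings.push('dbConnections: add connection pooling or read replicas');
  }
  if (m.network >= SATURATED) {
    findings.push('network: add a CDN, compression, or bandwidth');
  }
  if (m.diskIo >= SATURATED) {
    findings.push('diskIo: upgrade to SSDs, optimize queries, or add caching');
  }
  return findings;
}

// Example snapshot: CPU and the DB connection pool are saturated
const snapshot = { cpu: 0.95, memory: 0.4, dbConnections: 0.97, network: 0.2, diskIo: 0.1 };
console.log(findBottlenecks(snapshot));
```

More than one finding at once is common; the first resource to saturate as load ramps up is usually the one worth fixing first.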

Anti-Patterns

| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Test from same data center | Network latency not realistic | Test from multiple geographic regions |
| No think time between requests | 100 VUs behave like 10,000 | Add realistic delays between actions |
| Single endpoint only | Miss interaction effects | Full user journey scenarios |
| Test against production database | Load test corrupts real data | Isolated test environment with production-like data |
| Run once, ship | Performance degrades over time | Load tests in CI/CD on every release |
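
The think-time anti-pattern can be checked with simple arithmetic. In a closed model each virtual user loops through request, wait for response, think, so its request rate is one request per (response time + think time). The sketch below uses hypothetical function names and assumed numbers (50 ms responses, 4.95 s of think time) chosen only to illustrate the ratio.

```javascript
// Requests per second generated by `vus` looping virtual users.
function requestsPerSecond(vus, responseMs, thinkMs) {
  return (vus * 1000) / (responseMs + thinkMs);
}

// Number of real users (who pause to think) that would produce the same
// request rate as `vus` virtual users with no think time at all.
function equivalentRealUsers(vus, responseMs, thinkMs) {
  return (vus * (responseMs + thinkMs)) / responseMs;
}

console.log(requestsPerSecond(100, 50, 0));      // 2000
console.log(requestsPerSecond(100, 50, 4950));   // 20
console.log(equivalentRealUsers(100, 50, 4950)); // 10000
```

Under these assumptions, 100 VUs without think time hit the system as hard as 10,000 users who pause about five seconds between actions, which is why a script without delays overstates load per user.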

Load testing is not about finding the maximum number. It is about understanding the relationship between load, latency, and errors — so you can plan capacity, set SLOs, and sleep soundly before launch day.

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
