Load Testing

Prerequisites

Docker running (for test job containers)
Go 1.26+
8GB+ RAM free
The Strait source code

Quick Start (15 minutes)

Build the test job images and run a quick validation:

# Build test job images
cd packages/load-tests && make build && cd ../..

# Run quick validation (finds approximate throughput ceiling)
cd apps/strait
LOADTEST_QUICK=true go test -tags=loadtest -run TestQuickValidation \
  -timeout 15m ./internal/loadtest/...

Test Job Images

The framework includes real workloads in Python, TypeScript, and Go:

Image	Language	What It Does
`strait-loadtest-python`	Python 3.12	Fast processing, CPU-intensive work, AI agent simulation
`strait-loadtest-ts`	TypeScript (Node 22)	Data pipeline with 10K record transform
`strait-loadtest-go`	Go 1.26	Memory allocation for OOM testing
`strait-loadtest-errors`	Python 3.12	12 failure scenarios (OOM, segfault, infinite loop, etc.)

Build all images:

cd packages/load-tests && make build

Full Test Suite

Run the commands in this section from apps/strait, the same directory used in the Quick Start.

Tier 1: Throughput Ceiling

Finds the maximum sustained jobs/sec. Starts at 10 jobs/sec and increases by 10 every 60 seconds until a stop condition is reached.

go test -tags=loadtest -run TestThroughputCeiling -timeout 2h ./internal/loadtest/...

Stop conditions: queue depth > 10K, P99 latency > 5s, or error rate > 1%.

Tier 2: Concurrency Ceiling

Finds the maximum concurrent connections. Starts at 50 concurrent, increases by 50 every 2 minutes.

go test -tags=loadtest -run TestConcurrencyCeiling -timeout 1h ./internal/loadtest/...

Tier 3: Multi-Tenant Simulation

Simulates production-like traffic with hundreds of tenants, mixed plans, and varied traffic patterns.

# 500 tenants, 4 hours
go test -tags=loadtest -run TestProductionSimulation -timeout 6h ./internal/loadtest/...

# 2,000 tenants, 8 hours
LOADTEST_TENANTS=2000 LOADTEST_DURATION=8h \
  go test -tags=loadtest -run TestProductionSimulation -timeout 10h ./internal/loadtest/...

Tier 3: Breaking Point

Adds 100 tenants every 30 minutes until performance degrades.

go test -tags=loadtest -run TestBreakingPoint -timeout 12h ./internal/loadtest/...

Tier 4: Endurance (24 hours)

Runs at 70% of throughput ceiling for 24 hours. Detects memory leaks, goroutine leaks, and performance drift.

LOADTEST_DURATION=24h go test -tags=loadtest -run TestEndurance -timeout 26h ./internal/loadtest/...
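To make "70% of throughput ceiling" and "performance drift" concrete, here is a minimal sketch. It assumes heap usage is sampled at equal intervals and that drift is estimated with a least-squares slope. The source does not describe the harness's detection method, so this is one plausible approach, and `enduranceRate` and `slope` are hypothetical names.

```go
package main

import "fmt"

// enduranceRate returns 70% of a measured throughput ceiling (jobs/sec).
func enduranceRate(ceiling float64) float64 {
	return 0.7 * ceiling
}

// slope returns the least-squares slope of ys, assuming the samples were
// taken at equal intervals. The unit is "y per sample interval".
// A slope near zero suggests a flat trend; a steady positive slope over
// many hours is a hint to look for leaks.
func slope(ys []float64) float64 {
	n := float64(len(ys))
	if n < 2 {
		return 0
	}
	var sumX, sumY, sumXY, sumXX float64
	for i, y := range ys {
		x := float64(i)
		sumX += x
		sumY += y
		sumXY += x * y
		sumXX += x * x
	}
	return (n*sumXY - sumX*sumY) / (n*sumXX - sumX*sumX)
}

func main() {
	fmt.Println(enduranceRate(1000)) // target rate for a 1,000 jobs/sec ceiling

	flat := []float64{100, 101, 99, 100, 100}    // MB, hypothetical samples
	growing := []float64{100, 110, 120, 130, 140} // MB, hypothetical samples
	fmt.Println(slope(flat), slope(growing))
}
```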

Tier 5: Chaos Engineering

Injects failures on purpose while production-like load is running. Eight scenarios: worker kill, database failover, Redis failure, Docker restart, connection pool exhaustion, disk pressure, clock skew, and cascading failure.

go test -tags=loadtest -run TestChaosAll -timeout 4h ./internal/loadtest/...

Error Scenarios

Tests all 12 failure modes, including clean exit, exit codes, OOM, segfault, infinite loop, slow death, checkpoint recovery, SDK timeout, fork bomb, disk fill, and network abuse.

go test -tags=loadtest -run TestErrorScenarios -timeout 1h ./internal/loadtest/...

Generating Reports

After running tests, generate HTML and JSON reports:

go run -tags=loadtest ./internal/loadtest/cmd/report \
  -input loadtest-results/latest/ \
  -html report.html -json report.json

The HTML report includes:

Executive summary with key metrics
Throughput and concurrency ramp tables
Multi-tenant simulation results
Chaos engineering verdicts
Error scenario pass/fail matrix

Grafana Dashboard

The load test environment includes a pre-configured Grafana dashboard for real-time monitoring.

Setup

# Start the full load test stack
cd apps/strait
docker compose -f docker-compose.loadtest.yml up -d

# Open Grafana (default login: admin/admin); `open` is the macOS command, use xdg-open on Linux
open http://localhost:3001

The dashboard shows:

Queue depth and active workers
Throughput and dispatch latency (P50/P95/P99)
Error rates and worker pool utilization
Database connection pool breakdown
Webhook delivery metrics
Go runtime (goroutines, heap memory, GC pauses)

Prometheus scrapes Strait's `/metrics` endpoint every 5 seconds, so panels update in near real time during load tests.

Understanding Your Results

What each metric means

Metric	Good	Warning	Action
Max throughput	> 1,000/sec	< 500/sec	Check DB connection pool, query optimization
P99 latency	< 500ms	> 2s	Profile hot paths, check indexes
Error rate	< 0.01%	> 0.1%	Check logs for root cause
Memory trend	Flat over 24h	Linear growth	Check for goroutine or connection leaks
Queue depth	Returns to 0	Growing	Worker count too low or dequeue too slow
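The numeric rows of the table can be applied mechanically. The sketch below is one way to do that, assuming a value between the two thresholds is reported as neither good nor warning, since the table leaves that range undefined. `Verdict` and `classify` are hypothetical names.

```go
package main

import "fmt"

// Verdict is the outcome of comparing a metric to the table's thresholds.
type Verdict string

const (
	Good    Verdict = "good"
	Warning Verdict = "warning"
	Between Verdict = "between thresholds" // the table does not define this range
)

// classify compares value against the good and warn thresholds.
// For metrics where higher is better (max throughput) pass true;
// for metrics where lower is better (latency, error rate) pass false.
func classify(value, good, warn float64, higherIsBetter bool) Verdict {
	if !higherIsBetter {
		// Negate so the comparisons below work for both directions.
		value, good, warn = -value, -good, -warn
	}
	switch {
	case value > good:
		return Good
	case value < warn:
		return Warning
	default:
		return Between
	}
}

func main() {
	// Thresholds taken from the table above.
	fmt.Println("throughput 1200/sec:", classify(1200, 1000, 500, true))
	fmt.Println("P99 0.8s:", classify(0.8, 0.5, 2, false))
	fmt.Println("error rate 0.2%:", classify(0.2, 0.01, 0.1, false))
}
```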

Environment variables

Variable	Default	Description
`LOADTEST_STRAIT_URL`	`http://localhost:8080`	Strait API URL
`LOADTEST_INTERNAL_SECRET`	`$INTERNAL_SECRET`	API authentication secret
`LOADTEST_DATABASE_URL`	`$DATABASE_URL`	PostgreSQL connection for metrics
`LOADTEST_REDIS_URL`	`$REDIS_URL`	Redis connection for metrics
`LOADTEST_QUICK`	-	Set to `true` for 15-min quick validation
`LOADTEST_TENANTS`	`500`	Tenant count for production simulation
`LOADTEST_DURATION`	`4h`	Duration for simulation/endurance tests
`LOADTEST_TARGET_RATE`	auto	Override target rate for endurance tests

Tuning Based on Results

Bottleneck	Symptom	Fix
Queue throughput	Queue depth growing, dequeue rate flat	Increase `DB_MAX_CONNS` (default: 50), optimize dequeue query
Concurrent limit	Errors spike at N concurrent	Increase `WORKER_CONCURRENCY`
Memory growth	RSS increases linearly over 24h	Check for leaked goroutines, unclosed connections
Webhook delivery	Webhook latency spiking	Increase `WEBHOOK_CONCURRENCY`, check endpoint health
Database connections	`wait_count` increasing	Increase `DB_MAX_CONNS` (default: 50), add connection pooler
Redis memory	Used memory > maxmemory	Increase maxmemory, review eviction policy

The load test harness uses HTTP keep-alives with connection pooling for realistic measurements that match production client behavior.
