Documentation Index
Fetch the complete documentation index at: https://docs.strait.dev/llms.txt
Use this file to discover all available pages before exploring further.
Reference Benchmarks
These benchmarks are from our internal load tests. Your results will vary based on hardware, network, and workload mix. Run the load testing suite on your own infrastructure for accurate numbers.
Local Development Benchmarks
Tested on a MacBook Pro (M-series, shared CPU), Strait running with DB_MAX_CONNS=50, WORKER_CONCURRENCY=25, PostgreSQL and Redis co-located.
| Test | Result | Bottleneck |
|---|---|---|
| Throughput Ceiling | 70 jobs/sec sustained, breaks at 80 | Queue depth > 10K |
| Concurrency Ceiling | 350 concurrent, breaks at 400+ | Test context timeout |
| Quick Validation | 80 jobs/sec sustained | P99 latency > 5s at 90/sec |
| Total Operations (Throughput) | 21,594 with 0 errors | - |
| Total Operations (Concurrency) | 350,863 with 0 errors | - |
These numbers represent a floor, not a ceiling: production deployments on dedicated hardware with DB_MAX_CONNS=100 should achieve significantly higher throughput.
Hardware Sizing Guide
| Customers | Runs/Day | Recommended Hardware | PostgreSQL | Redis |
|---|---|---|---|---|
| 1-100 | < 50K | 1 vCPU, 1GB RAM | Shared instance | Shared instance |
| 100-500 | 50-500K | 2 vCPU, 4GB RAM | Dedicated, 2 vCPU | Dedicated, 1GB |
| 500-2,000 | 500K-5M | 4 vCPU, 8GB RAM | Dedicated + read replica | Dedicated, 2GB |
| 2,000+ | 5M+ | Horizontal workers | Dedicated + read replicas | Cluster mode |
PostgreSQL Tuning
Connection Pool
The connection pool is the most common bottleneck. Strait uses pgx/v5 for connection pooling.
| Setting | Default | Recommended (< 500 tenants) | Recommended (> 500 tenants) |
|---|---|---|---|
| DB_MAX_CONNS | 50 | 50 | 50-100 |
| DB_MIN_CONNS | 10 | 10 | 10-25 |
| DB_MAX_CONN_LIFETIME | 30m | 30m | 15m |
| DB_MAX_CONN_IDLE_TIME | 5m | 5m | 2m |
| DB_STATEMENT_TIMEOUT | 30s | 30s | 15s |
Connection Budget
When running multiple Fly machines, the total database connections equals DB_MAX_CONNS * number_of_machines. Ensure your PostgreSQL max_connections exceeds this sum.
For example, with DB_MAX_CONNS=100 and 4 machines, you need at least 400 connections on the primary. PlanetScale PostgreSQL supports 1000 connections per primary and per replica by default.
PostgreSQL Server
```conf
# postgresql.conf recommendations for Strait workloads
shared_buffers = 256MB                # 25% of available RAM
effective_cache_size = 768MB          # 75% of available RAM
work_mem = 16MB                       # Per-operation memory
maintenance_work_mem = 128MB          # For VACUUM, CREATE INDEX

# WAL settings
wal_buffers = 16MB
checkpoint_completion_target = 0.9
max_wal_size = 2GB

# Connection limits
max_connections = 200                 # Must exceed DB_MAX_CONNS * worker_count
```
Redis Tuning
| Setting | Default | Recommended |
|---|---|---|
| maxmemory | No limit | 512MB-2GB |
| maxmemory-policy | noeviction | noeviction (Strait manages TTLs) |
| timeout | 0 | 300 seconds |
| tcp-keepalive | 300 | 60 |
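In redis.conf form, the recommendations above look like the following sketch (pick a maxmemory value appropriate to your tier):

```conf
# redis.conf recommendations for Strait
maxmemory 1gb               # pick a value in the 512MB-2GB range
maxmemory-policy noeviction # Strait manages TTLs itself; never evict
timeout 300                 # close idle client connections after 300s
tcp-keepalive 60            # detect dead peers faster than the 300s default
```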
Worker Configuration
| Setting | Default | Effect |
|---|---|---|
| WORKER_CONCURRENCY | 25 | Starting parallel job executions per worker (auto-scales to ADAPTIVE_CONCURRENCY_MAX) |
| MAX_DEQUEUE_BATCH_SIZE | 10 | Jobs claimed per dequeue cycle |
| DEQUEUE_STRATEGY | priority | priority or fifo |
| DEFAULT_JOB_TIMEOUT_SECS | 300 | Default job timeout |
| DEFAULT_JOB_MAX_ATTEMPTS | 3 | Default retry count |
Scaling Workers
For horizontal scaling, run multiple worker processes:
```shell
# Worker 1
strait --mode worker

# Worker 2
strait --mode worker

# API (separate)
strait --mode api
```
Each worker independently dequeues from PostgreSQL using SELECT ... FOR UPDATE SKIP LOCKED, so they naturally load-balance.
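The dequeue pattern can be sketched in SQL. The table and column names below are hypothetical, not Strait's actual schema; what matters is the locking clause:

```sql
-- Claim up to MAX_DEQUEUE_BATCH_SIZE jobs without blocking other workers.
-- FOR UPDATE locks the claimed rows; SKIP LOCKED makes competing workers
-- skip rows another transaction already holds instead of waiting on them.
SELECT id, payload
FROM jobs
WHERE status = 'queued'
ORDER BY priority DESC, created_at
LIMIT 10
FOR UPDATE SKIP LOCKED;
```

Because no worker ever waits on another's locks, adding workers increases throughput roughly linearly until the database itself becomes the bottleneck.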
Cost Estimation
Compute Cost per 1M Runs
| Component | HTTP-Mode Jobs | Managed (Docker) Jobs |
|---|---|---|
| Strait worker CPU | ~0.5 vCPU-hours | ~2 vCPU-hours |
| PostgreSQL IOPS | ~50K reads, ~20K writes | ~80K reads, ~30K writes |
| Redis operations | ~100K commands | ~200K commands |
| Network egress | ~1GB | ~5GB |
Monitoring Key Metrics
Track these metrics to predict when you need to scale:
- Queue depth - If consistently > 0, add workers
- DB connection wait count - If increasing, raise DB_MAX_CONNS
- Worker CPU utilization - If > 70%, add worker instances
- P99 latency trend - If increasing over days, investigate query performance
- Memory RSS - If growing linearly, check for leaks
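If you scrape these into Prometheus (the load test environment below ships with Prometheus and Grafana), the checks above become alertable queries. The metric names here are hypothetical; substitute whatever Strait's metrics endpoint actually exports:

```promql
# Queue depth averaged over 5m -- alert if consistently above 0.
avg_over_time(strait_queue_depth[5m])

# P99 job latency over the last day -- watch the trend across days.
histogram_quantile(0.99, sum(rate(strait_job_duration_seconds_bucket[1d])) by (le))
```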
Fly.io Deployment Sizing
For Fly.io deployments:
| Scale | Machine Size | Count | Region |
|---|---|---|---|
| Starter | shared-cpu-2x (1024MB) | 1 combined | Single |
| Growth | shared-cpu-2x (2048MB) | 1 API + 1 worker | Single |
| Scale | performance-2x (4GB) | 1 API + 2 workers | Multi-region |
| Enterprise | performance-4x (8GB) | 2 API + 4 workers | Multi-region |
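A minimal fly.toml sketch for the "Growth" row, assuming Strait's `--mode` flag selects the process role. The app name is a placeholder; the section layout follows Fly.io's standard schema:

```toml
app = "strait"

[processes]
  api    = "strait --mode api"
  worker = "strait --mode worker"

[[vm]]
  size      = "shared-cpu-2x"
  memory    = "2048mb"
  processes = ["api", "worker"]
```

Splitting API and worker into separate process groups lets you scale worker count independently of the API as queue depth grows.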
Running Your Own Benchmarks
Use the included load testing framework with Grafana dashboards for real-time visualization:
```shell
# Start the load test environment (Postgres, Redis, Prometheus, Grafana)
cd apps/strait
docker compose -f docker-compose.loadtest.yml up -d

# Start Strait
DATABASE_URL="postgres://strait:strait@localhost:5432/strait?sslmode=disable" \
REDIS_URL="redis://localhost:6379" go run ./cmd/strait

# Run the quick validation
LOADTEST_QUICK=true go test -tags=loadtest -run TestQuickValidation \
  -timeout 15m ./internal/loadtest/...

# View results in Grafana
open http://localhost:3001
```