Sunday, June 28, 2026

Benchmark server performance

Load testing tools

You can measure a server's real capacity with load testing tools such as:

  • wrk (recommended)
  • hey
  • ab (ApacheBench)
  • siege


The real difference in one line each

  • wrk: “I need serious performance and control.”
  • hey: “I want a modern, simple ab replacement.”
  • ab (ApacheBench): “I just want a quick, basic sanity check.”
  • siege: “I want to simulate basic user browsing behavior.”

How to choose (decision logic)

1) Are you doing real performance testing or benchmarking APIs?

Pick wrk

Use it when:

  • You care about throughput, latency percentiles, or saturation limits
  • You want scripting (Lua) for realistic request patterns
  • You’re load testing services under high concurrency

Why:

  • Very high performance event loop model
  • Stable under heavy load (a single multi-core machine can generate very high request rates; the ceiling depends on the hardware and the target)

👉 Default choice for backend/API teams.
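
As a rough sketch of the Lua scripting mentioned above, the following writes a small script that turns every request into a JSON POST and hands it to wrk with -s. The /api/items path, the payload, and the file name post.lua are placeholders for illustration only, not part of any real service.

```shell
# Hypothetical endpoint; replace with your own.
URL="https://site/api/items"

# wrk reads these globals from the script to shape every request.
cat > post.lua <<'EOF'
wrk.method = "POST"
wrk.body   = '{"name": "example"}'
wrk.headers["Content-Type"] = "application/json"
EOF

# -t threads, -c open connections, -d duration, -s Lua script
if command -v wrk >/dev/null 2>&1; then
  wrk -t2 -c50 -d10s --latency -s post.lua "$URL"
else
  echo "wrk is not installed; skipping run" >&2
fi
```

Start with a short duration and a modest connection count, then raise -c step by step to find where latency starts to climb.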


2) Do you want something simple but modern (replacement for ab)?

Pick hey

Use it when:

  • You want a quick load test with sane defaults
  • You don’t want scripting complexity
  • You want something easy to install and run

Why:

  • Designed as a modern, cleaner alternative to ApacheBench
  • Better concurrency model than ab
  • Very easy CLI

👉 Best “quick but not outdated” tool.
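
A minimal sketch of what a hey run might look like, assuming the same placeholder target used elsewhere in this post. The request count, duration, and concurrency values are arbitrary starting points.

```shell
# Placeholder target; replace with your own endpoint.
URL="https://site/"

if command -v hey >/dev/null 2>&1; then
  # Fixed request count: 1000 requests from 50 concurrent workers.
  hey -n 1000 -c 50 "$URL"
  # Fixed duration instead: run for 30 seconds (with -z, -n is ignored).
  hey -z 30s -c 50 "$URL"
else
  echo "hey is not installed; skipping run" >&2
fi
```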


3) Do you just need a fast health / latency check?

Pick ApacheBench (ab)

Use it when:

  • You want a 10-second test of an endpoint
  • You’re debugging or validating deployment
  • You don’t care about realism

Why:

  • Widely available (ships with the Apache HTTP Server utilities, e.g. the apache2-utils package on Debian/Ubuntu or httpd-tools on RHEL/Fedora)
  • Extremely simple

Limitations:

  • Single-threaded, so the client itself can become the bottleneck
  • Not realistic under load
  • No modern metrics (percentiles are weak)

👉 Good for smoke tests only, not real benchmarking.
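
For illustration, a quick smoke test with ab could look like the sketch below. The target is a placeholder and the request and concurrency numbers are arbitrary. Note that ab expects a path in the URL, so the trailing slash matters.

```shell
# ab needs a path in the URL, so keep the trailing slash.
URL="https://site/"

if command -v ab >/dev/null 2>&1; then
  # 500 requests, 10 at a time, reusing connections (-k = HTTP keep-alive).
  ab -n 500 -c 10 -k "$URL"
else
  echo "ab is not installed; skipping run" >&2
fi
```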


4) Do you want to simulate users browsing a website?

Pick siege

Use it when:

  • You want multiple URLs hit like a user session
  • You’re testing a web app (not just APIs)
  • You want concurrency + URL lists

Why:

  • Supports URL files and sequences
  • Models “user-like” behavior better than wrk/ab

Limitations:

  • Not as fast or precise as wrk
  • Less scripting flexibility than wrk + Lua

👉 Best for simple web app / CMS testing.
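
A sketch of the URL-file approach, assuming three made-up pages on the placeholder site. The file name urls.txt and the paths are hypothetical.

```shell
# Hypothetical pages a visitor might browse; one URL per line.
cat > urls.txt <<'EOF'
https://site/
https://site/about
https://site/blog/
EOF

if command -v siege >/dev/null 2>&1; then
  # 25 simulated users for 30 seconds; -i picks URLs from the file at random.
  siege -c 25 -t30S -i -f urls.txt
else
  echo "siege is not installed; skipping run" >&2
fi
```

Without -i, siege walks the file in order, which is closer to a fixed click path than to random browsing.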


Simple decision table

| Tool | Best for | Strength | Weakness |
|---|---|---|---|
| wrk | API / serious load testing | Extremely fast + scriptable | Slight learning curve |
| hey | Quick modern load test | Simple + clean CLI | Less powerful than wrk |
| ab | Smoke test | Ubiquitous + minimal | Outdated, not realistic |
| siege | Web browsing simulation | Multi-URL user flows | Slower, less precise |

Practical recommendation (2026 reality)

If you only pick one:

👉 Choose wrk for anything beyond quick checks.

Then optionally:

  • use hey when you want speed + simplicity
  • use ab only for debugging or CI smoke tests
  • use siege when testing pages, not APIs

One mental model that helps

  • wrk = load generator for engineers
  • hey = modern ab
  • ab = legacy quick probe
  • siege = fake browser traffic



Example using wrk

$ wrk -t2 -c200 -d30s --latency https://site

Running 30s test @ https://site
  2 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   169.09ms   83.75ms   1.19s    92.01%
    Req/Sec   348.95    228.54   808.00     57.76%
  Latency Distribution
     50%  149.60ms
     75%  177.89ms
     90%  223.78ms
     99%  600.32ms
  20206 requests in 30.03s, 112.65MB read
Requests/sec:    672.88
Transfer/sec:      3.75MB
 
 

With --latency enabled, the run reports a percentile breakdown in addition to the averages, which gives a much clearer picture of your server's performance.

| Metric | Value | Interpretation |
|---|---|---|
| Average latency | 169 ms | The mean request takes about 0.17 s. |
| Median (50th) | 150 ms | Half of all requests finish within 150 ms. |
| 75th percentile | 178 ms | Three-quarters finish within 178 ms. |
| 90th percentile | 224 ms | 90% of requests finish in under a quarter second. |
| 99th percentile | 600 ms | The slowest 1% take about 0.6 s or longer. |
| Max latency | 1.19 s | The slowest request observed; a few outliers are significantly slower. |
| Throughput | 673 req/s | Overall request rate with 200 concurrent clients. |

What the latency distribution says

The percentile breakdown is often the most useful part:

  • 50%: 150 ms
  • 75%: 178 ms
  • 90%: 224 ms
  • 99%: 600 ms

This tells you that performance is fairly consistent for most users. The jump from the 90th percentile (224 ms) to the 99th (600 ms) indicates a small "long tail" of slower requests, which is common for web applications.
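
One way to put a number on that long tail is to compare the 99th percentile against the median and the 90th percentile. The sketch below feeds the four Latency Distribution lines from the run above into awk; the ratio names are just labels chosen for this example.

```shell
# Strip the "ms" suffix, index values by percentile, then compute tail ratios.
ratios=$(awk '
  { sub(/ms$/, "", $2); p[$1] = $2 }
  END { printf "p99/p50=%.2f p99/p90=%.2f", p["99%"] / p["50%"], p["99%"] / p["90%"] }
' <<'EOF'
50% 149.60ms
75% 177.89ms
90% 223.78ms
99% 600.32ms
EOF
)
echo "$ratios"
```

This prints p99/p50=4.01 p99/p90=2.68, so the slowest 1% of requests take at least roughly four times as long as the median request.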

Is it good?

For a typical dynamic web app:

  • ✅ Median around 150 ms is good.
  • ✅ 90th percentile under 250 ms is also good.
  • ⚠️ The 99th percentile at 600 ms suggests occasional delays. Depending on your application, this may or may not be worth investigating.

Throughput

With 200 concurrent connections:

  • 673 requests/sec
  • 3.75 MB/sec transferred

This is a solid result for an application that does meaningful work per request. If your site is serving mostly static assets, you'd expect much higher throughput from a tuned web server or CDN.
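
The two throughput figures can be sanity-checked from the summary line of the same run (20206 requests in 30.03s, 112.65MB read). The variable names below are made up for this sketch.

```shell
# Totals copied from the wrk summary line.
requests=20206
seconds=30.03
megabytes=112.65

rate=$(awk -v r="$requests" -v s="$seconds" 'BEGIN { printf "%.2f", r / s }')
xfer=$(awk -v m="$megabytes" -v s="$seconds" 'BEGIN { printf "%.2f", m / s }')

echo "requests/sec: $rate"
echo "transfer/sec: ${xfer}MB"
```

This gives 672.86 requests/sec and 3.75MB/sec. The small gap to wrk's reported 672.88 is presumably because wrk divides by the unrounded run duration rather than the printed 30.03s.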
