Interpreting Results

Healthy System

A system behaving correctly under load shows:

| Signal | What It Looks Like |
|---|---|
| Stable RPS | Flat line at target throughput |
| Flat latency | p95/p99 barely changes as users increase |
| Low error rate | 0% or < 0.1% throughout the test |
| Stable memory | Memory usage plateaus, no upward drift |
| Consistent CPU | CPU proportional to RPS, not spiking |

RPS:     ──────────────────── (flat at target)
p95:     ────────────────────
Errors:  ────────────────────
          0s       60s      120s

Degrading System

Early warning signs before a system fails:

| Signal | Interpretation |
|---|---|
| Latency slowly increasing | A resource is being consumed; find which one |
| Occasional p99 spikes | GC pauses, lock contention, or slow queries appearing |
| RPS variance increasing | Processing is becoming inconsistent |
| Memory creeping up | Potential memory leak; confirm with a soak test |

p99:                                          ________ (slowly growing)
                                  ___________/
                      ___________/
         ____________/
          0s       60s      120s     180s     240s

This pattern demands investigation before production load reaches these levels.
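
One way to put a number on "slowly growing" is to fit a straight line through the p99 readings and look at its slope. The sketch below assumes you have exported p99 values at regular intervals; the function name `latency_drift_ms_per_min` and the sample readings are hypothetical, and what counts as a worrying slope depends on your SLO.

```python
def latency_drift_ms_per_min(samples):
    """Return the least-squares slope of latency over time, in ms per minute.

    `samples` is a list of (elapsed_seconds, p99_ms) pairs with at least
    two distinct timestamps.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    # Ordinary least squares: slope = covariance(t, y) / variance(t)
    cov = sum((t - mean_t) * (y - mean_y) for t, y in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    return cov / var * 60  # convert ms/second to ms/minute


# Made-up p99 readings, one per minute, for illustration only.
p99_samples = [(0, 180), (60, 190), (120, 200), (180, 210), (240, 220)]
drift = latency_drift_ms_per_min(p99_samples)
print(f"p99 drift: {drift:.1f} ms/min")
```

A slope near zero matches the healthy profile; a steady positive slope under constant load is the degrading pattern described above.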


Overloaded System

A system operating beyond capacity shows:

| Signal | What Happens |
|---|---|
| Latency spikes sharply | Queue backs up, requests wait for workers |
| Error rate jumps | Timeouts, 503s, connection refused |
| Throughput plateau | Adding users doesn't add RPS; the system is saturated |
| RPS drops despite high user count | System spending time recovering, not processing |

RPS:     ___________
                    \_____________________  ← plateau then drop
p99:                       _______________
                           /               ← spike
Errors:                    ______________
                           /               ← spike
          0s       60s      120s     180s

The knee point is where this transition begins — the critical finding of any load test.
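
A rough way to locate the knee from a stepped load test is to find the first step where adding users stops adding proportional throughput. The sketch below is one heuristic, not a standard definition: the `find_knee` name, the `min_efficiency` threshold of 0.5, and the step data are all assumptions for illustration.

```python
def find_knee(steps, min_efficiency=0.5):
    """Return the user count of the last step that still scaled, or None.

    `steps` is a list of (users, rps) pairs sorted by increasing users.
    A step "scales" when its marginal RPS per added user is at least
    `min_efficiency` times the per-user RPS of the first step.
    """
    base_users, base_rps = steps[0]
    baseline = base_rps / base_users
    for (prev_users, prev_rps), (users, rps) in zip(steps, steps[1:]):
        marginal = (rps - prev_rps) / (users - prev_users)
        if marginal < min_efficiency * baseline:
            return prev_users
    return None  # throughput kept scaling across every step


# Made-up (users, RPS) measurements from a stepped test.
steps = [(50, 100), (100, 198), (150, 290), (200, 305), (250, 280)]
knee = find_knee(steps)
print(f"Knee at roughly {knee} users")
```

In this made-up data, throughput scales almost linearly up to 150 users and then flattens, so the heuristic reports 150. Cross-check any such result against the latency and error curves before treating it as the capacity limit.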


Reading the Numbers

Scenario: is 200ms p95 acceptable?

It depends on the SLO:

- If the SLO is p95 < 300ms → pass
- If the SLO is p95 < 100ms → fail

Always evaluate results against the SLO, not against intuition.
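
As a minimal sketch of that comparison, the snippet below computes p95 from raw response times with the standard library and checks it against an SLO threshold. The helper names `p95_ms` and `meets_slo` and the sample data are hypothetical; load-testing tools typically report percentiles directly, and their interpolation method may differ slightly from the one used here.

```python
from statistics import quantiles


def p95_ms(response_times_ms):
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(response_times_ms, n=100, method="inclusive")[94]


def meets_slo(response_times_ms, slo_p95_ms):
    """True when the observed p95 is strictly below the SLO threshold."""
    return p95_ms(response_times_ms) < slo_p95_ms


# Made-up response times of 1..200 ms, for illustration only.
samples = list(range(1, 201))
print(f"p95 = {p95_ms(samples):.2f} ms")
print("SLO p95 < 300ms:", "pass" if meets_slo(samples, 300) else "fail")
print("SLO p95 < 100ms:", "pass" if meets_slo(samples, 100) else "fail")
```

The same measurement passes one SLO and fails the other, which is why the threshold has to be decided before the test is run.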

Scenario: RPS is 100 but users are 200

Effective RPS = users × (1 / (response_time + wait_time))

200 users × 1 / (0.4s response + 0.6s wait) = 200 RPS possible
If actual RPS = 100 → each user cycle takes 200 / 100 = 2.0s, so latency is 2.0s − 0.6s wait = 1.4s, not 0.4s

High user count with low RPS usually signals that the system is slow, not that Locust is misconfigured.
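
The arithmetic above can be sketched as two small helpers: one for the throughput a given user count could generate, and one that inverts the formula to recover the response time implied by the observed RPS. The function names are hypothetical, and the model assumes every user waits a constant `wait_time_s` between requests.

```python
def max_rps(users, response_time_s, wait_time_s):
    """Throughput the users could generate if each request took response_time_s."""
    return users / (response_time_s + wait_time_s)


def implied_response_time_s(users, observed_rps, wait_time_s):
    """Response time implied by the observed RPS (the formula solved for it)."""
    return users / observed_rps - wait_time_s


expected = max_rps(200, 0.4, 0.6)                         # 200 RPS possible
actual_latency = implied_response_time_s(200, 100, 0.6)   # about 1.4 s
print(f"Expected RPS: {expected:.0f}, implied latency: {actual_latency:.1f}s")
```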


Result Classification Matrix

| p95 | Error Rate | Conclusion |
|---|---|---|
| Within SLO | < 0.1% | Pass: healthy |
| Within SLO | 0.1–1% | Warning: investigate errors |
| Within SLO | > 1% | Fail: reliability issue |
| Exceeds SLO | < 0.1% | Fail: latency issue |
| Exceeds SLO | > 1% | Fail: system overloaded |
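
The matrix translates directly into a small decision function, sketched below. The `classify` name and signature are hypothetical, and `error_rate` is assumed to be a fraction (0.001 for 0.1%). The matrix has no row for a p95 that exceeds the SLO with a 0.1–1% error rate, so the sketch returns `None` for that case rather than guess a verdict.

```python
def classify(p95_ms, slo_p95_ms, error_rate):
    """Map a test result onto the classification matrix.

    `error_rate` is a fraction, e.g. 0.001 for 0.1%.
    Returns None for the one combination the matrix does not list.
    """
    within_slo = p95_ms < slo_p95_ms
    if within_slo:
        if error_rate < 0.001:
            return "Pass: healthy"
        if error_rate <= 0.01:
            return "Warning: investigate errors"
        return "Fail: reliability issue"
    if error_rate < 0.001:
        return "Fail: latency issue"
    if error_rate > 0.01:
        return "Fail: system overloaded"
    return None  # exceeds SLO with 0.1-1% errors: not covered by the matrix


print(classify(p95_ms=200, slo_p95_ms=300, error_rate=0.0005))
print(classify(p95_ms=200, slo_p95_ms=100, error_rate=0.02))
```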