
Performance Testing and Observability in Tests

Performance Testing

Metrics

| Metric | Definition | Typical SLO |
| --- | --- | --- |
| Latency (p50) | Median response time | < 100 ms |
| Latency (p95) | 95th percentile | < 300 ms |
| Latency (p99) | 99th percentile | < 1 s |
| Throughput | Requests per second | > 500 rps |
| Error rate | % of failed requests | < 0.1 % |
| Saturation | CPU/memory under load | < 80 % |
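
The percentile metrics above can be computed directly from raw latency samples. A minimal sketch using only the standard library (the helper name is illustrative):

```python
import statistics


def latency_percentiles(samples_ms):
    """Return (p50, p95, p99) from a list of latency samples in milliseconds."""
    # quantiles(n=100) yields 99 cut points; index k-1 is the k-th percentile
    q = statistics.quantiles(samples_ms, n=100)
    return q[49], q[94], q[98]


p50, p95, p99 = latency_percentiles([12, 15, 18, 20, 22, 25, 30, 45, 80, 250])
```

Small samples give rough tail estimates; collect at least on the order of 100 requests before trusting a p99 number.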

Load Simulation

Tools: Locust, k6, Artillery.

Locust example:

from locust import HttpUser, task, between

class ApiUser(HttpUser):
    wait_time = between(0.5, 2.0)
    host = "http://localhost:8000"

    @task(3)
    def get_users(self):
        with self.client.get("/users", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Unexpected status: {response.status_code}")

    @task(1)
    def create_user(self):
        payload = {"email": f"user-{id(self)}@example.com", "username": f"u{id(self)}"}
        self.client.post("/users", json=payload)

Run (100 users, spawn rate 10 users/s, 60 s total): locust -f locustfile.py --headless -u 100 -r 10 --run-time 60s

Latency Assertions in pytest

Lightweight SLO checks as part of the API test suite:

import time
import httpx


def test_get_users_latency_slo(api_client: httpx.Client):
    start = time.perf_counter()
    response = api_client.get("/users")
    elapsed = time.perf_counter() - start

    assert response.status_code == 200
    assert elapsed < 0.3, f"GET /users took {elapsed:.3f}s, SLO is 300ms"

Stress Testing

Identify the breaking point by increasing load in steps until the error rate exceeds the threshold.

| Phase | Target RPS | Expected |
| --- | --- | --- |
| Baseline | 10 | < 50 ms p95 |
| Normal load | 100 | < 100 ms p95 |
| Peak load | 500 | < 300 ms p95 |
| Stress | 1000 | Graceful degradation |
| Spike | 2000 for 30 s | No data loss, recovery |
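
The phase plan above can be encoded as data and driven programmatically, e.g. from a Locust `LoadTestShape`. A sketch with assumed phase durations (only the RPS targets come from the table; the durations are illustrative):

```python
# (elapsed-seconds upper bound, target RPS); durations are assumed, not part of the plan
PHASES = [
    (60, 10),     # baseline
    (180, 100),   # normal load
    (300, 500),   # peak load
    (420, 1000),  # stress
    (450, 2000),  # spike: 30 s burst
]


def target_rps(elapsed):
    """Return the target RPS for the current phase, or None when the run is over."""
    for end, rps in PHASES:
        if elapsed < end:
            return rps
    return None
```

Returning None mirrors how a Locust shape signals the end of the run; the same table doubles as documentation of the test plan.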

Observability in Tests

Logging

Every test action should produce structured log output for debugging failures.

import logging
import httpx

logger = logging.getLogger(__name__)


def test_create_user(api_client: httpx.Client):
    payload = UserBuilder().as_dict()  # UserBuilder: test-data builder assumed defined elsewhere in the suite
    logger.info("Creating user: email=%s", payload["email"])

    response = api_client.post("/users", json=payload)
    logger.debug("Response: status=%d body=%s", response.status_code, response.text[:300])

    assert response.status_code == 201

Configure pytest logging:

[pytest]
log_cli = true
log_cli_level = DEBUG
log_format = %(asctime)s %(levelname)-8s %(name)s %(message)s

Metrics

Track test suite health metrics in CI:

| Metric | Source |
| --- | --- |
| Pass rate per suite | pytest-json-report |
| Flaky test rate | Retry count / total runs |
| Avg test duration | pytest-durations |
| Coverage % | pytest-cov |
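
Pass rate, for example, can be derived from the pytest-json-report output file. A sketch assuming the plugin's `summary` block contains `passed` and `total` counts (verify against your report's actual layout):

```python
import json


def pass_rate(report_path):
    """Compute the pass rate from a pytest-json-report file (default: .report.json)."""
    with open(report_path) as f:
        summary = json.load(f).get("summary", {})
    total = summary.get("total", 0)
    return summary.get("passed", 0) / total if total else 0.0
```

Emitting this as a single number per CI run makes pass-rate trends trivial to chart.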

Tracing: Request Correlation

Inject an X-Request-ID header into every test request for correlation with service logs:

import uuid
import httpx


def make_traced_request(client: httpx.Client, method: str, path: str, **kwargs) -> httpx.Response:
    request_id = str(uuid.uuid4())
    headers = kwargs.pop("headers", {})
    headers["X-Request-ID"] = request_id
    response = client.request(method, path, headers=headers, **kwargs)
    return response

In CI, correlate test failures with service traces using the X-Request-ID.
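
Instead of calling a wrapper at every call site, the header can be injected by a client-level request hook. A sketch assuming the HTTP client runs request hooks before sending and lets them mutate headers (httpx's `event_hooks` is one candidate; if your client forbids mutation in hooks, wire the same function through a custom auth or transport instead):

```python
import uuid


def add_request_id(request):
    """Request hook: stamp the outgoing request with a fresh X-Request-ID."""
    request.headers["X-Request-ID"] = str(uuid.uuid4())


# usage sketch (assumes httpx):
# client = httpx.Client(event_hooks={"request": [add_request_id]})
```

The hook approach guarantees no test request slips through untagged, at the cost of the ID no longer being visible at the call site; log it in the hook if you need it in test output.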


Performance Observability Checklist

| Check | Tool |
| --- | --- |
| SLO latency assertions in CI | pytest + time.perf_counter |
| Load test baseline on main | Locust / k6 in CI pipeline |
| Trace IDs in test logs | Custom request wrapper |
| DB query count assertion | SQLAlchemy event listeners |
| Memory leak detection | memray profiler in CI |
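
The DB query count check can be sketched framework-agnostically: a counter that an ORM hook increments (e.g. a SQLAlchemy `after_cursor_execute` listener calling `record`), plus a context manager that asserts a budget. All names here are illustrative:

```python
from contextlib import contextmanager


class QueryCounter:
    """Counts executed queries; wire an ORM execute hook to record()."""

    def __init__(self):
        self.count = 0

    def record(self, *args, **kwargs):
        self.count += 1


@contextmanager
def assert_max_queries(counter, limit):
    """Fail if more than `limit` queries run inside the with-block."""
    before = counter.count
    yield
    used = counter.count - before
    assert used <= limit, f"{used} queries executed, budget was {limit}"
```

A budget assertion like this catches N+1 query regressions in the functional suite, long before they show up as p95 latency under load.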