Testing
15.1 API Testing
Test every protocol at the HTTP/contract level. Each protocol has specific tools and scenarios.
REST and GraphQL
Use Playwright APIRequestContext for HTTP-level tests. It runs without a browser and is fast.
Validate response shape with Pydantic models — not just status codes.
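```python
# `request` is a Playwright APIRequestContext (async API).
response = await request.get("/api/v1/products/42")
# APIResponse.json() is a coroutine in the async API, so it must be awaited.
product = ProductSchema.model_validate(await response.json())
assert product.id == 42
```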
Required test scenarios:
| Scenario | Expected status |
|---|---|
| Valid request | 200 |
| Missing required field | 400 or 422 |
| No auth token | 401 |
| Valid token, wrong role | 403 |
| Resource does not exist | 404 |
| Too many requests | 429 |
Test all six scenarios for every critical endpoint.
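One way to keep the six scenarios from drifting apart across endpoints is to drive them from a single table. The sketch below is transport-agnostic and only illustrative: `send` is a hypothetical adapter around whatever HTTP client the suite uses (for example a Playwright `APIRequestContext`), and the scenario names and the `check_endpoint` helper are assumptions, not part of any library.

```python
from typing import Callable, Dict, List, Set

# Expected status codes per scenario, mirroring the table above.
# The scenario names are illustrative labels, not a library convention.
EXPECTED: Dict[str, Set[int]] = {
    "valid_request": {200},
    "missing_required_field": {400, 422},
    "no_auth_token": {401},
    "wrong_role": {403},
    "resource_missing": {404},
    "too_many_requests": {429},
}


def check_endpoint(send: Callable[[str], int]) -> List[str]:
    """Run every scenario through `send` and return a list of failure messages.

    `send` is a hypothetical adapter: it takes a scenario name, issues the
    matching request against the endpoint under test, and returns the HTTP
    status code.
    """
    failures = []
    for scenario, allowed in EXPECTED.items():
        status = send(scenario)
        if status not in allowed:
            failures.append(
                f"{scenario}: got {status}, expected one of {sorted(allowed)}"
            )
    return failures
```

A test then asserts `check_endpoint(send) == []` once per critical endpoint, so a missing scenario shows up as a named failure rather than a silent gap.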
gRPC
Use the `grpcio` client with pytest. Generate the client stubs from the `.proto` files (for example with `grpcio-tools`).
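```python
from google.protobuf.json_format import MessageToDict

# ProductServiceStub and GetProductRequest come from the generated stubs.
stub = ProductServiceStub(channel)
response = stub.GetProduct(GetProductRequest(id=42))
# preserving_proto_field_name keeps snake_case keys for the Pydantic model.
product = ProductModel.model_validate(MessageToDict(response, preserving_proto_field_name=True))
assert product.id == 42
```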
Required test scenarios:
- Unary call: single request, single response.
- Server streaming: receive multiple messages and validate each.
- Error status codes: `NOT_FOUND`, `INVALID_ARGUMENT`, `PERMISSION_DENIED`.
- Deadline exceeded: set a short deadline and verify the client handles it.
WebSocket
Two tools cover both contexts:
- Playwright `page.on("websocket", ...)`: for browser-initiated connections; captures frames.
- `websockets` Python library: for server-to-server or pure backend tests.
```python
import asyncio
import json
import websockets

async with websockets.connect("ws://localhost:8000/ws") as ws:
    await ws.send(json.dumps({"type": "subscribe", "channel": "prices"}))
    msg = json.loads(await asyncio.wait_for(ws.recv(), timeout=2.0))
    assert msg["type"] == "price_update"
```
Required test scenarios:
- Connect/disconnect — verify clean state on both sides.
- Message validation — reject messages with invalid schema.
- Reconnect logic — server drops connection; client reconnects.
- ACK flow — server sends ACK for every received message.
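For the reconnect scenario, the client needs some retry loop that the test can exercise after the server drops the connection. A minimal sketch of such a loop follows; `connect_with_retry` is a hypothetical helper, `connect` stands for any coroutine factory that opens the connection, and the sketch assumes connection failures surface as `OSError` subclasses (adapt the exception types to the client in use).

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def connect_with_retry(
    connect: Callable[[], Awaitable[T]],
    attempts: int = 5,
    base_delay: float = 0.1,
) -> T:
    """Call `connect` until it succeeds, doubling the delay after each failure.

    Re-raises the last OSError if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return await connect()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

In a test, a fake `connect` that fails a fixed number of times before succeeding makes the retry count and the final outcome easy to assert without a real server.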
15.2 Integration Testing
Test the full request path through multiple services. Do not mock the database or downstream services — use real ones.
Setup:
- Spin up dependent services with Docker Compose.
- Use testcontainers-python for databases — each test run gets a clean DB.
- Seed data in fixtures and clean up after each test.
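The seed-and-clean-up step can be sketched as a context manager. This is only an illustration of the pattern: in a real suite the connection would come from a testcontainers-managed database and the helper would usually be a pytest fixture. `sqlite3` stands in here solely so the sketch runs without Docker, and the `products` table and `seeded_db` name are hypothetical.

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def seeded_db(conn: sqlite3.Connection) -> Iterator[sqlite3.Connection]:
    """Seed known rows before a test and remove them afterwards."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products (id INTEGER PRIMARY KEY, stock INTEGER)"
    )
    conn.execute("INSERT INTO products (id, stock) VALUES (42, 10)")
    conn.commit()
    try:
        yield conn
    finally:
        # Clean up even if the test body raised, so the next test starts clean.
        conn.execute("DELETE FROM products")
        conn.commit()
```

Putting the cleanup in `finally` matters: a failing assertion inside the test must not leave seeded rows behind for the next test.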
What to test:
- The full path: `client → API gateway → service → DB`.
- Side effects: data stored, events emitted, downstream services called.
Common integration scenarios:
- User registers → profile record created → welcome email queued
- Order placed → inventory decremented → order confirmation event published
- Auth token revoked → next request with that token returns 401
Key rule: an integration test that passes with mocks but fails with the real service is worthless. Run against real dependencies.
15.3 Load Testing
Test system behavior under high concurrency and sustained traffic. Run before every major release.
Tools:
| Tool | Best for |
|---|---|
| k6 | Scriptable JS, CI-friendly, rich metrics |
| Locust | Python, flexible scenario scripting |
| wrk | Raw HTTP throughput benchmarking |
Required scenarios:
- Ramp-up test — gradually increase virtual users until error rate rises or latency spikes. Find the breaking point.
- Soak test — constant moderate load for 30–60 minutes. Detect memory leaks, connection pool exhaustion, slow DB query accumulation.
- Spike test — sudden burst to 10× normal load. Verify autoscaling activates and error rate stays within SLO.
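For the ramp-up test, "find the breaking point" can be made mechanical once each stage's results are recorded. The sketch below assumes a hypothetical `StageResult` record per load stage. The default error-rate limit reuses the 0.1% target from this section, while the 500 ms p95 limit is purely an assumed placeholder to replace with the service's own SLO.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StageResult:
    users: int          # virtual users during this stage
    error_rate: float   # fraction of failed requests, e.g. 0.001 for 0.1%
    p95_ms: float       # 95th-percentile latency in milliseconds


def find_breaking_point(
    stages: List[StageResult],
    max_error_rate: float = 0.001,  # 0.1%, taken from the error-rate target
    max_p95_ms: float = 500.0,      # assumed placeholder, not from the source
) -> Optional[int]:
    """Return the user count of the first stage that violates a threshold.

    Returns None if every stage stayed within both limits.
    """
    for stage in sorted(stages, key=lambda s: s.users):
        if stage.error_rate > max_error_rate or stage.p95_ms > max_p95_ms:
            return stage.users
    return None
```

The returned user count is the lowest load level at which the system left its limits, which is the figure worth tracking from release to release.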
What to measure:
- `p50` / `p95` / `p99` latency at each load level: p99 exposes the tail latency that the slowest 1% of requests experience.
- Error rate: target < 0.1% under normal load.
- Throughput (RPS): find the maximum sustainable RPS.
- Resource usage: CPU, memory, DB connection pool saturation.
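When the load tool only exports raw samples, the latency percentiles and error rate can be computed directly. The sketch below uses the nearest-rank method, which is an assumption: k6, Locust, and other tools may interpolate and report slightly different values. `latencies_ms` is assumed to hold one entry per request (failed ones included), and both function names are illustrative.

```python
import math
from typing import Dict, List


def percentile(sorted_values: List[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_ms: List[float], errors: int) -> Dict[str, float]:
    """Summarize one load level: latency percentiles and error rate.

    `latencies_ms` has one entry per request; `errors` counts failed requests.
    """
    ordered = sorted(latencies_ms)
    return {
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "error_rate": errors / len(ordered),
    }
```

Computing one summary per load level makes it visible where p99 starts to pull away from p50, which usually happens before the error rate moves.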
Store results per release. Compare with previous baseline to detect regressions.
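The baseline comparison can be automated with a small check. This is a sketch under stated assumptions: results are stored as flat metric dictionaries, every compared metric is "lower is better" (latency, error rate), and the 10% tolerance is an arbitrary starting point. Throughput would need the opposite comparison.

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.10,  # assumed 10% allowance for run-to-run noise
) -> List[str]:
    """List metrics that got worse than the baseline by more than `tolerance`.

    Assumes every metric is "lower is better". Metrics missing from the
    current run are skipped.
    """
    regressions = []
    for name, old in baseline.items():
        new = current.get(name)
        if new is None:
            continue
        if new > old * (1 + tolerance):
            regressions.append(f"{name}: {old} -> {new}")
    return regressions
```

Failing the release pipeline when this list is non-empty turns the stored baselines into an enforced gate instead of a report nobody reads.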
15.4 Chaos Testing
Inject failures into a running system to verify resilience. The goal is to find weaknesses before production incidents do.
Scenarios to test:
| Failure | Expected behavior |
|---|---|
| Kill one service instance | Traffic routes to healthy instances |
| Add 200 ms latency to DB | Circuit breaker activates, fallback served |
| Return 500 from external service | Retry + error handling works |
| Drop network packets between services | Timeout triggers, not a hang |
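The "timeout triggers, not a hang" behavior in the last row can be sketched with `asyncio.wait_for`. The helper name `call_with_fallback` and the fallback value are hypothetical; the point is that a stalled dependency yields a bounded wait and a fallback rather than a blocked request.

```python
import asyncio
from typing import Awaitable, Callable


async def call_with_fallback(
    call: Callable[[], Awaitable[str]],
    timeout_s: float,
    fallback: str,
) -> str:
    """Await `call`, but give up after `timeout_s` seconds and return `fallback`."""
    try:
        return await asyncio.wait_for(call(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # The dependency stalled: degrade gracefully instead of hanging.
        return fallback
```

A chaos experiment that drops packets should show this path being taken: the response time stays near `timeout_s` and the caller receives the fallback.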
Tools:
- Chaos Monkey — random instance termination (AWS).
- Gremlin — SaaS platform, fine-grained failure injection.
- Chaos Toolkit — open-source, scriptable experiments.
- Litmus — Kubernetes-native chaos experiments.
Validate after each experiment:
- Error rate stays below SLO during the failure.
- Circuit breaker opens at the correct threshold.
- Graceful degradation returns correct fallback data.
- Metrics and alerts fire as expected.
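To make "opens at the correct threshold" concrete, here is a deliberately minimal circuit breaker that a test can drive with injected failures. It is an illustrative sketch, not any particular library's implementation: it counts consecutive failures only and has no half-open or time-based recovery, which production breakers normally add.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is not attempted."""


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; a success resets the count."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.threshold

    def call(self, func: Callable[[], T]) -> T:
        if self.is_open:
            # Fail fast without touching the unhealthy dependency.
            raise CircuitOpenError("circuit is open")
        try:
            result = func()
        except Exception:
            self.failures += 1
            raise
        self.failures = 0
        return result
```

The validation then reduces to two assertions: the breaker is still closed after `threshold - 1` failures, and it rejects calls immediately after the `threshold`-th.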
Where to run:
- Staging: run full destructive experiments freely.
- Production: run controlled, small-blast-radius experiments during low-traffic windows. Always have a rollback plan.