gRPC: Testing Plan and Risks
3.7 Testing Plan
Contract Tests
- Backward compatibility: add a new field to the .proto file and verify that an old client still works correctly
- Forward compatibility: old server handles messages with new unknown fields gracefully
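- Breaking changes detection: removed field, changed type, reused field number; must fail a breaking-change check in CI (e.g., buf breaking), because protoc compiles each schema version on its own and does not compare it with the previous one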
- Reserved fields: using a reserved field number — protoc reports compilation error
- Proto linting: use buf lint or protolint to enforce style and compatibility rules
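
The breaking-change checks above can be sketched with a toy comparison of two schema versions. This is only an illustration: the `Schema` model (field number mapped to name and type) and the function `find_breaking_changes` are hypothetical, and real tools compare compiled descriptors rather than dictionaries.

```python
from typing import Dict, List, Tuple

# Hypothetical simplified schema model: field number -> (field name, type name).
# Real tooling works on compiled descriptors; this only illustrates the checks.
Schema = Dict[int, Tuple[str, str]]


def find_breaking_changes(old: Schema, new: Schema) -> List[str]:
    """Return human-readable descriptions of incompatible changes."""
    problems = []
    for number, (old_name, old_type) in sorted(old.items()):
        if number not in new:
            problems.append(f"field {number} ({old_name}) was removed")
            continue
        new_name, new_type = new[number]
        if new_type != old_type:
            problems.append(
                f"field {number} ({old_name}) changed type {old_type} -> {new_type}"
            )
        elif new_name != old_name:
            # A rename keeps the binary encoding intact, but it is flagged here
            # because it may mean the number was reused for a different purpose.
            problems.append(f"field {number} was reused: {old_name} -> {new_name}")
    return problems


OLD = {1: ("id", "int64"), 2: ("name", "string"), 3: ("email", "string")}
# Adding field 4 is compatible; the other three edits are not.
NEW = {1: ("id", "string"), 2: ("nickname", "string"), 4: ("phone", "string")}

if __name__ == "__main__":
    for problem in find_breaking_changes(OLD, NEW):
        print(problem)
```

A contract test in this style would fail the build whenever the returned list is non-empty, while purely additive changes pass.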
Serialization Tests
- Encode/decode roundtrip: message -> bytes -> message = identical values
- Default values: unset fields return default values (0 for int, "" for string, false for bool)
- Unknown fields: preserved during deserialization, forwarded correctly
- Large payloads: messages near max size limit (default 4 MB) — handled without error
- Edge cases: empty strings, zero values, maximum int values, unicode text
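
To make the roundtrip and unknown-field cases concrete, here is a minimal hand-written sketch of the protobuf varint wire format. It assumes messages that contain only varint fields (wire type 0); the helper names are hypothetical, and real tests would use generated message classes instead.

```python
from typing import Dict, Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_fields(fields: Dict[int, int], unknown: bytes = b"") -> bytes:
    """Encode varint fields (wire type 0), then append preserved unknown bytes."""
    out = bytearray()
    for number, value in sorted(fields.items()):
        out += encode_varint(number << 3 | 0)  # tag = field number and wire type
        out += encode_varint(value)
    return bytes(out) + unknown


def decode_fields(data: bytes, known: set) -> Tuple[Dict[int, int], bytes]:
    """Decode varint fields; keep the bytes of fields outside `known` untouched."""
    fields, unknown, pos = {}, bytearray(), 0
    while pos < len(data):
        start = pos
        tag, pos = decode_varint(data, pos)
        value, pos = decode_varint(data, pos)
        number = tag >> 3
        if number in known:
            fields[number] = value
        else:
            unknown += data[start:pos]
    return fields, bytes(unknown)


if __name__ == "__main__":
    new_message = encode_fields({1: 150, 2: 7})  # written by a newer schema
    fields, unknown = decode_fields(new_message, known={1})  # read by an older one
    print(fields, unknown.hex())
    print(encode_fields(fields, unknown) == new_message)  # forwarded unchanged
```

The older reader sees only field 1, yet re-encoding with the preserved bytes reproduces the original message, which is the property the unknown-fields test should assert. An unset field is simply absent from the decoded mapping, so the reader falls back to the default value.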
RPC Behavior Tests
- Unary: send request, receive correct response, verify all fields
- Server streaming: send request, receive correct sequence of messages, verify proper stream completion
- Client streaming: send multiple messages, receive correct aggregated response
- Bidirectional: messages flow in both directions, verify correct ordering
- Cancellation: client cancels mid-stream — server receives cancellation signal, stops processing
- Deadline exceeded: slow server response; client gets DEADLINE_EXCEEDED status
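
The deadline and cancellation cases can be prototyped without a gRPC runtime. The sketch below uses asyncio as a stand-in: `slow_handler` and `call_with_deadline` are hypothetical names, and the status is represented as a plain string rather than a real gRPC status object.

```python
import asyncio


async def slow_handler(delay: float, events: list) -> str:
    """Stand-in for a server handler that records whether it was cancelled."""
    try:
        await asyncio.sleep(delay)
        return "OK"
    except asyncio.CancelledError:
        events.append("cancelled")  # the handler would release resources here
        raise


async def call_with_deadline(delay: float, deadline: float, events: list) -> str:
    """Stand-in for a client call: map a timeout to a status name."""
    try:
        return await asyncio.wait_for(slow_handler(delay, events), timeout=deadline)
    except asyncio.TimeoutError:
        return "DEADLINE_EXCEEDED"


if __name__ == "__main__":
    events: list = []
    print(asyncio.run(call_with_deadline(0.5, 0.05, events)), events)
```

The test shape carries over to a real service: assert on the status the client observes, and separately assert that the server side noticed the cancellation and stopped working.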
Error Handling Tests
- Each status code returned for the correct failure scenario
- Rich error details accessible from client (BadRequest, RetryInfo fields)
- Metadata present in error responses (request-id, trace headers)
- Invalid request data returns INVALID_ARGUMENT with helpful message
- Missing auth token returns UNAUTHENTICATED
Performance Tests
- Latency: unary call p50 / p95 / p99 under normal load
- Throughput: maximum RPCs per second before degradation
- Streaming throughput: messages per second in server and bidirectional streams
- Payload size impact: latency increase as message size grows
- Connection scaling: performance with 1, 10, 100 concurrent connections
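
As a rough sketch of how the latency percentiles might be computed from raw samples, the following uses the nearest-rank method. The function names are hypothetical, and `measure_latencies_ms` times an arbitrary callable where a real benchmark would issue RPCs.

```python
import time
from typing import Callable, List, Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the value at rank ceil(p * n / 100)."""
    if not samples:
        raise ValueError("no samples")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def measure_latencies_ms(call: Callable[[], object], runs: int) -> List[float]:
    """Time `call` repeatedly and return per-call latency in milliseconds."""
    latencies = []
    for _ in range(runs):
        started = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - started) * 1000)
    return latencies


def summarize(samples: Sequence[float]) -> dict:
    """Report the three percentiles listed in the plan."""
    return {f"p{p}": percentile(samples, p) for p in (50, 95, 99)}


if __name__ == "__main__":
    print(summarize(measure_latencies_ms(lambda: sum(range(1000)), 200)))
```

Reporting percentiles instead of the mean keeps tail latency visible; a handful of slow calls shifts p99 long before it moves the average.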
Connection Tests
- Keepalive: connection stays alive during idle periods with ping/pong
- Reconnect: client reconnects automatically after connection drop
- Timeout: configurable deadline per RPC — works correctly
- Connection reuse: multiple RPCs share one TCP connection efficiently
- Graceful shutdown: server drains active RPCs before stopping
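
The graceful-shutdown expectation can be illustrated with a small asyncio model. `DrainingServer` is a hypothetical stand-in, not a gRPC class: it refuses new calls once shutdown starts and waits up to a grace period for active calls to finish.

```python
import asyncio


class DrainingServer:
    """Hypothetical stand-in for a server that drains in-flight calls on stop."""

    def __init__(self) -> None:
        self._accepting = True
        self._in_flight = 0
        self._idle = asyncio.Event()
        self._idle.set()

    async def handle(self, work_seconds: float) -> str:
        if not self._accepting:
            return "UNAVAILABLE"  # new calls are refused once shutdown begins
        self._in_flight += 1
        self._idle.clear()
        try:
            await asyncio.sleep(work_seconds)  # simulated request processing
            return "OK"
        finally:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._idle.set()

    async def stop(self, grace_seconds: float) -> bool:
        """Stop accepting calls; return True if active calls finished in time."""
        self._accepting = False
        try:
            await asyncio.wait_for(self._idle.wait(), timeout=grace_seconds)
            return True
        except asyncio.TimeoutError:
            return False


async def demo() -> tuple:
    server = DrainingServer()
    active = asyncio.create_task(server.handle(0.05))
    await asyncio.sleep(0)  # let the call start before shutdown begins
    stopper = asyncio.create_task(server.stop(grace_seconds=1.0))
    await asyncio.sleep(0)  # let shutdown begin before the late call arrives
    late = await server.handle(0.01)
    return await active, late, await stopper


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The three values returned by `demo` correspond to the assertions a real test would make: the in-flight call completes, the late call is rejected, and shutdown finishes within the grace period.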
Security Tests
- mTLS: mutual TLS authentication between client and server
- Auth metadata: token in metadata, verified by server interceptor
- Unauthorized request: missing or invalid token returns UNAUTHENTICATED
- Certificate rotation: new certificates accepted without restart
- Channel encryption: all data encrypted in transit (no plaintext)
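
The auth-metadata cases might be exercised against logic like the following. It is a simplified sketch in which a decorator plays the role of a server interceptor; `require_token`, `say_hello`, the `authorization` metadata key, and the fixture token are all assumptions made for illustration.

```python
import hmac
from typing import Callable, Dict, Tuple

Response = Tuple[str, str]  # (status name, payload or error message)
Handler = Callable[[str, Dict[str, str]], Response]

EXPECTED_TOKEN = "test-token"  # hypothetical fixture value, never a real secret


def require_token(handler: Handler) -> Handler:
    """Reject calls whose metadata lacks a valid bearer token."""

    def wrapper(request: str, metadata: Dict[str, str]) -> Response:
        header = metadata.get("authorization", "")
        scheme, _, token = header.partition(" ")
        # compare_digest avoids leaking the match position through timing
        valid = scheme.lower() == "bearer" and hmac.compare_digest(
            token.encode(), EXPECTED_TOKEN.encode()
        )
        if not valid:
            return ("UNAUTHENTICATED", "missing or invalid token")
        return handler(request, metadata)

    return wrapper


@require_token
def say_hello(request: str, metadata: Dict[str, str]) -> Response:
    return ("OK", f"hello {request}")


if __name__ == "__main__":
    print(say_hello("ann", {"authorization": "Bearer test-token"}))
    print(say_hello("ann", {}))
```

Security tests should cover the missing, malformed, and wrong-token cases separately, since each takes a different path through the check.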
Observability Tests
- Structured logging: each RPC call produces a log entry with method, status, duration, request_id
- Trace propagation: OpenTelemetry trace context passes through interceptors to downstream services
- Span creation: each RPC creates a span with correct parent-child relationship
- Error logging: failed RPCs log status code, error message, and metadata
- Metrics export: request count, error rate, and latency histograms are exported to Prometheus
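
A structured log entry with the fields named above could look like the sketch below. `call_with_log` is a hypothetical wrapper, and the mapping from Python exceptions to status names is an assumption made only for this example.

```python
import json
import time
import uuid
from typing import Any, Callable, Dict, Optional, Tuple


def call_with_log(
    method: str,
    handler: Callable[[Any], Any],
    request: Any,
    request_id: Optional[str] = None,
) -> Tuple[Any, str]:
    """Run handler and return (response, JSON log line) for the call."""
    request_id = request_id or str(uuid.uuid4())
    started = time.perf_counter()
    status, response, error = "OK", None, None
    try:
        response = handler(request)
    except ValueError as exc:  # stand-in for a validation failure
        status, error = "INVALID_ARGUMENT", str(exc)
    except Exception as exc:  # anything unexpected
        status, error = "INTERNAL", str(exc)
    entry: Dict[str, Any] = {
        "method": method,
        "status": status,
        "duration_ms": round((time.perf_counter() - started) * 1000, 3),
        "request_id": request_id,
    }
    if error is not None:
        entry["error"] = error
    return response, json.dumps(entry, sort_keys=True)


if __name__ == "__main__":
    _, line = call_with_log("/demo.Greeter/SayHello", str.upper, "hi", "req-1")
    print(line)
```

An observability test can then parse the emitted line and assert that every required key is present, which is more robust than matching on log text.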
Backpressure Tests
- Flow control: fast sender does not overwhelm slow receiver
- Slow consumer: server streaming to slow client — flow control activates, sender pauses
- Buffer limits: message exceeds max size (default 4 MB); returns RESOURCE_EXHAUSTED
- Memory usage: streaming large datasets does not cause OOM
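
The slow-consumer and buffer-limit cases can be modeled with a bounded queue. This is a sketch, not gRPC flow control itself: `run_stream` and `check_size` are hypothetical helpers, and the queue stands in for the transport's flow-control window.

```python
import asyncio
from typing import List, Tuple

MAX_MESSAGE_BYTES = 4 * 1024 * 1024  # mirrors the 4 MB default mentioned above


def check_size(payload: bytes, limit: int = MAX_MESSAGE_BYTES) -> str:
    """Return the status name a size check would produce for this payload."""
    return "RESOURCE_EXHAUSTED" if len(payload) > limit else "OK"


async def run_stream(
    total: int, buffer_size: int, consume_delay: float
) -> Tuple[int, List[int]]:
    """Send `total` messages through a bounded buffer.

    Returns (peak buffer depth, messages received in order).
    """
    queue: asyncio.Queue = asyncio.Queue(maxsize=buffer_size)
    peak = 0

    async def producer() -> None:
        nonlocal peak
        for i in range(total):
            await queue.put(i)  # suspends while the buffer is full
            peak = max(peak, queue.qsize())
        await queue.put(None)  # sentinel: end of stream

    async def consumer() -> List[int]:
        received = []
        while True:
            item = await queue.get()
            if item is None:
                return received
            received.append(item)
            await asyncio.sleep(consume_delay)  # simulate a slow receiver

    _, received = await asyncio.gather(producer(), consumer())
    return peak, received


if __name__ == "__main__":
    peak, received = asyncio.run(run_stream(50, 4, 0.001))
    print(peak, len(received))
    print(check_size(b"x" * 11, limit=10))
```

Because the producer suspends whenever the buffer is full, the peak depth never exceeds `buffer_size` no matter how many messages are sent. A backpressure test should assert that same bounded-memory property.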
3.8 Risks and Limitations
Infrastructure Risks
- HTTP/2 dependency: gRPC requires HTTP/2. Some proxies and load balancers
do not fully support it (e.g., older AWS ALB, some CDNs)
- Browser limitations: browsers do not natively support gRPC.
You need a grpc-web proxy (e.g., Envoy) to translate browser requests into native gRPC over HTTP/2
- Infrastructure complexity: need protoc toolchain, generated code management,
HTTP/2-capable infrastructure across the stack
Development Risks
- Binary debugging: you cannot read protobuf with curl or a browser.
Need specialized tools: grpcurl, Postman gRPC, Kreya, or Evans
- Tight coupling: client and server share the .proto contract.
Schema changes need coordination across teams
- Schema rigidity: strict types mean less flexibility than JSON.
Adding optional context or metadata requires schema updates
- Learning curve: teams familiar with REST need training on protobuf,
streaming patterns, and gRPC tooling
Runtime Risks
- Streaming pitfalls: streams cannot be load-balanced once started.
Long-lived streams hold server resources and complicate scaling
- Backpressure complexity: flow control is automatic, but misconfigured
buffer sizes can cause hangs or OOM errors
- Observability challenges: binary format makes logging harder.
Need structured logging with decoded messages and proper tracing
- Connection management: too few connections = bottleneck,
too many = resource waste. Requires careful tuning