gRPC: Testing Plan and Risks

3.7 Testing Plan

Contract Tests

  • Backward compatibility: add new field to .proto — old client still works correctly
  • Forward compatibility: old server handles messages with new unknown fields gracefully
  • Breaking changes detection: removed field, changed type, reused field number — must be caught before merge by breaking-change checks (e.g. buf breaking in CI), since protoc alone does not flag most of them
  • Reserved fields: using a reserved field number — protoc reports compilation error
  • Proto linting: use buf lint or protolint to enforce style and compatibility rules
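The reserved-fields check above hinges on marking deleted fields in the schema. A minimal illustrative fragment (message and field names are hypothetical) — re-declaring a field with a reserved number or name makes protoc fail the build:

```proto
syntax = "proto3";

message User {
  reserved 2, 4;        // numbers of deleted fields — may never be reused
  reserved "email";     // deleted field name, also blocked from reuse
  string name = 1;
  int32 age = 3;
  // string email = 2;  // re-adding this would be a protoc compile error
}
```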

Serialization Tests

  • Encode/decode roundtrip: message -> bytes -> message = identical values
  • Default values: unset fields return default values (0 for int, "" for string, false for bool)
  • Unknown fields: preserved during deserialization, forwarded correctly
  • Large payloads: messages near max size limit (default 4 MB) — handled without error
  • Edge cases: empty strings, zero values, maximum int values, unicode text
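The roundtrip property is easiest to see at the wire level. Below is a stdlib-only sketch of protobuf's base-128 varint encoding (not the real protobuf library) that demonstrates the message -> bytes -> message invariant on integers:

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # set continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(data: bytes) -> int:
    """Decode a varint back to the original int."""
    value = shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:          # continuation bit clear: last byte
            return value
        shift += 7
    raise ValueError("truncated varint")

# Roundtrip property, including the zero and boundary edge cases above
for n in (0, 1, 127, 128, 300, 2**32 - 1):
    assert decode_varint(encode_varint(n)) == n
```

The value 300 encodes to the two bytes `0xAC 0x02`, matching the protobuf wire-format documentation.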

RPC Behavior Tests

  • Unary: send request, receive correct response, verify all fields
  • Server streaming: send request, receive correct sequence of messages, verify proper stream completion
  • Client streaming: send multiple messages, receive correct aggregated response
  • Bidirectional: messages flow in both directions, verify correct ordering
  • Cancellation: client cancels mid-stream — server receives cancellation signal, stops processing
  • Deadline exceeded: slow server response — client gets DEADLINE_EXCEEDED status
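The deadline test above can be prototyped without a live server. In this stdlib sketch a future stands in for the RPC: in real gRPC the client would pass a per-call timeout and receive status DEADLINE_EXCEEDED instead of a TimeoutError:

```python
import concurrent.futures
import time

def slow_handler() -> str:
    time.sleep(0.5)  # simulates a server that responds too slowly
    return "response"

with concurrent.futures.ThreadPoolExecutor() as pool:
    call = pool.submit(slow_handler)
    try:
        call.result(timeout=0.1)   # deadline far shorter than handler latency
        status = "OK"
    except concurrent.futures.TimeoutError:
        status = "DEADLINE_EXCEEDED"

print(status)  # DEADLINE_EXCEEDED
```

The assertion pattern is the same as in a real test: make the deadline shorter than the known handler latency and assert on the resulting status.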

Error Handling Tests

  • Each status code returned for the correct failure scenario
  • Rich error details accessible from client (BadRequest, RetryInfo fields)
  • Metadata present in error responses (request-id, trace headers)
  • Invalid request data returns INVALID_ARGUMENT with helpful message
  • Missing auth token returns UNAUTHENTICATED
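A table-driven test can pin each failure scenario to its status code. This sketch uses plain strings for the codes and a hypothetical `check_request` validator with an assumed `user_id` field, but the mapping it asserts is the one listed above:

```python
from dataclasses import dataclass

OK = "OK"
INVALID_ARGUMENT = "INVALID_ARGUMENT"
UNAUTHENTICATED = "UNAUTHENTICATED"

@dataclass
class Status:
    code: str
    message: str = ""

def check_request(metadata: dict, payload: dict) -> Status:
    """Map common failure scenarios onto the expected status codes."""
    if not metadata.get("authorization"):
        return Status(UNAUTHENTICATED, "missing auth token")
    if not payload.get("user_id"):
        return Status(INVALID_ARGUMENT, "user_id is required")
    return Status(OK)

# Each scenario must yield exactly the status code the contract promises
assert check_request({}, {"user_id": "42"}).code == UNAUTHENTICATED
assert check_request({"authorization": "Bearer t"}, {}).code == INVALID_ARGUMENT
assert check_request({"authorization": "Bearer t"}, {"user_id": "42"}).code == OK
```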

Performance Tests

  • Latency: unary call p50 / p95 / p99 under normal load
  • Throughput: maximum RPCs per second before degradation
  • Streaming throughput: messages per second in server and bidirectional streams
  • Payload size impact: latency increase as message size grows
  • Connection scaling: performance with 1, 10, 100 concurrent connections
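Reporting p50 / p95 / p99 only needs the raw latency samples. A minimal nearest-rank percentile sketch (sample values are hypothetical; with few samples the high percentiles collapse to the maximum):

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile (p in [0, 100]) of latency samples."""
    ordered = sorted(samples)
    rank = max(0, math.ceil(p / 100 * len(ordered)) - 1)
    return ordered[rank]

latencies_ms = [12, 15, 11, 90, 14, 13, 250, 16, 12, 14]  # hypothetical samples
p50, p95, p99 = (percentile(latencies_ms, p) for p in (50, 95, 99))
print(p50, p95, p99)  # 14 250 250
```

In a real run the samples come from timing each unary call under the load generator, and the tail percentiles are the numbers worth alerting on.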

Connection Tests

  • Keepalive: connection stays alive during idle periods with ping/pong
  • Reconnect: client reconnects automatically after connection drop
  • Timeout: per-RPC deadlines are configurable and enforced — expired calls fail with DEADLINE_EXCEEDED, not a hang
  • Connection reuse: multiple RPCs share one TCP connection efficiently
  • Graceful shutdown: server drains active RPCs before stopping
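The keepalive behavior above is driven by channel options. The option keys below are real gRPC core channel arguments (as passed via the `options` parameter of a Python gRPC channel); the values are illustrative and should be tuned per deployment:

```python
# Channel options controlling keepalive pings and message size limits.
KEEPALIVE_OPTIONS = [
    ("grpc.keepalive_time_ms", 30_000),          # ping the server every 30 s when idle
    ("grpc.keepalive_timeout_ms", 10_000),       # wait up to 10 s for the ping ack
    ("grpc.keepalive_permit_without_calls", 1),  # ping even with no active RPCs
    ("grpc.max_receive_message_length", 4 * 1024 * 1024),  # the 4 MB default, explicit
]

assert dict(KEEPALIVE_OPTIONS)["grpc.keepalive_time_ms"] == 30_000
```

A keepalive test then idles the connection longer than `grpc.keepalive_time_ms` and asserts the channel stays READY. Note that servers enforce a minimum ping interval, so overly aggressive client values get the connection closed with GOAWAY.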

Security Tests

  • mTLS: mutual TLS authentication between client and server
  • Auth metadata: token in metadata, verified by server interceptor
  • Unauthorized request: missing or invalid token returns UNAUTHENTICATED
  • Certificate rotation: new certificates accepted without restart
  • Channel encryption: all data encrypted in transit (no plaintext)
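The auth-metadata check is the easiest of these to unit-test in isolation. This sketch shows the core logic a server auth interceptor runs before invoking the handler (the token store and header layout are assumptions for the example):

```python
import hmac

VALID_TOKENS = {"s3cr3t-token"}  # hypothetical token store for the sketch

def verify_auth(metadata: dict[str, str]) -> bool:
    """Return True only for a well-formed, valid bearer token."""
    header = metadata.get("authorization", "")
    if not header.startswith("Bearer "):
        return False
    token = header[len("Bearer "):]
    # constant-time comparison, as a real token check should use
    return any(hmac.compare_digest(token, t) for t in VALID_TOKENS)

assert verify_auth({"authorization": "Bearer s3cr3t-token"})
assert not verify_auth({"authorization": "Bearer wrong"})
assert not verify_auth({})  # missing token -> reject with UNAUTHENTICATED
```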

Observability Tests

  • Structured logging: each RPC call produces a log entry with method, status, duration, request_id
  • Trace propagation: OpenTelemetry trace context passes through interceptors to downstream services
  • Span creation: each RPC creates a span with correct parent-child relationship
  • Error logging: failed RPCs log status code, error message, and metadata
  • Metrics export: request count, error rate, and latency histograms are exported to Prometheus
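A test for the structured-logging requirement can assert on the emitted JSON rather than grepping text. A stdlib sketch of one log line per RPC (the method path and field names are illustrative):

```python
import json

def rpc_log_entry(method: str, status: str, duration_ms: float, request_id: str) -> str:
    """Serialize one RPC call as a single structured JSON log line."""
    return json.dumps(
        {
            "method": method,
            "status": status,
            "duration_ms": round(duration_ms, 1),
            "request_id": request_id,
        },
        sort_keys=True,
    )

line = rpc_log_entry("/users.UserService/GetUser", "OK", 12.34, "req-123")
entry = json.loads(line)
assert entry["status"] == "OK" and entry["method"].endswith("GetUser")
```

In practice this logic lives in a server interceptor so every RPC is logged uniformly, and `request_id` doubles as the join key to traces.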

Backpressure Tests

  • Flow control: fast sender does not overwhelm slow receiver
  • Slow consumer: server streaming to slow client — flow control activates, sender pauses
  • Buffer limits: message exceeds max size (default 4 MB) — returns RESOURCE_EXHAUSTED
  • Memory usage: streaming large datasets does not cause OOM
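The flow-control behavior above can be modeled with a bounded queue: HTTP/2 gives the receiver a finite window, and a sender that fills it must pause. A stdlib analogy (queue capacity stands in for the flow-control window):

```python
import queue

buffer = queue.Queue(maxsize=2)  # receiver window of 2 messages
buffer.put("msg-1")
buffer.put("msg-2")

try:
    buffer.put_nowait("msg-3")   # receiver is full: sender cannot proceed
    overflowed = False
except queue.Full:
    overflowed = True

assert overflowed                # fast sender was forced to stop

buffer.get()                     # slow consumer finally reads one message...
buffer.put("msg-3")              # ...which frees a slot and unblocks the sender
```

The real test drives a server stream into a deliberately slow client and asserts the server's send side blocks (or its buffered memory stays bounded) instead of growing without limit.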

3.8 Risks and Limitations

Infrastructure Risks

  • HTTP/2 dependency: gRPC requires HTTP/2. Some proxies and load balancers do not fully support it (e.g., older AWS ALB, some CDNs)
  • Browser limitations: browsers cannot speak native gRPC. You need a grpc-web proxy (e.g. Envoy) to translate browser requests into backend gRPC calls
  • Infrastructure complexity: need protoc toolchain, generated code management, HTTP/2-capable infrastructure across the stack

Development Risks

  • Binary debugging: you cannot read protobuf with curl or a browser. Need specialized tools: grpcurl, Postman gRPC, Kreya, or Evans
  • Tight coupling: client and server share the .proto contract. Schema changes need coordination across teams
  • Schema rigidity: strict types mean less flexibility than JSON. Adding optional context or metadata requires schema updates
  • Learning curve: teams familiar with REST need training on protobuf, streaming patterns, and gRPC tooling

Runtime Risks

  • Streaming pitfalls: streams cannot be load-balanced once started. Long-lived streams hold server resources and complicate scaling
  • Backpressure complexity: flow control is automatic but misconfigured buffer sizes cause hangs or OOM errors
  • Observability challenges: binary format makes logging harder. Need structured logging with decoded messages and proper tracing
  • Connection management: too few connections = bottleneck, too many = resource waste. Requires careful tuning