WebSocket: Reliability Pattern
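WebSocket itself does not provide exactly-once delivery: messages in flight when a connection drops can be lost, and naive resends can produce duplicates. This document describes a practical "exactly-once-like" pattern for real systems, built from per-stream sequence numbers, replay, and deduplication.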
Message contract
Use these fields in every event:
- `message_id`: unique id for deduplication
- `seq`: monotonic sequence number per stream
- `ts`: server timestamp
- `stream_id`: channel/user/session key
Example:
{
  "stream_id": "chat:room-42",
  "message_id": "msg-8c7f",
  "seq": 1052,
  "ts": "2026-03-25T12:00:00Z",
  "event": "chat.message",
  "data": {"text": "hello"}
}
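A minimal sketch of validating this envelope on receipt, assuming Python on the receiving side. The `Event` class and `parse_event` function are illustrative names, not part of any library, and treating `event` and `data` as optional is an assumption made here.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any

# The four contract fields that must appear in every event.
CONTRACT_FIELDS = ("stream_id", "message_id", "seq", "ts")


@dataclass(frozen=True)
class Event:
    stream_id: str
    message_id: str
    seq: int
    ts: datetime
    event: str = ""
    data: dict[str, Any] = field(default_factory=dict)


def parse_event(raw: str) -> Event:
    """Parse one JSON frame into an Event, rejecting malformed envelopes."""
    obj = json.loads(raw)
    missing = [name for name in CONTRACT_FIELDS if name not in obj]
    if missing:
        raise ValueError(f"missing contract fields: {missing}")
    seq = obj["seq"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(seq, bool) or not isinstance(seq, int) or seq < 0:
        raise ValueError("seq must be a non-negative integer")
    # Normalize a trailing "Z" so fromisoformat also works before Python 3.11.
    ts = datetime.fromisoformat(obj["ts"].replace("Z", "+00:00"))
    return Event(
        stream_id=obj["stream_id"],
        message_id=obj["message_id"],
        seq=seq,
        ts=ts,
        event=obj.get("event", ""),
        data=obj.get("data", {}),
    )
```

Rejecting a frame that lacks any contract field early keeps the dedup and gap-detection logic from having to handle partial envelopes.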
Server responsibilities
- Assign increasing `seq` for each stream.
- Keep replay buffer per stream (time-window or count-window).
- Store a dedup key set (`stream_id + message_id`) with TTL.
- On reconnect, replay from `last_seq + 1`.
- If requested seq is older than replay window, return resync-required signal.
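The sequencing and replay duties above can be sketched as a small in-memory per-stream log. This is an illustration under assumptions, not a definitive implementation: `StreamLog`, `ResyncRequired`, and the count-based window are hypothetical choices, and a real deployment would likely back this with shared storage.

```python
import uuid
from collections import deque
from datetime import datetime, timezone
from typing import Any


class ResyncRequired(Exception):
    """Raised when the requested seq is older than the replay window."""


class StreamLog:
    def __init__(self, stream_id: str, max_events: int = 10_000) -> None:
        self.stream_id = stream_id
        self.max_events = max_events
        self.next_seq = 1  # seq of the next event to be published
        self.buffer: deque = deque()  # replay buffer, oldest event first

    def publish(self, event: str, data: dict[str, Any]) -> dict[str, Any]:
        """Stamp an event with the contract fields and keep it for replay."""
        msg = {
            "stream_id": self.stream_id,
            "message_id": f"msg-{uuid.uuid4().hex}",
            "seq": self.next_seq,
            "ts": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "data": data,
        }
        self.next_seq += 1
        self.buffer.append(msg)
        while len(self.buffer) > self.max_events:
            self.buffer.popleft()  # count-window eviction
        return msg

    def replay_after(self, last_seq: int) -> list[dict[str, Any]]:
        """Return events from last_seq + 1 onward, or raise ResyncRequired."""
        wanted = last_seq + 1
        if wanted >= self.next_seq:
            return []  # client is already up to date
        oldest = self.buffer[0]["seq"] if self.buffer else self.next_seq
        if wanted < oldest:
            raise ResyncRequired(f"seq {wanted} is older than oldest buffered seq {oldest}")
        return [m for m in self.buffer if m["seq"] >= wanted]
```

A connection handler would call `replay_after(last_seq)` when it receives a resume request and translate `ResyncRequired` into whatever resync-required signal the protocol defines.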
Client responsibilities
- Persist `last_seq` and recent `message_id` set.
- Send resume request with `last_seq` after reconnect.
- Drop duplicates by `message_id`.
- Detect gaps (`incoming.seq > last_seq + 1`) and request replay.
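The client rules above can be sketched as a small state tracker. The class name, the `"resume"` request shape, and the bounded id set are assumptions for illustration. Persistence is left out, so a real client would save `last_seq` and the id set to durable storage. Treating any `seq` at or below `last_seq` as a duplicate is an extra safeguard added here, not something the list requires.

```python
from collections import OrderedDict
from typing import Any


class StreamClientState:
    def __init__(self, stream_id: str, last_seq: int = 0, max_ids: int = 1000) -> None:
        self.stream_id = stream_id
        self.last_seq = last_seq  # highest seq delivered in order
        self._seen: OrderedDict = OrderedDict()  # recent message_ids, oldest first
        self._max_ids = max_ids

    def resume_request(self) -> dict[str, Any]:
        """Build the (hypothetical) resume frame sent after reconnect or a gap."""
        return {"type": "resume", "stream_id": self.stream_id, "last_seq": self.last_seq}

    def handle(self, msg: dict[str, Any]) -> str:
        """Classify an incoming event as "deliver", "duplicate", or "gap"."""
        if msg["message_id"] in self._seen or msg["seq"] <= self.last_seq:
            return "duplicate"
        if msg["seq"] > self.last_seq + 1:
            # Do not advance last_seq; the caller should send resume_request().
            return "gap"
        self._seen[msg["message_id"]] = None
        if len(self._seen) > self._max_ids:
            self._seen.popitem(last=False)  # forget the oldest id
        self.last_seq = msg["seq"]
        return "deliver"
```

On `"gap"` the event is not applied; the missing range and the event itself arrive again through replay, where they are delivered in order.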
Replay window design
| Strategy | Example | Trade-off |
|---|---|---|
| Time-based | Keep 5 minutes of events | Simple, memory depends on traffic |
| Count-based | Keep last 10,000 events | Predictable memory, variable time |
| Hybrid | 5 minutes and max 10,000 | Best practical balance |
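As a rough illustration of the hybrid strategy, the sketch below evicts an event when it is either older than the time limit or beyond the count limit. `HybridReplayWindow` and its parameter names are hypothetical, and the injectable `clock` exists only to make the behavior testable.

```python
import time
from collections import deque
from typing import Any, Callable


class HybridReplayWindow:
    def __init__(
        self,
        max_age_s: float = 300.0,   # 5 minutes
        max_events: int = 10_000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_age_s = max_age_s
        self.max_events = max_events
        self._clock = clock
        self._items: deque = deque()  # (stored_at, message), oldest first

    def append(self, message: Any) -> None:
        self._items.append((self._clock(), message))
        self._evict()

    def _evict(self) -> None:
        """Drop events that exceed either the age limit or the count limit."""
        cutoff = self._clock() - self.max_age_s
        while self._items and (
            len(self._items) > self.max_events or self._items[0][0] < cutoff
        ):
            self._items.popleft()

    def snapshot(self) -> list[Any]:
        """Return the messages currently replayable, oldest first."""
        self._evict()
        return [message for _, message in self._items]
```

Because whichever limit is hit first wins, memory stays bounded during bursts while quiet streams still expire old events.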
Failure modes
- Duplicate message after reconnect -> solved by `message_id` dedup.
- Missing sequence segment -> solved by replay request.
- Replay overflow -> force snapshot/state sync.
- Multi-server ordering drift -> use per-stream sequencing authority.
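For the duplicate-after-reconnect case, one way to hold the `stream_id + message_id` dedup keys with a TTL is sketched below. `TtlDedupSet` is a hypothetical in-memory helper. A multi-server setup would presumably keep these keys in a shared store instead.

```python
import time
from typing import Callable


class TtlDedupSet:
    def __init__(self, ttl_s: float = 300.0, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_s = ttl_s
        self._clock = clock
        # key -> expiry time; dicts preserve insertion order, and with a
        # constant TTL and a monotonic clock that is also expiry order.
        self._expires: dict = {}

    def first_time(self, stream_id: str, message_id: str) -> bool:
        """Return True the first time a key is seen within the TTL, else False."""
        now = self._clock()
        self._purge(now)
        key = (stream_id, message_id)
        if key in self._expires:
            return False
        self._expires[key] = now + self.ttl_s
        return True

    def _purge(self, now: float) -> None:
        """Remove expired keys, starting from the oldest."""
        while self._expires:
            key = next(iter(self._expires))
            if self._expires[key] > now:
                break
            del self._expires[key]
```

The TTL should be at least as long as the replay window, since a message can be redelivered at any point while it is still replayable.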