Web Socket Protocol - An Overview

Prerequisites: Basic knowledge of how HTTP/1.0, HTTP/1.1 and HTTP/2.0 work, including the basics of a TLS handshake.

WebSocket is a real-time technology that enables bidirectional, full-duplex communication between a client and a server over a single persistent connection. In other words, the client and the server share a stateful, two-way connection over which either side can send data at any time. This is unlike a normal, stateless HTTP exchange, in which the server can only answer a request from the client and, without keep-alive, the connection is closed as soon as the server has sent its response.

WebSocket technology is made of two components:

  1. WebSocket Protocol (the protocol on which WebSockets work)

  2. WebSocket API (the means to implement this protocol through code)

The WebSocket Protocol

WebSockets build on HTTP/1.1: the handshake uses the Upgrade mechanism, which HTTP/1.0 does not support. Over HTTP/1.1, the client starts with a GET request carrying an Upgrade header, which tells the server that the client wants to switch to the WebSocket protocol, and the server responds with a 101 (Switching Protocols) status code if the request is successful. Here is an example of a WebSocket request over HTTP/1.1:
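As a minimal sketch, the opening request can be assembled in Python from the headers discussed in this article. The helper name build_handshake_request, the Host value and the "/" path are assumptions made for illustration, not part of any library.

```python
import base64
import os
from typing import Optional


def build_handshake_request(host: str, path: str = "/", key: Optional[str] = None) -> str:
    """Return the text of a WebSocket opening handshake (an HTTP/1.1 Upgrade request)."""
    if key is None:
        # The key is a random 16-byte nonce, base64-encoded.
        key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade",
        "Upgrade: websocket",
        "Sec-WebSocket-Version: 13",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits",
    ]
    # Each header line ends with CRLF; an empty line terminates the header block.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Illustrative values: the host and key used in this article's example.
    print(build_handshake_request("localhost:8085", key="vLIkL0xhJVCGjn7M3PPjQw=="))
```

Browser developer tools usually display the full ws:// URL, while on the wire the request line normally carries only the path and the host travels in the Host header, which is the form this sketch produces.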

Breakdown of a WebSocket Request over HTTP/1.1

Here is the breakdown of the above request, explaining the WebSocket-specific headers that indicate the version, subprotocols, and extensions:

  1. GET ws://localhost:8085/ HTTP/1.1: This is the initial line of the HTTP request. It specifies the method (GET), the URI (ws://localhost:8085/), and the HTTP version (HTTP/1.1). Here, ws indicates that the request is attempting to establish a WebSocket connection.

  2. Connection: Upgrade: This header indicates that the client wants to upgrade the connection to a different protocol. The target protocol is named in the accompanying Upgrade header, which in this case is set to websocket.

  3. Sec-WebSocket-Version: 13: This header specifies the version of the WebSocket protocol that the client supports.

  4. Sec-WebSocket-Key: vLIkL0xhJVCGjn7M3PPjQw==: This header contains a base64-encoded random value. The server derives the Sec-WebSocket-Accept header of its response from it, which proves that the server supports WebSocket connections.

  5. Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits: This header specifies the WebSocket extensions that the client supports.
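To make the role of Sec-WebSocket-Key concrete, here is a small sketch of how a server can derive the Sec-WebSocket-Accept value it sends back, following the procedure in RFC 6455: append a fixed GUID to the key, hash the result with SHA-1, and base64-encode the digest. The function name compute_accept is an assumption for illustration; the sample key and expected value are the ones given in RFC 6455.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key."""
    # The key is used as the literal header string; it is not base64-decoded first.
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Sample key from RFC 6455; prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

A client can run the same computation on the key it sent and compare the result with the server's header to confirm that the response really came from a WebSocket-aware server.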

WebSocket Connections over HTTP/2.0

HTTP/2.0 improved upon HTTP/1.x by multiplexing streams of requests over a single TCP connection, but it did not include any support for the WebSocket protocol until RFC 8441, published in 2018, defined it. WebSocket communication itself is unchanged between HTTP/2.0 and HTTP/1.1, except that the WebSocket frames are wrapped in HTTP/2.0 DATA frames (which provide the stream id and flags specific to HTTP/2.0). The only differences are in how the WebSocket connection is established.

In HTTP/2.0, a WebSocket is established via a similar process: the client sends an HTTP CONNECT request carrying a :protocol pseudo-header, and the server responds with a 200 status. The first difference is that the Sec-WebSocket-Key no longer needs to be exchanged, as the :protocol pseudo-header serves the same purpose. The second difference is that only a single stream becomes the WebSocket rather than the entire connection, which allows HTTP/2.0 multiplexing to continue on the other streams.

HTTP/2.0 WebSocket request:
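HTTP/2.0 headers travel in a binary HEADERS frame rather than as text, so the sketch below only lists the header fields such a request would carry, modelled on the example in RFC 8441. The function name build_h2_websocket_headers and the authority and path values are assumptions for illustration; a real client would pass this list to an HTTP/2 library.

```python
from typing import List, Tuple


def build_h2_websocket_headers(
    authority: str, path: str = "/", secure: bool = True
) -> List[Tuple[str, str]]:
    """List the header fields of an extended CONNECT request that opens a WebSocket."""
    return [
        (":method", "CONNECT"),
        # The :protocol pseudo-header names the protocol to speak on this stream.
        (":protocol", "websocket"),
        # RFC 8441 uses https for wss targets and http for ws targets.
        (":scheme", "https" if secure else "http"),
        (":path", path),
        (":authority", authority),
        ("sec-websocket-version", "13"),
        ("sec-websocket-extensions", "permessage-deflate; client_max_window_bits"),
        # Note: no sec-websocket-key is sent over HTTP/2.
    ]


if __name__ == "__main__":
    for name, value in build_h2_websocket_headers("localhost:8085"):
        print(f"{name} = {value}")
```

The server answers with a :status of 200 on the same stream, after which that stream alone carries WebSocket frames.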

Web Socket (ws) and Web Socket Secure (wss)

ws and wss are both URI schemes used to indicate the protocol used for a WebSocket connection. Here's the difference between the two:

  1. ws (WebSocket): This URI scheme indicates that the connection should be made using the WebSocket protocol over a non-secure connection. Its default port is 80, the same as HTTP.

  2. wss (WebSocket Secure): This URI scheme indicates that the connection should be made using the WebSocket protocol over a secure connection. Its default port is 443, the same as HTTPS. wss is to ws what https is to http: the TLS handshake happens before the Upgrade or CONNECT request is sent to the server.
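The scheme and port rules above can be sketched with Python's standard urllib.parse module. The helper name websocket_endpoint and the example URLs are assumptions for illustration.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Default ports for the two WebSocket URI schemes.
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def websocket_endpoint(url: str) -> Tuple[str, int, bool]:
    """Return (host, port, use_tls) for a ws:// or wss:// URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URL: {url!r}")
    if parts.hostname is None:
        raise ValueError(f"URL has no host: {url!r}")
    # An explicit port in the URL overrides the scheme's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    # wss means TLS is negotiated before any WebSocket handshake is sent.
    return parts.hostname, port, parts.scheme == "wss"


if __name__ == "__main__":
    print(websocket_endpoint("ws://localhost:8085/"))
    print(websocket_endpoint("wss://example.com/chat"))
```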

Pros And Cons of Using WebSockets

WebSockets are more efficient than traditional HTTP polling or long-polling: a persistent connection eliminates the overhead of establishing a new connection for each request, which keeps latency low. They also enable cross-domain communication.

Even though WhatsApp stated back in 2016 that it had 3 million active WebSocket connections on every server, scaling WebSockets horizontally is a complex system design problem. Because the connections are stateful and must stay open, they use more resources than simple HTTP requests when the number of clients is large.

There have been various improvements and advancements over the WebSocket protocol. gRPC and WebRTC have emerged as alternatives to WebSockets. The WebSocket protocol can also run over HTTP/3.0, which uses the QUIC transport protocol and can provide lower latency than TCP.

Use Cases of WebSockets

WebSockets are mainly used where a duplex, stateful and persistent connection is required between the client and the server, such as in chat applications like WhatsApp, where real-time messaging happens, or in collaborative editing, where multiple users edit a page with real-time updates. Real-time gaming, financial trading and live streaming are some other common use cases.

This concludes the overview of WebSocket technology. Happy Learning!!