annotate mrjunejune/src/blog/websocket-demystified/index.md @ 216:e82b80b24012 default tip

[MrJuneJune] Make webp translate background job.
author June Park <parkjune1995@gmail.com>
date Sat, 28 Feb 2026 21:04:43 -0800
parents 295ac2e5ec00
children
---
title: WebSocket Demystified
description: A deep dive into WebSocket internals, debunking the 65535 port myth and building a WebSocket implementation from scratch.
---

# WebSocket Demystified

WebSockets have been around for more than ten years now. [RFC 6455](https://www.rfc-editor.org/rfc/rfc6455) dropped way back in 2011, and it was inevitable. As apps got more complex, people wanted real bidirectional communication without resorting to hacky solutions like raw TCP connections with custom keys or the constant overhead of short/long polling.

Today, they're a go-to choice for everything from chat apps to LLM interfaces, where the model streams bytes back to you one token at a time as it predicts the next word.

Most developers just grab a library like [ws](https://github.com/websockets/ws) for Node.js or [websockets](https://websockets.readthedocs.io/) for Python and call it a day. But many don’t realize that the underlying mechanism is simple enough to implement yourself in a day or so. They also misunderstand what WebSockets actually do. Let's build one from scratch and debunk a few myths along the way.

### The "65,535" Myth

Before we write a single line of code, let's talk about a common myth. You’ll often hear developers say, *"A server can only handle 65,535 WebSocket connections because there are only 65,535 ports."*

If this were true, Discord or Slack would need millions of separate IP addresses just to function. The confusion comes from the 16-bit size of the TCP port field, but a connection isn't defined by a port alone: every client connects to the same listening port on the server, and the OS tells connections apart by more than that one number. We'll see exactly how later in this post.

---

## Requirements

* Ability to type
* Half a brain, to follow the real mechanics instead of the surface-level myths
* A computer

---

## The Lifecycle

To get a WebSocket up and running, you have to follow a specific dance. It’s not just "connecting to a port"; it's an evolution of an existing relationship.

1. **The Handshake:** A client sends a "pretty please" HTTP request asking to upgrade the connection.
2. **The Response:** The server agrees (101 Switching Protocols) and sends back a specific hash.
3. **The Switch:** Both sides stop talking "HTTP" and start talking "Frames."
4. **The Interaction:** Bidirectional, binary-framed messaging until someone closes the door.

---

## Opening Handshakes

To start the upgrade from HTTP to WebSocket, the client sends a standard GET request but with some very specific headers.

<div class="center"> <img src="/public/web-socket-header.webp" /> </div>
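
To make the handshake request concrete, here is a rough sketch of how a client could assemble one in C. The function name `ws_build_upgrade_request` and its parameters are made up for illustration, and the header set is just the minimum RFC 6455 asks for, so treat it as a starting point and not as what any particular browser sends.

```c
#include <stdio.h>

/*
 * Writes a WebSocket upgrade request into `out`.
 * `host`, `path`, and `key` are supplied by the caller; the function name
 * and signature are hypothetical, chosen only for this sketch.
 * Returns the number of bytes written, or -1 if `out` is too small.
 */
int ws_build_upgrade_request(char *out, size_t out_size,
                             const char *host, const char *path,
                             const char *key)
{
    int n = snprintf(out, out_size,
        "GET %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        "Sec-WebSocket-Key: %s\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n",
        path, host, key);

    /* snprintf reports the length it wanted; reject truncated output. */
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

Calling it with a host, a path like `/chat`, and a Base64 key gives you a buffer you can write straight to the socket.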

I’m assuming you know how HTTP works. If not, open your browser's developer tools (right-click the page and choose Inspect), switch to the Network tab, and refresh the page. The only interesting value here is the `Sec-WebSocket-Key`. This key is a 16-byte random value encoded in **Base64**.

**Note:** It’s not for security. It’s there so the server can prove it actually understood the WebSocket handshake, which keeps intermediate caches from accidentally serving a cached response to a different client.

But before we jump into that, we need to construct that Base64 key.

### What is Base64?

Let's ask Gemini:

> "Base64 is a binary-to-text encoding scheme that represents data in an ASCII string format by translating it into a radix-64 representation, using a specific set of 64 printable characters." -- Gemini

Nice, it didn't hallucinate. Let's construct that set. Here are the 64 characters that are safe to print in ASCII:

```c
static const char base64_chars[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
```

In Python, this is basically a one-liner:

```python
import base64
import os

# 16 random bytes, Base64-encoded, then decoded to a plain str for printing
print(base64.b64encode(os.urandom(16)).decode("ascii"))
```

But we are rewriting this from scratch in **C**, so we need to suffer a little. The logic is: take 3 bytes (24 bits) and split them into 4 chunks of 6 bits each. Each 6-bit chunk becomes an index into our `base64_chars` array.
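
Here is that 3-bytes-in, 4-characters-out step on its own, as a small sketch. `b64_table` and `b64_encode_triple` are names I made up for this example, and it only handles one full group with no padding.

```c
#include <stdint.h>

static const char b64_table[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encodes exactly 3 input bytes into 4 Base64 characters plus a NUL. */
void b64_encode_triple(const uint8_t in[3], char out[5])
{
    /* Pack the three bytes into the low 24 bits of one integer. */
    uint32_t group = ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];

    /* Peel off four 6-bit slices, most significant first. */
    out[0] = b64_table[(group >> 18) & 0x3F];
    out[1] = b64_table[(group >> 12) & 0x3F];
    out[2] = b64_table[(group >> 6) & 0x3F];
    out[3] = b64_table[group & 0x3F];
    out[4] = '\0';
}
```

Feeding it the bytes of `"Man"` gives `TWFu`: the 24 bits split into `010011 010110 000101 101110`, which are the indices 19, 22, 5, and 46.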

#### Step 1: Generate Random Bytes

First, we grab 16 random bytes.

```c
// Seed the PRNG once at startup, not before every key.
srand((unsigned int)time(NULL));
uint8 random_value[16];

// rand() is fine for a demo, but it is not a cryptographic source.
for (int i = 0; i < 16; i++)
  random_value[i] = (uint8)(rand() % 256);
```
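
`rand()` seeded with the clock is predictable, which is fine for a demo. If you want less guessable bytes, one option on Unix-like systems is to read them from `/dev/urandom`. This is a sketch under that assumption; `fill_random_bytes` is a hypothetical helper, not part of Seobeo.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Fills `out` with `len` bytes read from /dev/urandom.
 * Returns 0 on success, -1 on failure. Assumes a Unix-like system.
 */
int fill_random_bytes(uint8_t *out, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;

    size_t got = fread(out, 1, len, f);
    fclose(f);

    /* A short read means we did not get all the bytes we asked for. */
    return got == len ? 0 : -1;
}
```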

#### Step 2: The Bit-Shifting Magic

We loop through our 16 bytes in groups of 3. We pack them into a 32-bit integer, then carve that integer into 6-bit slices. Since 16 isn't a multiple of 3, the last group holds a single byte, and the two missing slices are written as `=` padding.

> **Note:** 16 bytes always encode to 24 Base64 characters (22 data characters plus `==`), so the buffer only needs 25 bytes including the terminator. 32 is just a comfortable round number that many implementations land on.

```c
char result[32] = {0};
int32 result_index = 0;

for (int i = 0; i < 16; i += 3) {
  // Input bytes left for this group: 3, except 1 for the final group
  int remaining = 16 - i;

  uint32 first_value  = (uint32)random_value[i] << 16;
  uint32 second_value = (remaining > 1) ? (uint32)random_value[i + 1] << 8 : 0;
  uint32 third_value  = (remaining > 2) ? (uint32)random_value[i + 2] : 0;

  uint32 group_value = first_value | second_value | third_value;

  // Map bits to characters: 0x3F is 0011 1111 (keeps only 6 bits)
  result[result_index++] = base64_chars[(group_value >> 18) & 0x3F];
  result[result_index++] = base64_chars[(group_value >> 12) & 0x3F];
  // Slices that had no input byte behind them become '=' padding
  result[result_index++] = (remaining > 1) ? base64_chars[(group_value >> 6) & 0x3F] : '=';
  result[result_index++] = (remaining > 2) ? base64_chars[group_value & 0x3F] : '=';
}
```

Now you have a `Sec-WebSocket-Key`. When the server gets it, it appends a "Magic String" (`258EAFA5-E914-47DA-95CA-C5AB0DC85B11`) to the key, SHA-1 hashes the result, and sends the Base64-encoded hash back to you as `Sec-WebSocket-Accept`.
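
The server side of that exchange can be sketched in plain C too. Everything here is illustrative: `ws_accept_key`, `sha1`, and `b64_encode` are names I picked, the SHA-1 is a bare-bones one-shot version that only accepts inputs up to 119 bytes (plenty for a 24-character key plus the magic string), and a real server would more likely lean on a crypto library.

```c
#include <stdint.h>
#include <string.h>

#define WS_GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

static uint32_t rotl32(uint32_t x, int n) { return (x << n) | (x >> (32 - n)); }

/* One-shot SHA-1 for inputs of at most 119 bytes. Returns 0 on success. */
static int sha1(const uint8_t *msg, size_t len, uint8_t digest[20])
{
    uint32_t h[5] = { 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0 };
    uint8_t block[128] = {0};
    if (len > 119)
        return -1;

    /* Padding: message, a 0x80 byte, zeros, then the bit length (big-endian). */
    size_t total = ((len + 8) / 64 + 1) * 64;
    uint64_t bits = (uint64_t)len * 8;
    memcpy(block, msg, len);
    block[len] = 0x80;
    for (int i = 0; i < 8; i++)
        block[total - 1 - i] = (uint8_t)(bits >> (8 * i));

    for (size_t off = 0; off < total; off += 64) {
        uint32_t w[80];
        for (int t = 0; t < 16; t++)
            w[t] = ((uint32_t)block[off + 4 * t] << 24) |
                   ((uint32_t)block[off + 4 * t + 1] << 16) |
                   ((uint32_t)block[off + 4 * t + 2] << 8) |
                   block[off + 4 * t + 3];
        for (int t = 16; t < 80; t++)
            w[t] = rotl32(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);

        uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
        for (int t = 0; t < 80; t++) {
            uint32_t f, k;
            if (t < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
            else if (t < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
            else if (t < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
            else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
            uint32_t tmp = rotl32(a, 5) + f + e + k + w[t];
            e = d; d = c; c = rotl32(b, 30); b = a; a = tmp;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
    }

    for (int i = 0; i < 5; i++) {
        digest[4 * i]     = (uint8_t)(h[i] >> 24);
        digest[4 * i + 1] = (uint8_t)(h[i] >> 16);
        digest[4 * i + 2] = (uint8_t)(h[i] >> 8);
        digest[4 * i + 3] = (uint8_t)h[i];
    }
    return 0;
}

/* Base64 with '=' padding; `out` needs 4 * ceil(len / 3) + 1 bytes. */
static void b64_encode(const uint8_t *in, size_t len, char *out)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t o = 0;
    for (size_t i = 0; i < len; i += 3) {
        size_t rem = len - i;
        uint32_t g = (uint32_t)in[i] << 16;
        if (rem > 1) g |= (uint32_t)in[i + 1] << 8;
        if (rem > 2) g |= in[i + 2];
        out[o++] = tbl[(g >> 18) & 0x3F];
        out[o++] = tbl[(g >> 12) & 0x3F];
        out[o++] = rem > 1 ? tbl[(g >> 6) & 0x3F] : '=';
        out[o++] = rem > 2 ? tbl[g & 0x3F] : '=';
    }
    out[o] = '\0';
}

/* Computes Sec-WebSocket-Accept for `client_key`. Returns 0 on success. */
int ws_accept_key(const char *client_key, char out[29])
{
    uint8_t input[119];
    uint8_t digest[20];
    size_t key_len = strlen(client_key);
    size_t guid_len = sizeof(WS_GUID) - 1;

    if (key_len + guid_len > sizeof input)
        return -1;
    memcpy(input, client_key, key_len);
    memcpy(input + key_len, WS_GUID, guid_len);
    if (sha1(input, key_len + guid_len, digest) != 0)
        return -1;
    b64_encode(digest, sizeof digest, out);
    return 0;
}
```

RFC 6455 gives a sample pair to check any implementation against: the key `dGhlIHNhbXBsZSBub25jZQ==` should yield `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`.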

---

## Upgrading the Protocol

If you are the client, you just wait for that `101 Switching Protocols` response. Once you see it, you stop sending HTTP text and start sending frames. (A strict client also checks that the returned `Sec-WebSocket-Accept` matches the key it sent.)

If you are the **server**, you need to keep that connection alive. In my project, `Seobeo`, I create a separate connection object, throw away the HTTP request info to save memory, and start a fresh buffer for WebSocket frames.

```c
// Transitioning the state from HTTP to WebSocket
Seobeo_WebSocket_Server_Connection *p_conn = malloc(sizeof(Seobeo_WebSocket_Server_Connection));
memset(p_conn, 0, sizeof(Seobeo_WebSocket_Server_Connection));

p_conn->p_handle = p_handle; // file descriptor
p_conn->is_active = TRUE;
p_conn->fragment_capacity = 4096;
p_conn->fragment_buffer = malloc(p_conn->fragment_capacity);
```

### Wait, what is the p_handle or file descriptor here?

The File Descriptor (FD) is the ID badge the OS hands your server for each open connection. The OS identifies a unique connection via a 4-tuple: Source IP, Source Port, Destination IP, Destination Port. You can think of the OS as a giant hashmap that links these 4-tuples to an integer (the FD). Every client talks to the same server IP and port; only the client's IP and port differ, so the server never "uses up" its 65,535 ports.
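
As a toy model of that idea (this is not how any kernel actually lays things out), here is a sketch of the 4-tuple as a struct. `conn_tuple` and `conn_tuple_equal` are hypothetical names, and the addresses are plain IPv4 integers to keep it short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of what identifies one TCP connection. */
typedef struct {
    uint32_t src_ip;    /* client address (IPv4 as an integer) */
    uint16_t src_port;  /* client's ephemeral port */
    uint32_t dst_ip;    /* server address */
    uint16_t dst_port;  /* server's listening port, e.g. 443 */
} conn_tuple;

/* Two connections are the same only if all four fields match. */
bool conn_tuple_equal(const conn_tuple *a, const conn_tuple *b)
{
    return a->src_ip == b->src_ip && a->src_port == b->src_port &&
           a->dst_ip == b->dst_ip && a->dst_port == b->dst_port;
}
```

Two clients hitting the same server on port 443 produce two different tuples as long as either their IP or their source port differs, which is why one listening port can serve far more than 65,535 connections.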

So the real limits are the number of FDs your process may open and how much RAM you have. You can check your shell's per-process FD limit with:

```bash
ulimit -n
```
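
The same number is available from inside a program through POSIX `getrlimit`. A minimal sketch, assuming a POSIX system; `get_fd_limits` is a name I picked for the example.

```c
#include <sys/resource.h>

/*
 * Reads the soft and hard limits on open file descriptors for this process.
 * Returns 0 on success, -1 if the limits cannot be read.
 */
int get_fd_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    *soft = lim.rlim_cur;  /* what `ulimit -n` reports */
    *hard = lim.rlim_max;  /* ceiling the soft limit can be raised to */
    return 0;
}
```

The soft limit is the one that bites first; a server expecting lots of connections usually raises it toward the hard limit at startup.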

Now that we have debunked the myth, let's see how the protocol actually works.

---

## Frame-Based Protocols

This is where the logic gets "cancerous." WebSockets don't just send raw strings; they wrap everything in a **Frame**.

### The Opcode Table

The first byte contains the `FIN` bit (is this the end of the message?) and the `Opcode` (what kind of data is this?).

| Opcode (Hex) | Meaning | Description |
| --- | --- | --- |
| `0x0` | Continuation | Part of a multi-frame message |
| `0x1` | Text | UTF-8 payload |
| `0x2` | Binary | Raw binary data |
| `0x8` | Close | Terminate the connection |
| `0x9` | Ping | Heartbeat check |
| `0xA` | Pong | Heartbeat response |
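
In C, a table like this maps naturally onto an enum plus a couple of bit masks. The names below (`ws_opcode`, `ws_first_byte_fin`, and friends) are illustrative and are not Seobeo's actual types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode values from RFC 6455; the enum and function names are made up. */
typedef enum {
    WS_OP_CONTINUATION = 0x0,
    WS_OP_TEXT         = 0x1,
    WS_OP_BINARY       = 0x2,
    WS_OP_CLOSE        = 0x8,
    WS_OP_PING         = 0x9,
    WS_OP_PONG         = 0xA
} ws_opcode;

/* FIN is the top bit of the first byte. */
bool ws_first_byte_fin(uint8_t byte0)
{
    return (byte0 & 0x80) != 0;
}

/* The opcode is the low four bits of the first byte. */
ws_opcode ws_first_byte_opcode(uint8_t byte0)
{
    return (ws_opcode)(byte0 & 0x0F);
}

/* Control frames (close, ping, pong) have the opcode's high bit set. */
bool ws_is_control(ws_opcode op)
{
    return (op & 0x8) != 0;
}
```

For example, a first byte of `0x81` decodes to FIN set with a Text opcode, and `0x89` is a final Ping frame.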

### The Masking Rule

* **Client to Server:** MUST be masked.
* **Server to Client:** MUST NOT be masked.

If a client sends unmasked data, the server must close the connection. It’s the law.

### Why the Bitwise Mess? (Endianness)

In the code below, you’ll see things like `payload_length >> 56`. This is because network protocol headers use **Big-Endian** (most significant byte first). If your computer is Little-Endian (most are), you can't just copy the integer's bytes onto the wire; shifting and masking writes them out most significant byte first, no matter what the CPU does internally.
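
A tiny round trip shows the idea in isolation. `put_u64_be` and `get_u64_be` are hypothetical helpers written for this post; they behave the same on little-endian and big-endian machines because they never look at the integer's in-memory layout.

```c
#include <stdint.h>

/* Writes `value` into out[0..7], most significant byte first. */
void put_u64_be(uint8_t out[8], uint64_t value)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)((value >> (56 - i * 8)) & 0xFF);
}

/* Reads eight big-endian bytes back into a host integer. */
uint64_t get_u64_be(const uint8_t in[8])
{
    uint64_t value = 0;
    for (int i = 0; i < 8; i++)
        value = (value << 8) | in[i];
    return value;
}
```

Writing `0x0102030405060708` puts `0x01` in the first byte and `0x08` in the last, which is exactly the order the receiver expects on the wire.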

---

## Sending a Frame (Client Side)

Here is how we construct a frame to send data to the server.

```c
// Largest possible header: 2 bytes + 8-byte extended length + 4-byte mask key
uint8 frame[14];
size_t frame_len = 0;

// Byte 0: FIN bit (0x80) and Opcode
frame[0] = (fin ? 0x80 : 0x00) | (opcode & 0x0F);
frame_len++;

// Generate a 4-byte mask key
uint8 mask_key[4];
for (int i = 0; i < 4; i++)
  mask_key[i] = (uint8)(rand() % 256);

// Byte 1+: Payload Length logic
if (payload_length < 126) {
  frame[1] = 0x80 | (uint8)payload_length; // 0x80 sets the MASK bit
  frame_len++;
} else if (payload_length <= 65535) {
  frame[1] = 0x80 | 126; // 126 means "the next 2 bytes hold the length"
  frame[2] = (uint8)((payload_length >> 8) & 0xFF);
  frame[3] = (uint8)(payload_length & 0xFF);
  frame_len += 3;
} else {
  frame[1] = 0x80 | 127; // 127 means "the next 8 bytes hold the length"
  for (int i = 0; i < 8; i++)
    frame[2 + i] = (uint8)((payload_length >> (56 - i * 8)) & 0xFF);
  frame_len += 9;
}

// Attach the mask key
memcpy(frame + frame_len, mask_key, 4);
frame_len += 4;
```

To actually send the data, you XOR every payload byte with the mask key, cycling through its 4 bytes:

```c
for (size_t i = 0; i < length; i++)
  data[i] ^= mask_key[i % 4];
```
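
Because XOR undoes itself, the same loop both masks and unmasks. Here it is wrapped in a function so the round trip is easy to try; `ws_apply_mask` is a hypothetical name, not an API from Seobeo.

```c
#include <stddef.h>
#include <stdint.h>

/* XORs `len` bytes of `data` in place with the repeating 4-byte key. */
void ws_apply_mask(uint8_t *data, size_t len, const uint8_t key[4])
{
    for (size_t i = 0; i < len; i++)
        data[i] ^= key[i % 4];
}
```

RFC 6455 has a worked example you can check against: masking `Hello` with the key `37 fa 21 3d` gives the bytes `7f 9f 4d 51 58`, and running the function a second time with the same key restores `Hello`.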

---

## Receiving a Frame (Server Side)

On the server side, we have to do the reverse. We peel the onion layer by layer.

#### 1. Parse the Header

The first two bytes tell us whether this is the final fragment, which opcode the frame carries, whether the payload is masked, and how big the payload is (or that the real length follows).

```c
uint8 *buf = p_conn->p_handle->read_buffer;
uint8 byte1 = buf[0];
uint8 byte2 = buf[1];

boolean fin = (byte1 & 0x80) != 0;     // FIN bit: final fragment of the message
Seobeo_WebSocket_Opcode opcode = (Seobeo_WebSocket_Opcode)(byte1 & 0x0F); // low 4 bits
boolean masked = (byte2 & 0x80) != 0;  // MASK bit: a 4-byte key follows the length
uint64 payload_len = byte2 & 0x7F;     // 0-125, or 126/127 for extended lengths

size_t header_len = 2;
```
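
The snippet above assumes at least two bytes are already in the read buffer. As a rough sketch of the same parsing with that check made explicit, here is a standalone version. The `ws_basic_header` struct and `ws_parse_basic_header` function are hypothetical names for illustration, and standard `stdint.h`/`stdbool.h` types stand in for the `uint8`/`boolean` typedefs:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical container for the fields packed into the first two bytes.
typedef struct {
    bool    fin;     // bit 7 of byte 0: final fragment of a message
    uint8_t opcode;  // low 4 bits of byte 0 (0x1 = text, 0x2 = binary, ...)
    bool    masked;  // bit 7 of byte 1: a 4-byte mask key follows the length
    uint8_t len7;    // low 7 bits of byte 1: 0-125, or the 126/127 markers
} ws_basic_header;

// Returns false if fewer than two bytes are available, so the caller can
// wait for more data instead of reading past the end of the buffer.
static bool ws_parse_basic_header(const uint8_t *buf, size_t buf_len,
                                  ws_basic_header *out)
{
    if (buf_len < 2)
        return false;
    out->fin    = (buf[0] & 0x80) != 0;
    out->opcode = buf[0] & 0x0F;
    out->masked = (buf[1] & 0x80) != 0;
    out->len7   = buf[1] & 0x7F;
    return true;
}
```

For the two bytes `0x81 0x85`, this reports a final text frame (opcode `0x1`) that is masked and carries a 5-byte payload.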

#### 2. Handle Extended Lengths

If the 7-bit length is 126 or 127, the actual size is stored in the next 2 or 8 bytes respectively, in network byte order (big-endian).

```c
if (payload_len == 126) {
  // 16-bit length, most significant byte first
  payload_len = (buf[2] << 8) | buf[3];
  header_len += 2;
} else if (payload_len == 127) {
  // 64-bit length, most significant byte first
  payload_len = 0;
  for (int i = 0; i < 8; i++)
    payload_len = (payload_len << 8) | buf[2 + i];
  header_len += 8;
}
```
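
To make the three length encodings concrete, here is a small self-contained sketch of the same decoding with buffer-size checks added. `ws_decode_length` is a hypothetical name, not a function from the implementation above:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: resolve the real payload length from the 7-bit value
// in byte 1. On success, *header_len is the offset of the first byte after
// the length field (2, 4, or 10). Returns false if the buffer is too short.
static bool ws_decode_length(const uint8_t *buf, size_t buf_len,
                             uint64_t *payload_len, size_t *header_len)
{
    if (buf_len < 2)
        return false;

    uint64_t len = buf[1] & 0x7F;
    size_t hdr = 2;

    if (len == 126) {
        if (buf_len < 4)
            return false;
        len = ((uint64_t)buf[2] << 8) | buf[3];
        hdr += 2;
    } else if (len == 127) {
        if (buf_len < 10)
            return false;
        len = 0;
        for (int i = 0; i < 8; i++)
            len = (len << 8) | buf[2 + i];
        hdr += 8;
    }

    *payload_len = len;
    *header_len = hdr;
    return true;
}
```

A header starting `0x82 0x7E 0x01 0x00` decodes to a 256-byte payload with the length field ending at offset 4, while `0x82 0x7F` followed by `00 00 00 00 00 01 00 00` decodes to 65536 bytes with the length field ending at offset 10.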

#### 3. Unmask the Payload

If the data is masked (and RFC 6455 requires it for every client-to-server frame), we use that 4-byte key to flip the bits back to normal.

```c
uint8 mask_key[4] = {0};
if (masked) {
  // The 4-byte key sits right after the (possibly extended) length field
  memcpy(mask_key, buf + header_len, 4);
  header_len += 4;
}

// Note: production code should first confirm that the buffer holds
// header_len + payload_len bytes and that malloc did not return NULL.
uint8 *payload = malloc(payload_len);
memcpy(payload, buf + header_len, payload_len);

if (masked)
  Seobeo_WebSocket_Unmask_Data(payload, payload_len, mask_key);
```
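
The payload length comes from the peer, so it is worth validating against the number of bytes actually read before copying anything. Here is one possible sketch of that check combined with the unmasking step. `ws_extract_payload` and its parameters are hypothetical and only illustrate the idea:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical helper: copy out (and unmask, if needed) one frame's payload.
// `header_len` is the offset just past the length field, as computed in step 2.
// Returns a malloc'd buffer the caller must free, or NULL if the frame is
// incomplete or the allocation fails.
static uint8_t *ws_extract_payload(const uint8_t *buf, size_t buf_len,
                                   size_t header_len, uint64_t payload_len,
                                   int masked)
{
    size_t key_len = masked ? 4 : 0;

    // Reject frames whose declared size exceeds what was actually read.
    if (header_len > buf_len || key_len > buf_len - header_len)
        return NULL;
    if (payload_len > buf_len - header_len - key_len)
        return NULL;

    const uint8_t *mask_key = buf + header_len;  // only read when masked
    const uint8_t *src = buf + header_len + key_len;

    // malloc(0) may return NULL, so always ask for at least one byte.
    uint8_t *payload = malloc(payload_len ? (size_t)payload_len : 1);
    if (!payload)
        return NULL;

    for (size_t i = 0; i < (size_t)payload_len; i++)
        payload[i] = masked ? (uint8_t)(src[i] ^ mask_key[i % 4]) : src[i];
    return payload;
}
```

Fed the 11-byte masked frame `81 85 37 fa 21 3d 7f 9f 4d 51 58` with `header_len = 2` and `payload_len = 5`, this returns the bytes of `Hello`. If the last byte is missing, it returns `NULL` instead of reading past the buffer.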

---

## Conclusion

That’s it. That is WebSockets in a nutshell. Once you handle the bit-shifting for the length and the XOR masking, you’re just reading and writing to a socket like any other protocol.

You can test my implementation at [this page](https://mrjunejune.com/talk). Open two tabs and talk to yourself!