[Day 7] The Process Went to Sleep — and the Kernel Kept Working


In Day 6, I watched a single connection arrive at the server and saw two fds show up instead of one. The listening socket (fd 3) stayed at the door, and a new connected socket (fd 4) was created for the guest.

That gave me a mental model. But it also left a quiet assumption I hadn't really tested:

"My server program accepts the connection."

Does it, though? Today I wanted to pull on that thread — what happens when the process is asleep, or dies, or never calls accept()? Who's actually doing the work?

Turns out the answer reshaped how I think about sockets entirely.


1. Starting Where Day 6 Left Off

Back to the same setup as Day 6:

Terminal 1 (server):

$ nc -l 9999

Terminal 3 (client):

$ nc localhost 9999

Terminal 2 (observer):

$ ss -tan | grep 9999
LISTEN  0  1  0.0.0.0:9999       0.0.0.0:*
ESTAB   0  0  127.0.0.1:9999     127.0.0.1:45828
ESTAB   0  0  127.0.0.1:45828    127.0.0.1:9999

Three rows. The first is the listener (server's fd 3). The other two are the two ends of the connection — one from the server's perspective, one from the client's. Same connection, two socket objects, two rows. (Day 6 territory.)

Now the experiment. What if I kill the client with Ctrl+C?


2. The Client Died — but a Socket Remained

Hit Ctrl+C in terminal 3. Then immediately re-check:

$ ss -tan | grep 9999
TIME-WAIT  0  0  127.0.0.1:45828  127.0.0.1:9999

Wait, what?

  • The LISTEN row is gone → because nc -l exits when its client disconnects (nc's default behavior; would need -k for a persistent listener)

  • The server's ESTAB is gone → the server process has exited, and because it closed second, its socket did not linger

  • But one row remains: 127.0.0.1:45828 → 127.0.0.1:9999. And it's in state TIME-WAIT.

Reading the direction: 45828 → 9999. That's the client's perspective. The client — which I just killed with Ctrl+C — somehow has a socket still sitting in the kernel.

The process is dead. The socket is alive. How?


3. Who Owns the Socket?

This is where Day 6's framing came back and bit me.

In Day 6 I wrote that sockets don't exist on disk — they live only in kernel memory as kernel data structures. I said it, but I'd been quietly thinking of sockets as "things my process owns." Because that's how fds feel — they're numbers my process uses, right?

But look at what just happened. The process is gone. The fd is definitely gone (fds die with the process). And yet the socket is still there, still in the kernel's connection table, still counted by ss.

So the socket isn't owned by the process at all. It's owned by the kernel.

The process is just a guest who said "hey kernel, can you manage a socket for me, and let me refer to it by fd number 4?" When the guest leaves, the socket may or may not be cleaned up — that's the kernel's call, not the process's.

fd is a nickname the process uses. The socket itself is the kernel's asset.

This is a small rephrasing of Day 6's "fd = nickname, inode = SSN" insight, but it changes the emotional texture. Ownership isn't with the process. It never was.
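To convince myself that the fd really is just a nickname, here is a minimal sketch (the function name is my own, not from any library). It gives one kernel socket two fds with os.dup(), closes the first, and checks that the socket and its bound port are still reachable through the second. It assumes a Unix-like system and a free loopback port.

```python
import os
import socket


def socket_survives_fd_close():
    """Return the bound port as seen through the first fd and then the second."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))           # port 0: let the kernel pick a free port
    port_before = s.getsockname()[1]

    second_fd = os.dup(s.fileno())     # a second nickname for the same kernel socket
    s.close()                          # throw away the first nickname

    # Wrap the surviving fd; the kernel object behind it never went away.
    s2 = socket.socket(fileno=second_fd)
    try:
        port_after = s2.getsockname()[1]
    finally:
        s2.close()                     # last nickname gone: now the kernel can free it
    return port_before, port_after


if __name__ == "__main__":
    print(socket_survives_fd_close())
```

Closing an fd only drops one reference. The kernel decides when the socket itself goes away.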


4. So Why TIME_WAIT? The 4-way Handshake Tells the Story

Right, but why does the kernel hold on to the socket after the process dies? Why not just clean it up immediately?

This is where TCP's connection teardown matters. Closing a TCP connection takes four messages, not two, and the two sides aren't symmetric.

Let's call the side that closes first A (the client, in our case), and the other side B.

A → B:  FIN           "I'm done sending."
A ← B:  ACK           "Got it."
A ← B:  FIN           "Same, I'm done too."
A → B:  ACK           "Got it. Bye."

Four messages. A's final ACK is the last thing A sends.
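The asymmetry is easier to see as two small state tables. This is only a sketch of the ordinary close described above (it ignores simultaneous close and resets), using the standard TCP state names; the dictionary and function names are mine.

```python
# Each entry maps a state to (event that leaves it, next state).
ACTIVE_CLOSER = {   # side A: closes first
    "ESTABLISHED": ("send FIN", "FIN_WAIT_1"),
    "FIN_WAIT_1": ("recv ACK", "FIN_WAIT_2"),
    "FIN_WAIT_2": ("recv FIN, send ACK", "TIME_WAIT"),
    "TIME_WAIT": ("2 x MSL timer expires", "CLOSED"),
}

PASSIVE_CLOSER = {  # side B: closes second
    "ESTABLISHED": ("recv FIN, send ACK", "CLOSE_WAIT"),
    "CLOSE_WAIT": ("send FIN", "LAST_ACK"),
    "LAST_ACK": ("recv ACK", "CLOSED"),
}


def walk(table, start="ESTABLISHED"):
    """Follow the table from `start` until a state with no exit is reached."""
    path = [start]
    state = start
    while state in table:
        _event, state = table[state]
        path.append(state)
    return path


if __name__ == "__main__":
    print("A:", " -> ".join(walk(ACTIVE_CLOSER)))
    print("B:", " -> ".join(walk(PASSIVE_CLOSER)))
```

Only the side that sends the first FIN passes through TIME_WAIT, which matches the single leftover row on the client side in the ss output earlier.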

Now imagine that last ACK is lost in the network. What happens?

  • B never got the ACK, so B thinks "did my FIN even arrive?" and retransmits the FIN.

  • That retransmitted FIN arrives at A.

  • A needs to be able to respond to it (resend the ACK).

But if A has already destroyed its socket, who's there to answer? The packet hits a ghost — or worse, hits a brand new connection that happened to reuse the same port, and confuses it.

So A's kernel does this:

"I'll hold onto this socket for a while. Just in case B's FIN comes in late, I can still send a final ACK. No user process needs to be involved — this is my job."

That "while" is called TIME_WAIT.


5. Why 2 × MSL Specifically

MSL = Maximum Segment Lifetime. The upper bound on how long a single packet can wander the internet before being dropped. RFC 793 sets it at 2 minutes; Linux uses 60 seconds for the whole TIME_WAIT period, which works out to an effective MSL of 30 seconds.

Why do we wait 2 × MSL in TIME_WAIT?

Worst case:

t=0          A sends final ACK.
t=1 × MSL    (If ACK is lost somewhere in transit, it's expired by now.)
             B eventually gives up and retransmits FIN.
t=2 × MSL    That retransmitted FIN arrives at A at the latest.

One MSL to cover the outgoing ACK's possible lifetime, one MSL for the potential retransmitted FIN coming back. 2 × MSL = round trip upper bound. After that, nothing legitimate from this old connection should ever show up again.

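Linux has a sysctl that is often mistaken for this timeout:

$ cat /proc/sys/net/ipv4/tcp_fin_timeout
60

60 seconds on my system, but that knob actually controls how long an orphaned socket may sit in FIN-WAIT-2. The TIME_WAIT duration itself is a compile-time constant in the kernel (TCP_TIMEWAIT_LEN, 60 seconds) and can't be tuned through /proc. The two values just happen to match. And I can actually watch the countdown: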

$ ss -tan -o | grep TIME-WAIT
TIME-WAIT  0  0  ...  timer:(timewait,45sec,0)
TIME-WAIT  0  0  ...  timer:(timewait,30sec,0)
TIME-WAIT  0  0  ...  timer:(timewait,15sec,0)
TIME-WAIT  0  0  ...  timer:(timewait,288ms,0)

Four TIME_WAIT sockets, each with their own timer ticking down. When the timer hits zero, the kernel finally frees the socket.
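If I wanted to track those countdowns from a script, a small parser would do. This is a sketch that only understands the timer shapes visible in the output above (whole seconds like 45sec, milliseconds like 288ms), plus whole minutes as an assumption on my part. Anything else raises an error rather than guessing. The function name is hypothetical.

```python
import re

# timer:(<name>,<time left>,<retransmit count>)
_TIMER = re.compile(r"timer:\((\w+),([^,]+),(\d+)\)")
# Assumed shapes for <time left>: optional whole minutes, seconds, milliseconds.
_LEFT = re.compile(r"(?:(\d+)min)?(?:(\d+)sec)?(?:(\d+)ms)?")


def parse_timer(ss_line):
    """Return (timer_name, remaining_ms), or None if the line has no timer field."""
    found = _TIMER.search(ss_line)
    if found is None:
        return None
    name, left, _retransmits = found.groups()
    parts = _LEFT.fullmatch(left)
    if parts is None or not any(parts.groups()):
        # A shape this sketch does not know about: refuse instead of guessing.
        raise ValueError(f"unrecognised timer value: {left!r}")
    minutes, seconds, millis = (int(p or 0) for p in parts.groups())
    return name, minutes * 60_000 + seconds * 1_000 + millis


if __name__ == "__main__":
    print(parse_timer("TIME-WAIT  0  0  ...  timer:(timewait,45sec,0)"))
    print(parse_timer("TIME-WAIT  0  0  ...  timer:(timewait,288ms,0)"))
```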


6. Why This Matters in Production

Academic, until you do the math.

Say a service opens 1000 connections per second to a single upstream API (api.example.com:443). No connection pooling. Each connection finishes, each ends up in TIME_WAIT for 60 seconds.

Concurrent TIME_WAIT sockets = 1000 × 60 = 60,000

Is that a problem? Well:

$ cat /proc/sys/net/ipv4/ip_local_port_range
32768  60999

Available ephemeral ports: 60999 - 32768 + 1 = 28,232 (the range is inclusive).

28,232 available ports. 60,000 needed.

Impossible. Before you hit 60K, you'll run out of local ports and start seeing:

cannot assign requested address

A tiny subtlety here: TCP connection uniqueness is actually determined by the 4-tuple (src_ip, src_port, dst_ip, dst_port). So if your destinations vary (different IPs or ports), you can reuse the same local port. But when you're hammering a single destination, only src_port varies, and there are 28,232 of those.

This is why connection pooling isn't just a performance thing. Without it, you physically cannot maintain that connection rate.
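The arithmetic above fits in a few lines. This is a back-of-the-envelope sketch with a function name of my own choosing. It assumes every connection goes to one destination, the client always closes first, and the port range is counted inclusively (one more port than plain subtraction gives).

```python
def time_wait_budget(conns_per_sec, time_wait_secs, port_low, port_high):
    """Return (steady-state TIME_WAIT sockets, usable ports, max sustainable rate)."""
    steady_state = conns_per_sec * time_wait_secs   # sockets parked at any moment
    ports = port_high - port_low + 1                # the range is inclusive
    max_rate = ports / time_wait_secs               # new connections/sec the ports allow
    return steady_state, ports, max_rate


if __name__ == "__main__":
    parked, ports, max_rate = time_wait_budget(1000, 60, 32768, 60999)
    print(f"TIME_WAIT sockets needed: {parked}")
    print(f"ephemeral ports available: {ports}")
    print(f"sustainable rate without pooling: about {max_rate:.0f} conn/s")
```

Read the other way round, the default port range caps a single-destination client at a few hundred new connections per second, well short of the 1000 in the example.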


7. Plot Twist: Python Disassembles accept() for Me

OK, so the process doesn't own the socket. The kernel does. How far does that go?

Let me take it apart with Python, one syscall at a time. Two terminals.

Terminal A (Python REPL):

>>> import socket
>>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

Terminal B (observer):

$ ss -tanp | grep 9999
# ...nothing

socket() alone shows nothing. Of course — no port has been bound yet.

>>> s.bind(('127.0.0.1', 9999))
$ ss -tanp | grep 9999
UNCONN  0  0  127.0.0.1:9999  0.0.0.0:*  users:(("python3",pid=3398559,fd=5))

There it is. State: UNCONN — "I have an address, but I'm not doing anything with it yet." The socket now has a name, but it's not accepting anything.

>>> s.listen(5)
$ ss -tanp | grep 9999
LISTEN  0  5  127.0.0.1:9999  0.0.0.0:*  users:(("python3",pid=3398559,fd=5))

UNCONN → LISTEN. And notice the 5 in Send-Q — that's the argument I passed to listen(5). It's the accept queue capacity. This is the waiting bench Day 6 implied but I didn't see laid bare.


8. The Experiment That Broke My Assumption

Here's the test. I did not call accept() yet. The Python REPL is just sitting at the prompt.

From terminal C, connect as a client:

$ nc localhost 9999

Then terminal B:

$ ss -tanp | grep 9999
LISTEN  1  5  127.0.0.1:9999    0.0.0.0:*           users:(("python3",pid=3398559,fd=5))
ESTAB   1  0  127.0.0.1:9999    127.0.0.1:43904     users:(("python3",pid=3398559,fd=5))
ESTAB   0  0  127.0.0.1:43904   127.0.0.1:9999      users:(("nc",  pid=3401788,fd=3))

My server is still at the Python prompt. Hasn't called accept(). Hasn't executed a single byte of code.

And yet:

  • The 3-way handshake is complete (ESTAB on both sides)

  • A connection exists in the kernel

  • The LISTEN row's Recv-Q went from 0 → 1: one completed connection waiting on the bench

I honestly expected this to fail. I thought "the server hasn't accepted, so the connection shouldn't succeed." That's wrong.

The kernel handles the 3-way handshake on its own. It doesn't need my process to be awake, let alone to have called accept(). When the handshake completes, the kernel stashes the resulting socket in the accept queue (that bench) and just waits.
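The same experiment fits in one single-threaded script, which is a nice proof in itself: if connect() needed the server to be running accept(), this would deadlock. A minimal sketch, assuming loopback is available. The function name is mine.

```python
import socket


def connect_without_accept():
    """Connect to a listener that has not called accept() yet, then pick it up."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # port 0: kernel picks a free port
    listener.listen(5)
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)
    # Returns as soon as the kernel finishes the handshake.
    # Nobody has called accept() at this point.
    client.connect(("127.0.0.1", port))
    client_addr = client.getsockname()

    # Only now does the "server" pick the finished connection off the bench.
    listener.settimeout(5)
    conn, peer = listener.accept()

    conn.close()
    client.close()
    listener.close()
    return client_addr, peer


if __name__ == "__main__":
    print(connect_without_accept())
```

The address accept() hands back is the client's, even though the client finished connecting before accept() was ever called.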

Now call accept():

>>> conn, addr = s.accept()
>>> conn.fileno()
6
$ ss -tanp | grep 9999
LISTEN  0  5  ...
ESTAB   0  0  ...
ESTAB   0  0  ...

The LISTEN row's Recv-Q drops from 1 → 0. The bench is empty again. The socket that was sitting there has been handed to my process as fd 6.

accept() doesn't accept the connection. The kernel already accepted it. accept() is just the process saying "okay, give me the next one off the bench." It's a pickup function, not a handshake function.

That rearranges everything. The function name is a lie — a polite one, but a lie.


9. One More Wake-Up: recv() Is the Same

I left the REPL idle for a minute. Came back to:

ESTAB  25  0   127.0.0.1:9999  127.0.0.1:43904  users:(("python3",...,fd=6))

Recv-Q = 25.

On an ESTAB socket, Recv-Q means something different from on LISTEN:

  • LISTEN's Recv-Q: number of connections on the bench

  • ESTAB's Recv-Q: bytes sitting in the receive buffer that nobody has read

Same column name, completely different meaning. Confusing, but once you know, it's a gift — two diagnostics for the price of one.

So what happened? I'd been typing gibberish into the client nc while I wasn't looking. Those keystrokes were sent over TCP, arrived at my server's kernel, and the kernel put them in the receive buffer. My Python process, meanwhile, had called accept() but not recv(). So the data just piled up.

>>> data = conn.recv(1024)
>>> print(len(data), data)
25 b'\nddd\x1b[27;2;13~\n\ndddd\nddd\n'

Exactly 25 bytes. The \x1b[27;2;13~ in the middle is an ANSI escape sequence from a special key I'd pressed — nc forwards raw bytes, terminal escape codes and all. The server doesn't parse or care; it just gets whatever was in the buffer.

Check the buffer after:

$ ss -tanp | grep 9999
ESTAB  0  0  127.0.0.1:9999  127.0.0.1:43904  ...

Recv-Q back to 0. My recv() call drained it.

Same pattern as accept(). The kernel is the one receiving packets and filling buffers. recv() just transfers bytes from the kernel's buffer into my process's memory.
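Here is a small sketch of that buffering, again single-threaded on loopback. It sends from the client, waits until the kernel reports the server side as readable, peeks at the buffer with MSG_PEEK (which looks without draining), and then drains it with ordinary recv() calls. The function name and payload are my own.

```python
import select
import socket


def buffered_before_recv(payload=b"typed while nobody was reading\n"):
    """Return (bytes seen by a peek, bytes drained by recv) for one payload."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)

    client = socket.create_connection(listener.getsockname(), timeout=5)
    conn, _peer = listener.accept()
    conn.settimeout(5)

    client.sendall(payload)                 # hand the bytes to the kernel and move on

    # Wait until the kernel has put something in conn's receive buffer.
    readable, _, _ = select.select([conn], [], [], 5)
    if not readable:
        raise TimeoutError("no data arrived within 5 seconds")

    peeked = conn.recv(1024, socket.MSG_PEEK)   # look, but leave the bytes queued

    drained = b""
    while len(drained) < len(payload):          # recv() may return partial data
        chunk = conn.recv(1024)
        if not chunk:
            break
        drained += chunk

    for s in (conn, client, listener):
        s.close()
    return peeked, drained


if __name__ == "__main__":
    print(buffered_before_recv())
```

The peek and the real recv() both read from the same kernel buffer. Only the second one empties it, which is the moment Recv-Q drops back to 0.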


10. The Pattern

Two syscalls and one kernel timer, one shape:

Syscall       What the kernel does in the background             What the syscall actually does
accept()      3-way handshake, queues the new socket             Pulls a socket from the queue, assigns an fd
recv()        Receives packets, fills the receive buffer         Copies bytes from the buffer into process memory
(TIME_WAIT)   Keeps the socket alive after FIN, to replay ACK    Nothing (process has no say; it's long gone)

The names make you think the syscall is doing the network work. It isn't. The kernel does the network work. The syscall is a pickup window where the process collects what the kernel has already prepared.

Once you see it this way, a bunch of things reframe:

  • "Why is my server slow?" → Is it slow at picking up, or is the kernel failing to fill the buffer?

  • "Accept queue overflow" → The bench filled up because the process isn't picking up fast enough.

  • "Too many open files" → Too many fds handed out, regardless of whether they're being read from.

The kernel doesn't wait for the process. The process waits for the kernel.


Quick Reference

State transitions of a socket during server lifecycle:

socket()  →  (no state — just an fd pointing to an empty socket object)
bind()    →  UNCONN   (has address, no role yet)
listen()  →  LISTEN   (accepting connections, with a queue)
accept()  →  ESTAB    (an already-established socket pulled from the queue, given an fd)
close()   →  (various teardown states; may include TIME_WAIT for 2 × MSL)

Reading ss:

ss -tan                    # all TCP sockets
ss -ltn                    # listening only
ss -tan -o                 # include TIME_WAIT timer countdown
ss -tanp                   # include process/pid/fd info (needs sudo for other users)

Recv-Q / Send-Q — their meaning depends on state:

State    Recv-Q                                      Send-Q
LISTEN   completed connections waiting for accept    max backlog (listen(N) argument)
ESTAB    unread bytes in receive buffer              bytes in send buffer not yet acknowledged by the peer
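Since the same two columns mean different things per state, a tiny helper can spell it out. This is a sketch that assumes the column order shown in the ss output throughout this post (State, Recv-Q, Send-Q first). The function names are hypothetical.

```python
def parse_ss_row(line):
    """Split one ss row into (state, recv_q, send_q); later columns are ignored."""
    state, recv_q, send_q = line.split()[:3]
    return state, int(recv_q), int(send_q)


def describe_queues(state, recv_q, send_q):
    """Explain Recv-Q and Send-Q according to the socket state."""
    if state == "LISTEN":
        return (f"{recv_q} connection(s) waiting for accept(), "
                f"backlog limit {send_q}")
    if state == "ESTAB":
        return (f"{recv_q} unread byte(s) in the receive buffer, "
                f"{send_q} byte(s) not yet acknowledged by the peer")
    return f"Recv-Q={recv_q}, Send-Q={send_q} (no special reading for {state})"


if __name__ == "__main__":
    for row in (
        "LISTEN  1  5  127.0.0.1:9999    0.0.0.0:*",
        "ESTAB  25  0   127.0.0.1:9999  127.0.0.1:43904",
    ):
        print(describe_queues(*parse_ss_row(row)))
```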

Mental model:

The kernel owns the socket.
The process is a guest with a pickup number (fd).

The process being asleep, or even dead, doesn't stop the kernel
from doing handshakes, receiving packets, or holding sockets in TIME_WAIT.

Syscalls like accept() and recv() aren't "do the network thing."
They're "give me whatever you've already prepared."

TIME_WAIT math in production:

connections/sec × 2×MSL = concurrent TIME_WAIT sockets

If this exceeds ephemeral port range AND you're hammering a single destination,
you'll hit: "cannot assign requested address"

Fix: connection pooling. Not a performance tweak — a physical requirement.
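To make "pooling" concrete, here is a toy comparison against a local echo server. It counts how many TCP connections the server sees for the same number of requests, with and without reuse. It is only a sketch: a real pool handles concurrency, timeouts, and broken connections, and every name here is my own.

```python
import socket
import threading


def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return less than asked for."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf


def echo_server(listener, n_connections, peers):
    """Serve a fixed number of connections one at a time, echoing every byte."""
    for _ in range(n_connections):
        conn, peer = listener.accept()
        peers.append(peer)
        with conn:
            while True:
                data = conn.recv(1024)
                if not data:          # client closed its side
                    break
                conn.sendall(data)


def count_connections(n_requests, pooled):
    """Send n_requests pings; return how many connections the server accepted."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    addr = listener.getsockname()
    peers = []
    n_connections = 1 if pooled else n_requests
    server = threading.Thread(
        target=echo_server, args=(listener, n_connections, peers), daemon=True
    )
    server.start()

    if pooled:
        # One connection, reused for every request.
        with socket.create_connection(addr, timeout=5) as c:
            for _ in range(n_requests):
                c.sendall(b"ping")
                recv_exact(c, 4)
    else:
        # A fresh connection per request; each close leaves a TIME_WAIT behind.
        for _ in range(n_requests):
            with socket.create_connection(addr, timeout=5) as c:
                c.sendall(b"ping")
                recv_exact(c, 4)

    server.join(timeout=5)
    listener.close()
    return len(peers)


if __name__ == "__main__":
    print("pooled:  ", count_connections(10, pooled=True))
    print("unpooled:", count_connections(10, pooled=False))
```

Same ten requests either way. The pooled run uses one local port, and the unpooled run uses ten and leaves each of them in TIME_WAIT.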