[Day 4] Why ls Doesn't Know Where Its Output Goes


After typing ls > out.txt hundreds of times, a strange question hit me:

Does ls somehow detect that its output is being redirected? Does it have some logic like "oh, output is going to a file now, let me switch modes"?

I spent my afternoon digging into this, and the answer turned out to be surprisingly deep. Here's what I found.


1. The Question Nobody Asks

When you run these two commands:

ls                  # prints to screen
ls > out.txt        # writes to file

ls behaves the same from the outside. But what's happening inside? Is ls checking some flag? Running different code paths?

The answer is no. And understanding why leads straight into how Unix really works.


2. File Descriptors: Not What I Thought

I used to vaguely think of stdin, stdout, stderr as some kind of memory buffers. They're not.

A file descriptor is just an integer — an index into the process's file descriptor table. Each process has its own table that maps these numbers to actual resources (files, terminals, pipes, etc.):

[My process's FD table]
 fd  │  points to
─────┼──────────────
  0  →  terminal (keyboard input)
  1  →  terminal (screen output)
  2  →  terminal (error output)
  3  →  /home/user/log.txt

Key insight: fds belong to the process, not to the files or pipes themselves. When a child process closes its fd 3, the parent's fd 3 is unaffected — they're separate tables.
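To convince myself of this, I wrote a small sketch using Python's os module, which exposes the same calls (open, fork, close, write) as thin wrappers. The function name and the log.txt file name are my own placeholders, not anything the shell uses.

```python
import os
import tempfile

def child_close_does_not_affect_parent():
    """Open a file, fork, close the fd in the child only, then use it in the parent."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log.txt")      # hypothetical file name
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)  # fd is a plain int

        pid = os.fork()
        if pid == 0:
            # Child: this removes the entry from the child's table only.
            os.close(fd)
            os._exit(0)

        os.waitpid(pid, 0)
        # Parent: the same integer is still a valid entry in the parent's table.
        written = os.write(fd, b"still open\n")
        os.close(fd)
        with open(path, "rb") as f:
            return fd, written, f.read()
```

If the tables were shared, the parent's write would fail with a "bad file descriptor" error. It doesn't.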


3. The fork/exec Dance

Here's what actually happens when you run ls > out.txt:

  1. Shell calls fork() — a child process is cloned from the parent shell. The child inherits a copy of the shell's fd table.

  2. The child sets up the redirection:

    • (a) opens out.txt with open(), which returns a new fd (say, fd 3)

    • (b) duplicates that fd onto fd 1 with dup2(3, 1) — now fd 1 points to out.txt

    • (c) closes the original fd 3, since it's no longer needed

  3. The child calls one of the exec functions (for example execvp("ls", ...)). The process becomes ls, but the fd table survives the transformation, apart from any fds explicitly marked close-on-exec.

  4. ls writes to fd 1 as it normally would. But fd 1 already points to out.txt, and ls has no idea this change happened.

  5. When ls exits, the shell reaps the child with wait().

The crucial detail: the file is opened and wired to fd 1 BEFORE exec runs.

Why? Because after exec, there's no one left to set it up. ls is just a normal program — it doesn't contain any logic like "check if my output should go to a file." The only window to wire up fd 1 is the tiny moment between fork and exec, when the child still has shell-like control over its own process state.
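The five steps above can be sketched in a few lines. This is a minimal illustration in Python, not how any real shell is written: run_redirected is a name I made up, and real shells handle many more cases (errors, signals, append mode, and so on).

```python
import os
import sys

def run_redirected(argv, out_path):
    """Run argv with fd 1 pointing at out_path, roughly like `argv > out_path`."""
    sys.stdout.flush()                  # don't let the child inherit buffered output
    pid = os.fork()                     # step 1: clone the process
    if pid == 0:
        # Child: still running our code, so this is the window to rewire fd 1.
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)              # step 2b: fd 1 now refers to out_path
            os.close(fd)                # step 2c: the extra entry is not needed
            os.execvp(argv[0], argv)    # step 3: become the program, fd table kept
        finally:
            os._exit(127)               # only reached if something above failed
    _, status = os.waitpid(pid, 0)      # step 5: the parent reaps the child
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Calling run_redirected(["ls"], "out.txt") should behave much like ls > out.txt. Note that nothing in the function touches ls itself.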


4. So Back to the Original Question

Does ls know its output is going to a file?

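No, not in any way that matters for redirection. ls just calls write(1, ...): it writes to fd 1, and that call is identical whether fd 1 points to a terminal, a file, or a pipe. A program can ask what fd 1 is, and GNU ls does call isatty(1) to choose its column layout and coloring, but nothing in ls opens out.txt or routes output to it.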

The redirection happened in the child process, before ls even started running. By the time ls is executing, fd 1 already points to out.txt, and ls runs exactly the same writing code as always.

This is the whole point of Unix I/O design: programs don't need to know where their I/O goes. That's what makes them composable.
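Here is a small sketch of that idea, assuming a toy "program" called emit (my own name) that only ever writes to fd 1. The same function runs unchanged with fd 1 wired to a file and then to a pipe.

```python
import os

def emit():
    """The 'program': writes to fd 1 and never looks at what fd 1 is."""
    os.write(1, b"hello\n")

def run_with_stdout(fd):
    """Fork, point the child's fd 1 at fd, run emit() there, and reap the child."""
    pid = os.fork()
    if pid == 0:
        try:
            os.dup2(fd, 1)
            emit()
        finally:
            os._exit(0)
    os.waitpid(pid, 0)

def demo(path):
    # Destination 1: a regular file.
    file_fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    run_with_stdout(file_fd)
    os.close(file_fd)
    with open(path, "rb") as f:
        from_file = f.read()

    # Destination 2: a pipe, read back by the parent.
    r, w = os.pipe()
    run_with_stdout(w)
    os.close(w)                 # drop the parent's write end
    from_pipe = os.read(r, 100)
    os.close(r)
    return from_file, from_pipe
```

Both destinations receive the same bytes, and emit never changes.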


5. Pipes Are the Same Idea, Scaled Up

Once you see redirection this way, pipes stop feeling magical.

When you run ps aux | grep bash:

  1. The shell calls pipe() — the kernel creates a pipe object with two ends: a write end and a read end. The shell gets two fds pointing to them.

  2. The shell forks twice — creating Child A (future ps) and Child B (future grep). Both children inherit copies of the shell's fd table, so both have access to both ends of the pipe.

  3. Child A wires its fd 1 to the pipe's write end using dup2, closes the ends it doesn't need, then execs ps.

  4. Child B wires its fd 0 to the pipe's read end using dup2, closes the ends it doesn't need, then execs grep.

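  5. Data flows: ps writes to fd 1 → kernel pipe buffer → grep reads from fd 0. When ps exits and every other copy of the write end has been closed (including the shell's own copy, which the shell closes after forking), grep's read returns EOF and grep finishes too.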
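The EOF rule in step 5 is easy to check in isolation. In this sketch (the function name is mine), a single pipe gets two fds for its write end, and the reader only sees end-of-file after both are closed. I put the read end in non-blocking mode so that "no data yet" raises an error instead of hanging.

```python
import os

def eof_needs_all_write_ends_closed():
    r, w = os.pipe()
    w2 = os.dup(w)                 # a second fd for the same write end
    os.write(w, b"x")
    os.close(w)                    # one write end closed, w2 still open

    os.set_blocking(r, False)      # an empty pipe now raises instead of hanging
    first = os.read(r, 1)          # the buffered byte
    try:
        os.read(r, 1)
        saw_eof_early = True
    except BlockingIOError:
        saw_eof_early = False      # no data, but not EOF: w2 is still open

    os.close(w2)                   # last write end closed
    last = os.read(r, 1)           # empty bytes object means end of file
    os.close(r)
    return first, saw_eof_early, last
```

This is why a forgotten write end anywhere (in the shell or in a sibling) leaves the reader waiting forever.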

Why does the shell create the pipe, not the children?

Because fork() only creates a parent-child relationship. The two children have no knowledge of each other — they're siblings, not connected directly. So any shared resource between them must be created by their common ancestor (the shell) before they're born, so both children can inherit it through fork.

This is a general principle in Unix: siblings communicate through resources their parent prepared for them.
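Putting the pipe steps together, a two-command pipeline might look like the sketch below. The names spawn and pipe_two are hypothetical helpers of mine. One Python-specific caveat: fds from os.pipe() are close-on-exec by default, so the explicit closes in the child are only there to mirror what a C shell would have to do.

```python
import os
import sys

def spawn(argv, stdin_fd, stdout_fd, close_fds):
    """Fork a child, wire its fd 0 / fd 1, close leftover pipe fds, then exec argv."""
    pid = os.fork()
    if pid == 0:
        try:
            if stdin_fd is not None:
                os.dup2(stdin_fd, 0)
            if stdout_fd is not None:
                os.dup2(stdout_fd, 1)
            for fd in close_fds:
                os.close(fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)                  # only reached if something above failed
    return pid

def pipe_two(left, right):
    """Roughly `left | right`; returns the exit code of the right-hand command."""
    sys.stdout.flush()
    r, w = os.pipe()                       # created before either fork
    a = spawn(left, None, w, (r, w))       # Child A: fd 1 -> write end
    b = spawn(right, r, None, (r, w))      # Child B: fd 0 -> read end
    os.close(r)                            # the parent drops its own copies,
    os.close(w)                            # otherwise Child B never sees EOF
    os.waitpid(a, 0)
    _, status = os.waitpid(b, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

With this, pipe_two(["ps", "aux"], ["grep", "bash"]) should behave much like the shell version. Neither child ever learns the other exists.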


6. The Takeaway That Changes Everything

Neither ls, ps, nor grep knows anything about redirection or pipes. They just read from fd 0 and write to fd 1.

The environment (fd table) is what gets rewired — by the shell, before exec. This is why you can write:

cat access.log | grep "404" | wc -l > count.txt

...and chain three programs together that were never designed to know about each other. Each program is dumb. The shell is the conductor.
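The same wiring generalizes to any number of stages plus a final redirect. Below is a rough sketch with a made-up run_pipeline helper. It leans on a Python detail: fds from os.pipe() and os.open() are close-on-exec by default, so leftover copies disappear at exec. In C, each child would have to close them by hand.

```python
import os
import sys

def run_pipeline(commands, out_path):
    """Roughly `c1 | c2 | ... | cN > out_path`; returns the last command's exit code."""
    sys.stdout.flush()
    pids = []
    prev_read = None                        # read end feeding the next stage
    for i, argv in enumerate(commands):
        if i == len(commands) - 1:
            # Last stage: fd 1 goes to the output file.
            out_fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            next_read = None
        else:
            # Other stages: fd 1 goes to a fresh pipe.
            next_read, out_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            try:
                if prev_read is not None:
                    os.dup2(prev_read, 0)
                os.dup2(out_fd, 1)
                os.execvp(argv[0], argv)    # leftover fds are close-on-exec in Python
            finally:
                os._exit(127)               # only reached if something above failed
        pids.append(pid)
        # Parent: drop the fds that this stage now owns.
        os.close(out_fd)
        if prev_read is not None:
            os.close(prev_read)
        prev_read = next_read
    status = 0
    for pid in pids:
        _, status = os.waitpid(pid, 0)      # the last child's status is kept
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

For the command above, the call would be run_pipeline([["cat", "access.log"], ["grep", "404"], ["wc", "-l"]], "count.txt").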

This principle will show up again — in containers, in network sockets, in systemd services. Programs don't know their environment. That's a feature, not a bug.


Quick Reference

 Concept          │  What it is
──────────────────┼──────────────────────────────────────────────────────────────────
 File descriptor  │  An integer index into the process's fd table
 fork()           │  Copies the parent process, including its fd table
 dup2(src, dst)   │  Makes dst point to the same thing as src
 exec()           │  Replaces the process's code, but keeps the fd table
 Pipe             │  A kernel object with a write end and a read end
 EOF on a pipe    │  read() on the read end returns 0 once ALL write ends are closed

Order always matters: fork → set up fds (open, dup2, close) → exec. Never the other way around.


Tomorrow: top, jobs/bg/fg, and process control — how to watch and steer running processes.
