Flow Control

Flow control prevents a fast sender from overwhelming a slow receiver. Without it, a server could blast data faster than your application can process it, leading to lost data and wasted retransmissions.

The Problem

Consider a file download:

Fast Server                           Slow Client
(100 Mbps)                            (processes 1 MB/s)

   │──── 1 MB ────────────────────────>│ Buffer: [1MB]
   │──── 1 MB ────────────────────────>│ Buffer: [2MB]
   │──── 1 MB ────────────────────────>│ Buffer: [3MB]
   │──── 1 MB ────────────────────────>│ Buffer: [4MB] ← FULL!
   │──── 1 MB ────────────────────────>│ Buffer: OVERFLOW!
   │                                   │
   │  Data lost! Must retransmit.      │
   │  Waste of bandwidth.              │

Without flow control, fast senders can:
- Overflow receiver buffers
- Cause packet loss
- Trigger unnecessary retransmissions

The Sliding Window

TCP uses a sliding window mechanism for flow control. The receiver advertises how much buffer space is available, and the sender limits itself accordingly.

Receive Window (rwnd)

Receiver's perspective:

Receive Buffer (size: 65535 bytes)
┌─────────────────────────────────────────────────────────────┐
│ Data waiting │ Application    │      Available Space        │
│ to be read   │ reading...     │       (Window)              │
│    (ACKed)   │                │                             │
├──────────────┼────────────────┼─────────────────────────────┤
│    10000     │    (consuming) │         55535               │
└──────────────┴────────────────┴─────────────────────────────┘

Window advertised to sender: 55535 bytes
"I can accept 55535 more bytes"
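
The advertised value is just the free space left in the receive buffer. Here is a minimal sketch using a hypothetical helper, `advertised_window`, with the numbers from the diagram above:

```python
def advertised_window(buffer_size: int, unread_bytes: int) -> int:
    """Return the free space a receiver can advertise, never below zero."""
    return max(0, buffer_size - unread_bytes)


# Numbers from the diagram: 65535-byte buffer, 10000 bytes not yet read.
print(advertised_window(65535, 10000))  # 55535
```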

Sender’s View

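The sender tracks three pointers that split the byte stream into four regions: the oldest unacknowledged byte, the next byte to send, and the right edge of the window. The advertised window (rwnd) starts at the oldest unacknowledged byte, so it covers the data in flight as well as the data that can still be sent:

Sent & ACKed │ Sent, waiting for ACK │    Can send     │ Cannot send
             │      (in flight)      │ (usable window) │ (beyond window)
─────────────┴───────────────────────┴─────────────────┴────────────────
    0-999          1000-4999              5000-9999         10000+

The "window" slides forward as ACKs arrive:

Before ACK:
[=====Sent=====][=====In Flight=====][===Can Send===][  Cannot  ]
                                     └─usable=5000──┘

After ACK (receiver consumed data):
        [===Sent===][===In Flight===][=====Can Send=====][Cannot]
                                     └───usable=8000────┘

The window "slides" right as data is acknowledged and grows as the application frees buffer space.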
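
The amount a sender may still transmit is the advertised window minus whatever is already in flight. The sketch below uses a hypothetical helper, `usable_window`, whose parameter names are borrowed from the SND.UNA and SND.NXT variables in the TCP specification. It ignores sequence number wraparound to keep the arithmetic readable.

```python
def usable_window(snd_una: int, snd_nxt: int, rwnd: int) -> int:
    """Bytes the sender may still transmit without overrunning the receiver.

    snd_una: sequence number of the oldest unacknowledged byte
    snd_nxt: sequence number of the next byte to be sent
    rwnd:    window most recently advertised by the receiver
    """
    in_flight = snd_nxt - snd_una
    return max(0, rwnd - in_flight)


# 4000 bytes in flight against a 9000-byte window leaves 5000 bytes.
print(usable_window(snd_una=1000, snd_nxt=5000, rwnd=9000))  # 5000
```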

Window Flow

Let’s trace a file transfer with flow control:

Sender                                              Receiver
   │        rwnd = 4000                                │
   │                                                   │
   │──── Seq=1000, 1000 bytes ────────────────────────>│
   │──── Seq=2000, 1000 bytes ────────────────────────>│
   │──── Seq=3000, 1000 bytes ────────────────────────>│
   │──── Seq=4000, 1000 bytes ────────────────────────>│
   │                                                   │
   │    (Sender has sent rwnd bytes, must wait)        │
   │                                                   │
   │<──── ACK=5000, Win=2000 ──────────────────────────│
   │      (App read 2000 bytes, 2000 space freed)      │
   │                                                   │
   │──── Seq=5000, 1000 bytes ────────────────────────>│
   │──── Seq=6000, 1000 bytes ────────────────────────>│
   │                                                   │
   │    (Window full again, wait)                      │
   │                                                   │
   │<──── ACK=7000, Win=4000 ──────────────────────────│
   │      (App caught up, more space)                  │
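
The sender's decision at each step of this trace can be sketched in a few lines. `send_burst` is a hypothetical helper, and the fixed 1000-byte segment size is taken from the trace rather than from any real stack:

```python
def send_burst(next_seq: int, last_ack: int, rwnd: int, mss: int = 1000) -> list[int]:
    """Return the sequence numbers of the segments the sender may send now."""
    segments = []
    # Keep sending while a full segment still fits inside the window.
    while next_seq + mss <= last_ack + rwnd:
        segments.append(next_seq)
        next_seq += mss
    return segments


# Start of the trace: nothing acknowledged yet, rwnd = 4000.
print(send_burst(next_seq=1000, last_ack=1000, rwnd=4000))  # [1000, 2000, 3000, 4000]
# After ACK=5000, Win=2000.
print(send_burst(next_seq=5000, last_ack=5000, rwnd=2000))  # [5000, 6000]
```

The right edge of the window is `last_ack + rwnd`, so an ACK can open the window either by acknowledging data or by advertising more space.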

Window Size and Throughput

The window limits throughput based on latency:

Maximum throughput = Window Size / Round Trip Time

Example 1: Window=65535, RTT=10ms
  Throughput ≤ 65535 / 0.010 = 6.5 MB/s

Example 2: Window=65535, RTT=100ms
  Throughput ≤ 65535 / 0.100 = 655 KB/s

This is why window scaling matters for high-latency links!
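
To try other combinations, the formula translates directly into code. `max_throughput` is a hypothetical helper name. The result is an upper bound for a single connection, not a prediction of real-world speed:

```python
def max_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput in bytes per second for one connection."""
    return window_bytes / rtt_seconds


for rtt_ms in (10, 100):
    rate = max_throughput(65535, rtt_ms / 1000)
    print(f"RTT={rtt_ms} ms: at most {rate / 1e6:.2f} MB/s")
```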

Bandwidth-Delay Product (BDP)

For optimal throughput, the window should be at least as large as the BDP:

BDP = Bandwidth × RTT

Example: 100 Mbps link, 50ms RTT
  BDP = 100,000,000 bits/s × 0.050 s
      = 5,000,000 bits = 625,000 bytes

Need window ≥ 625 KB to fully utilize the link!
The standard 65535-byte window is far too small.
Window scaling is essential: 65535 × 2^4 ≈ 1 MB
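
The same arithmetic can be sketched in code, including a search for the smallest scale shift that covers the BDP. Both function names are hypothetical. The upper limit of 14 on the shift comes from the TCP window scale option.

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product in bytes."""
    return bandwidth_bps * rtt_seconds / 8


def min_scale_shift(target_window: float, max_unscaled: int = 65535) -> int:
    """Smallest window scale shift (0-14) whose maximum window covers the target."""
    for shift in range(15):
        if max_unscaled << shift >= target_window:
            return shift
    raise ValueError("target exceeds the largest scalable window")


bdp = bdp_bytes(100_000_000, 0.050)
print(bdp)                   # 625000.0
print(min_scale_shift(bdp))  # 4
```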

Window Scaling

Window scaling multiplies the 16-bit window field:

Without scaling:
  Max window = 65535 bytes
  On 100Mbps, 50ms link: 65535/0.050 = 1.3 MB/s (10% utilization)

With scale shift 7 (factor 2^7 = 128):
  Max window = 65535 × 128 ≈ 8.4 MB
  On 100Mbps, 50ms link: 8.4M/0.050 ≈ 168 MB/s, far above the 12.5 MB/s
  the link can carry, so the window no longer limits throughput

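Exchanged during the handshake (each side announces its own shift):
  SYN: WScale=7       (client's advertised windows are shifted by 7)
  SYN-ACK: WScale=8   (server's advertised windows are shifted by 8)

Scaling is enabled only if both SYN segments carry the option.
The scale applies to the window field in all subsequent segments;
the window field in the SYN segments themselves is never scaled.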
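
Decoding a scaled window is a single left shift. The sketch below uses a hypothetical helper, `effective_window`, and checks the two limits that the header format and the window scale option impose:

```python
def effective_window(window_field: int, shift: int) -> int:
    """Window in bytes for a 16-bit header field and a scale shift (max 14)."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= 14:
        raise ValueError("shift must be between 0 and 14")
    return window_field << shift


print(effective_window(65535, 0))  # 65535
print(effective_window(65535, 7))  # 8388480
```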

Zero Window

When the receiver’s buffer is full, it advertises window = 0:

Sender                                              Receiver
   │                                                   │
   │<──── ACK=5000, Win=0 ────────────────────────────│
   │      "My buffer is full, stop sending!"          │
   │                                                   │
   │    (Sender stops, starts "persist timer")        │
   │                                                   │
   │──── Window Probe (1 byte) ──────────────────────>│
   │                                                   │
   │<──── ACK=5000, Win=0 ────────────────────────────│
   │      (Still full)                                │
   │                                                   │
   │    (Wait, probe again)                           │
   │                                                   │
   │──── Window Probe (1 byte) ──────────────────────>│
   │                                                   │
   │<──── ACK=5000, Win=4000 ─────────────────────────│
   │      (Space available, resume!)                  │

Persist Timer

The persist timer prevents deadlock when window = 0:

Without persist timer:
  Receiver: Window=0 (buffer full)
  Sender: Stops sending, waits for window update
  Receiver: Window update packet is lost!
  Both sides wait forever → Deadlock

With persist timer:
  Sender periodically probes with 1-byte segments
  Eventually receives window update
  No deadlock possible
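
Implementations usually back off between probes rather than probing at a fixed rate. This rough sketch shows such a schedule. The function name, the 0.5-second starting interval, and the 60-second cap are assumptions for illustration, not values taken from any particular stack.

```python
def probe_schedule(initial: float, maximum: float, probes: int) -> list[float]:
    """Return the waiting time before each window probe, doubling up to a cap."""
    delays = []
    delay = initial
    for _ in range(probes):
        delays.append(delay)
        delay = min(delay * 2, maximum)
    return delays


# Assumed values for illustration: start at 0.5 s, cap at 60 s.
print(probe_schedule(0.5, 60.0, 9))
# [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```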

Silly Window Syndrome

A pathological condition where tiny windows cause inefficiency:

Problem scenario:
  Application reads 1 byte at a time
  Receiver advertises 1-byte window
  Sender sends 1-byte segments (huge overhead!)

1 byte payload + 20 TCP + 20 IP = 41 bytes
Efficiency: 1/41 = 2.4%

This is "Silly Window Syndrome" (SWS)
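
To see how quickly efficiency recovers as segments grow, here is a small sketch. `payload_efficiency` is a hypothetical helper and assumes 40 bytes of headers (20 TCP + 20 IP, no options):

```python
def payload_efficiency(payload_bytes: int, header_bytes: int = 40) -> float:
    """Fraction of each packet that is payload (20-byte TCP + 20-byte IP headers)."""
    return payload_bytes / (payload_bytes + header_bytes)


for size in (1, 100, 1460):
    print(f"{size:>5} bytes payload: {payload_efficiency(size):.1%}")
```

A full 1460-byte segment is more than 97% payload, compared with 2.4% for a single byte.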

Prevention

Receiver side (Clark’s algorithm):

Don't advertise tiny windows.
Once the window has closed, keep advertising Win=0 until the free space reaches either:
  - one MSS, or
  - half the receive buffer

"I have space" → If space < min(MSS, buffer/2), advertise Win=0
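
The receiver-side rule can be sketched as a single comparison. `window_to_advertise` is a hypothetical helper and a simplification: it only decides whether free space is worth announcing, and a real stack must also avoid pulling back a window it has already advertised.

```python
def window_to_advertise(free_space: int, buffer_size: int, mss: int) -> int:
    """Advertise free space only once it is worth a segment; otherwise 0."""
    threshold = min(mss, buffer_size // 2)
    return free_space if free_space >= threshold else 0


print(window_to_advertise(free_space=1, buffer_size=65535, mss=1460))     # 0
print(window_to_advertise(free_space=2000, buffer_size=65535, mss=1460))  # 2000
```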

Sender side (Nagle’s algorithm):

Don't send tiny segments.
If there's unacknowledged data:
  - Buffer small writes
  - Wait for ACK before sending

Can be disabled with TCP_NODELAY socket option
(Important for latency-sensitive apps)
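
The sender-side rule and the socket option can both be sketched briefly. `nagle_should_send` is a hypothetical helper that captures only the core decision. The `TCP_NODELAY` call at the end uses the standard `socket` module.

```python
import socket


def nagle_should_send(pending_bytes: int, mss: int, unacked_bytes: int) -> bool:
    """Nagle's rule: send a full segment at once, a small one only when idle."""
    if pending_bytes == 0:
        return False  # nothing to send
    if pending_bytes >= mss:
        return True   # a full segment never waits
    return unacked_bytes == 0  # small data goes out only if nothing is in flight


print(nagle_should_send(pending_bytes=10, mss=1460, unacked_bytes=0))    # True
print(nagle_should_send(pending_bytes=10, mss=1460, unacked_bytes=500))  # False

# Disabling Nagle's algorithm on a real socket.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # True
sock.close()
```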

Flow Control in Action

Here’s a real-world example captured with tcpdump:

Time    Direction  Seq      ACK      Win    Len
──────────────────────────────────────────────────
0.000   →          1        1        65535  1460   # Send data
0.001   →          1461     1        65535  1460   # More data
0.050   ←          1        2921     32768  0      # ACK, window shrunk
0.051   →          2921     1        65535  1460   # Continue
0.052   →          4381     1        65535  1460
0.100   ←          1        5841     16384  0      # Window shrinking
0.101   →          5841     1        65535  1460
0.150   ←          1        7301     0      0      # ZERO WINDOW!
0.650   →          7301     1        65535  1      # Window probe
0.700   ←          1        7302     8192   0      # Window opened
0.701   →          7302     1        65535  1460   # Resume
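
To find stalls like this in a longer capture, you could scan the receiver's ACKs for zero-window periods. The sketch below hard-codes the receiver rows from the trace above as tuples. The function name and the tuple layout are assumptions for illustration, not a tcpdump output format.

```python
# Receiver-to-sender rows from the trace above: (time, ack, window).
acks = [
    (0.050, 2921, 32768),
    (0.100, 5841, 16384),
    (0.150, 7301, 0),
    (0.700, 7302, 8192),
]


def zero_window_stalls(rows):
    """Return (start, end) times of periods during which the window was zero."""
    stalls = []
    start = None
    for time, _ack, window in rows:
        if window == 0 and start is None:
            start = time          # window just closed
        elif window > 0 and start is not None:
            stalls.append((start, time))  # window reopened
            start = None
    return stalls


print(zero_window_stalls(acks))  # [(0.15, 0.7)]
```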

Tuning Flow Control

Receiver Buffer Size

# Linux - check current buffer sizes
$ sysctl net.core.rmem_default
net.core.rmem_default = 212992

$ sysctl net.core.rmem_max
net.core.rmem_max = 212992

# Increase for high-bandwidth applications
$ sudo sysctl -w net.core.rmem_max=16777216
$ sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
#                                   min  default  max

Application-Level Control

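import socket

# Create a TCP socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Request a 1 MB receive buffer (affects the advertised window).
# Set this before connect() or listen() so it can influence the
# window scale chosen during the handshake.
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1048576)

# Check the actual buffer size: the OS may adjust the request
# (Linux doubles it for bookkeeping and caps it at net.core.rmem_max).
actual = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(f"Receive buffer: {actual} bytes")

s.close()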

Visualizing the Window

Receiver's buffer over time:

Time=0 (empty buffer, large window)
┌────────────────────────────────────────────────────────────┐
│                        Available (Win=64KB)                │
└────────────────────────────────────────────────────────────┘

Time=1 (receiving faster than app reads)
┌───────────────────────────┬────────────────────────────────┐
│    Buffered (32KB)        │      Available (Win=32KB)      │
└───────────────────────────┴────────────────────────────────┘

Time=2 (app not reading, buffer filling)
┌───────────────────────────────────────────┬────────────────┐
│           Buffered (48KB)                 │ Avail(Win=16KB)│
└───────────────────────────────────────────┴────────────────┘

Time=3 (buffer full!)
┌────────────────────────────────────────────────────────────┐
│                    Buffered (64KB) - Win=0!                │
└────────────────────────────────────────────────────────────┘

Time=4 (app reads 32KB)
┌───────────────────────────┬────────────────────────────────┐
│    Buffered (32KB)        │      Available (Win=32KB)      │
└───────────────────────────┴────────────────────────────────┘
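
A picture like this is easy to generate for your own numbers. The sketch below uses a hypothetical helper, `render_buffer`, and replays the five snapshots above with a 64 KB buffer:

```python
def render_buffer(buffered_kb: int, capacity_kb: int = 64, width: int = 32) -> str:
    """Draw the buffer as a bar: '#' for buffered data, '.' for free space."""
    filled = round(width * buffered_kb / capacity_kb)
    window_kb = capacity_kb - buffered_kb
    return f"[{'#' * filled}{'.' * (width - filled)}] Win={window_kb}KB"


for t, buffered in enumerate([0, 32, 48, 64, 32]):
    print(f"Time={t} {render_buffer(buffered)}")
```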

Summary

Flow control ensures receivers aren’t overwhelmed:

Mechanism               Purpose
──────────────────────────────────────────────────────────
Receive Window (rwnd)   Advertises available buffer space
Window Scaling          Enables windows > 65535 bytes
Zero Window             Signals “stop sending”
Persist Timer           Prevents deadlock on zero window
Nagle’s Algorithm       Prevents sending tiny segments
Clark’s Algorithm       Prevents advertising tiny windows

Key formulas:

Max throughput = Window / RTT
BDP = Bandwidth × RTT (optimal window size)

Flow control handles receiver capacity. But what about the network itself? That’s congestion control—our next topic.