In the PCI Express (PCIe) Data Link Layer, when a receiver detects a corrupted or out-of-sequence Transaction Layer Packet (TLP), it discards the packet and sends a Negative Acknowledge (Nak) back to the transmitter. This Nak tells the transmitter to stop what it is doing and replay the missing data.
But what happens in the brief period between the receiver sending the Nak and the replayed packets finally arriving? Without a specific mechanism to manage this waiting period, the system could easily collapse into an endless loop of errors.
Enter the NAK_SCHEDULED flag—a critical internal safeguard that forces the receiver to patiently wait for its requested data without clogging the link with redundant complaints. Here is how it works.
The Danger of Endless Nak Loops
To understand why the NAK_SCHEDULED flag is so important, we must look at how high-speed pipelines operate. If a receiver detects that packet 30 is corrupted, it sends a Nak. However, because the transmitter keeps sending without waiting for each packet to be acknowledged, packets 31, 32, and 33 might already be traveling across the physical wire right behind packet 30.
Because PCIe enforces strict in-order delivery, the receiver must reject packets 31, 32, and 33 since packet 30 is missing. If the receiver sent a brand new Nak for every single one of those rejected packets, it would bombard the transmitter with error messages.
Worse yet, the transmitter might begin replaying packet 30, only to receive a new Nak from the receiver (caused by packet 32), which would force the transmitter to abort its current replay and start the replay process all over again. This would create an endless loop where the transmitter is constantly forced to restart its replays, permanently stalling the link.
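To make the problem concrete, here is a minimal sketch of a hypothetical receiver that has no NAK_SCHEDULED flag and sends one Nak per rejected packet. The function name `naive_naks` and the `(seq, crc_ok)` tuple format are assumptions made for illustration only. The sketch relies on two details of the protocol: Sequence Numbers are 12 bits wide, and a Nak reports the Sequence Number of the last good TLP received.

```python
SEQ_MODULO = 4096  # TLP Sequence Numbers are 12 bits wide


def naive_naks(arriving_tlps, expected_seq):
    """Model a receiver WITHOUT a NAK_SCHEDULED flag (hypothetical).

    arriving_tlps: list of (seq, crc_ok) tuples in arrival order.
    Returns the Sequence Number carried by each Nak that would be sent.
    """
    naks = []
    for seq, crc_ok in arriving_tlps:
        if crc_ok and seq == expected_seq:
            # Good, in-order TLP: accept it and advance the expected count.
            expected_seq = (expected_seq + 1) % SEQ_MODULO
        else:
            # Corrupted or out of sequence: discard and complain every time.
            naks.append((expected_seq - 1) % SEQ_MODULO)
    return naks


# Packet 30 is corrupted; 31, 32 and 33 are intact but already in flight.
in_flight = [(30, False), (31, True), (32, True), (33, True)]
print(naive_naks(in_flight, expected_seq=30))  # [29, 29, 29, 29]
```

One corrupted packet produces four identical Naks, and each one that arrives after the replay has begun could restart it.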
The Solution: Edge-Triggered Errors
To prevent this catastrophic loop, the receiver relies on the internal NAK_SCHEDULED flag.
When the receiver detects a bad TLP, it discards the packet and checks the status of the NAK_SCHEDULED flag. If the flag is currently clear, the receiver sets it, which triggers the generation of a single Nak DLLP.
The PCIe specification dictates that scheduling a Nak behaves like an “edge-triggered” event. It is the rising edge—the exact moment the flag transitions from clear to set—that causes the Nak to be scheduled. Once this flag is set, the receiver is strictly forbidden from scheduling any additional Nak DLLPs.
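The edge-triggered behavior can be sketched in a few lines. This is an illustrative model, not code from any real PCIe implementation; the class `NakScheduler` and its method names are hypothetical.

```python
class NakScheduler:
    """Hypothetical model of edge-triggered Nak scheduling."""

    def __init__(self):
        self.nak_scheduled = False  # models the NAK_SCHEDULED flag
        self.naks_scheduled = 0     # how many Nak DLLPs have been scheduled

    def on_bad_tlp(self):
        """Return True only if this bad TLP causes a Nak to be scheduled."""
        if self.nak_scheduled:
            # Flag already set: no rising edge, so no new Nak.
            return False
        # Rising edge (clear -> set): schedule exactly one Nak.
        self.nak_scheduled = True
        self.naks_scheduled += 1
        return True


scheduler = NakScheduler()
# Four bad TLPs in a row yield a single Nak.
print([scheduler.on_bad_tlp() for _ in range(4)])  # [True, False, False, False]
```

Only the first call sees the flag change from clear to set, so only the first call schedules a Nak.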
The Waiting Game
While the NAK_SCHEDULED flag remains active, the receiver enters a stubborn waiting state. During this time:
- Out-of-sequence TLPs are discarded: Even if a TLP arrives intact, the receiver silently drops it unless its Sequence Number matches NEXT_RCV_SEQ, the one packet it is waiting for.
- Total Silence: The receiver absolutely will not schedule any additional Acks or Naks, no matter how many out-of-sequence packets it is forced to drop.
By maintaining this silence, the receiver gives the transmitter the uninterrupted time it needs to clear its pipeline, process the initial Nak, and replay the TLPs held in its Replay Buffer.
Clearing the Flag and Resuming Traffic
Because scheduling a Nak is an edge-triggered event, another Nak can never be sent until the system experiences a falling edge—meaning the NAK_SCHEDULED flag must be cleared first.
There are only two events capable of clearing this flag:
- Successful Delivery: The primary way the flag is cleared is when the receiver finally receives the exact replayed TLP it has been waiting for (a packet whose Sequence Number perfectly matches the receiver’s NEXT_RCV_SEQ expected count).
- A Link Reset: If the system is forced to undergo a hard reset of the link, the flag will also be cleared to start fresh.
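Putting the pieces together, the whole cycle (set the flag, wait, clear on the expected TLP or on reset) can be sketched as a small state machine. The `Receiver` class, its method names, and the returned status strings are all hypothetical. The sketch also simplifies deliberately: it treats every non-matching TLP the same way and does not model Ack scheduling or duplicate handling. Resetting the expected count to 0 in `link_reset` is an assumption of this model.

```python
from dataclasses import dataclass, field

SEQ_MODULO = 4096  # TLP Sequence Numbers are 12 bits wide


@dataclass
class Receiver:
    """Hypothetical, simplified model of the receiver's Nak handling."""
    next_rcv_seq: int = 0          # models NEXT_RCV_SEQ
    nak_scheduled: bool = False    # models NAK_SCHEDULED
    accepted: list = field(default_factory=list)
    naks: list = field(default_factory=list)

    def receive(self, seq, crc_ok=True):
        if crc_ok and seq == self.next_rcv_seq:
            # The expected TLP arrived: accept it and clear the flag
            # (the falling edge that re-arms Nak scheduling).
            self.accepted.append(seq)
            self.next_rcv_seq = (self.next_rcv_seq + 1) % SEQ_MODULO
            self.nak_scheduled = False
            return "accepted"
        if not self.nak_scheduled:
            # Rising edge: discard and schedule exactly one Nak, which
            # reports the last good Sequence Number.
            self.nak_scheduled = True
            self.naks.append((self.next_rcv_seq - 1) % SEQ_MODULO)
            return "discarded, Nak scheduled"
        # Flag already set: discard without scheduling anything.
        return "discarded silently"

    def link_reset(self):
        # Assumption of this sketch: a reset restarts the count and clears the flag.
        self.next_rcv_seq = 0
        self.nak_scheduled = False


rx = Receiver(next_rcv_seq=30)
print(rx.receive(30, crc_ok=False))  # discarded, Nak scheduled
print(rx.receive(31))                # discarded silently
print(rx.receive(32))                # discarded silently
print(rx.receive(30))                # accepted (the replay arrives, flag clears)
print(rx.receive(31))                # accepted
print(rx.naks)                       # [29]
```

In this run the corrupted packet 30 and the two packets behind it cost a single Nak, and normal traffic resumes as soon as the replayed packet 30 is accepted.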
Summary
The NAK_SCHEDULED flag is a brilliant mechanism for managing the chaos of high-speed error recovery. By acting as a strict toggle that allows only one Nak to be sent per error event, it prevents a barrage of redundant error messages and ensures the system never gets trapped in an endless loop of restarted replays.
