R 4.2 : Inside the Transmitter: The PCIe Replay (Retry) Buffer Explained

In the PCI Express (PCIe) Data Link Layer, the Ack/Nak protocol depends on the transmitter's ability to re-send any packet that arrives corrupted at the receiver. The hardware component that makes this safety net possible is the Replay Buffer (which the official PCIe specification refers to as the Retry Buffer).

Before a device transmits a Transaction Layer Packet (TLP) across the physical link, it must prepare for the worst by keeping a backup. Here is an inside look at how the Replay Buffer stores these packets and the careful sizing considerations hardware engineers must make when designing it.

Storing the Complete TLP

Before a TLP is transmitted, a complete copy is placed into the Replay Buffer in the exact order of transmission.

The buffer does not just hold the data payload; each entry stores the entire packet exactly as it was formatted for the wire. This includes:

  • The 12-bit Sequence Number (occupying 2 bytes).
  • The TLP Header (up to 16 bytes).
  • The optional Data Payload (which can be up to 4KB).
  • The optional ECRC (4 bytes).
  • The calculated 32-bit LCRC (4 bytes).
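
As a rough illustration of what one entry holds, the sketch below models a buffer entry in Python and adds up the field sizes listed above. The class name `ReplayEntry` and its fields are hypothetical names chosen for this example, not anything defined by the specification, and a real design would hold these bytes in hardware memory rather than in objects.

```python
from dataclasses import dataclass

SEQ_BYTES = 2    # 12-bit Sequence Number, stored in 2 bytes
ECRC_BYTES = 4   # optional ECRC
LCRC_BYTES = 4   # 32-bit LCRC


@dataclass
class ReplayEntry:
    """Hypothetical model of one stored TLP (illustration only)."""
    seq_num: int           # 0..4095
    header: bytes          # TLP header, up to 16 bytes
    payload: bytes = b""   # optional data payload, up to 4096 bytes
    has_ecrc: bool = False

    def stored_size(self) -> int:
        """Bytes this entry occupies in the buffer."""
        size = SEQ_BYTES + len(self.header) + len(self.payload) + LCRC_BYTES
        if self.has_ecrc:
            size += ECRC_BYTES
        return size


# Largest entry: 16-byte header, 4KB payload, ECRC present.
largest = ReplayEntry(seq_num=0, header=bytes(16), payload=bytes(4096), has_ecrc=True)
# Small entry: 12-byte header, no payload, no ECRC.
smallest = ReplayEntry(seq_num=1, header=bytes(12))

print(largest.stored_size())   # 4122
print(smallest.stored_size())  # 18
```

The gap between the two results shows why sizing the buffer in bytes rather than in "number of packets" matters: a single maximum-size TLP needs as much room as a few hundred small ones.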

While the PCIe specification describes this specific buffer architecture, it actually leaves the exact internal implementation up to the designer, so long as the device is capable of replaying a sequence of TLPs when required.

Sizing Considerations: Balancing Efficiency and Cost

Interestingly, the PCIe specification deliberately chooses not to mandate a specific size for the Replay Buffer. Instead, determining the optimal size is left as an engineering challenge.

Designers must strike a delicate balance between link bandwidth efficiency and hardware costs:

  • The Risk of Being Too Small: If the buffer is too small, it will quickly fill up with unacknowledged packets. When full, it will stall new TLPs arriving from the Transaction Layer, throttling system performance.
  • The Risk of Being Too Large: Memory takes up valuable physical space on the silicon. Making the buffer excessively large drives up manufacturing costs unnecessarily.

To find the “sweet spot” for buffer size, hardware designers must account for three critical delays:

  1. Ack DLLP Latency: The amount of time it takes for the receiving device to process the TLPs and return an Ack DLLP.
  2. Physical Link Delays: The transit time required for the packet to cross the physical wire.
  3. L0s Exit Latency: The buffer should be large enough to hold outgoing TLPs without stalling the system while the link wakes up from the L0s low-power state and transitions back to the active L0 state.
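
One way to reason about the three delays above is as a bandwidth-delay estimate: the buffer has to hold every byte the transmitter can send before the first Ack comes back. The sketch below is a simplified model under that assumption, not a formula from the specification. The function name, its parameters, and every number in the usage example are hypothetical placeholders; real values depend on the link speed, width, and the devices involved.

```python
import math


def estimate_replay_buffer_bytes(link_bytes_per_ns, ack_latency_ns,
                                 one_way_delay_ns, l0s_exit_ns, max_tlp_bytes):
    """Rough estimate of replay buffer size (simplified, assumed model)."""
    # A TLP crosses the wire once and its Ack crosses back once,
    # so the physical delay is counted twice.
    wait_ns = ack_latency_ns + 2 * one_way_delay_ns + l0s_exit_ns
    # Bytes the transmitter can emit while waiting for the first Ack.
    bytes_in_flight = link_bytes_per_ns * wait_ns
    # Round up to a whole number of maximum-size TLPs.
    entries = math.ceil(bytes_in_flight / max_tlp_bytes)
    return entries * max_tlp_bytes


# Placeholder numbers for illustration only.
size = estimate_replay_buffer_bytes(
    link_bytes_per_ns=2.0,   # assumed usable link throughput
    ack_latency_ns=300,      # assumed receiver processing + Ack scheduling
    one_way_delay_ns=10,     # assumed wire transit time
    l0s_exit_ns=200,         # assumed L0s exit latency
    max_tlp_bytes=282,       # 256-byte payload + 26 bytes of overhead
)
print(size)  # 1128
```

Shrinking any of the delay terms shrinks the estimate, which is the trade-off described above: a buffer below this figure risks stalling the Transaction Layer, while one far above it spends silicon on space that is rarely used.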

Buffer Management: Purging and Replaying

The Replay Buffer is a highly dynamic environment, constantly filling and emptying in response to the Ack and Nak DLLPs arriving from the receiver at the other end of the link.

  • When an Ack Arrives: If an Ack is received, the transmitter purges all TLPs from the buffer that have a Sequence Number equal to or earlier than the one specified in the Ack. Because the receiver accepts TLPs only in sequence order, an Ack for one Sequence Number implies that every earlier TLP also arrived intact. A single Ack can therefore act as a bulk receipt, purging several successful TLPs from the buffer at once to free up space.
  • When a Nak Arrives: A Nak indicates a transmission error, and its Sequence Number identifies the last TLP that was received correctly. The transmitter still purges the successfully delivered TLPs up to and including that Sequence Number. However, it must then stop accepting new TLPs from the Transaction Layer and replay (re-send) all TLPs remaining in the buffer, in their original order, starting with the packet immediately following the Sequence Number in the Nak.
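
The purge and replay rules above can be sketched in a few lines of Python. This is a minimal software model for illustration, not how any particular device implements it: the `ReplayBuffer` class, its method names, and the choice to measure capacity in TLPs rather than bytes are all assumptions made for this example. It does model the 12-bit Sequence Number wrapping from 4095 back to 0.

```python
from collections import deque

SEQ_MOD = 4096  # 12-bit Sequence Numbers wrap around


class ReplayBuffer:
    """Hypothetical software model of a replay buffer (illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed unit: number of stored TLPs
        self.entries = deque()    # (seq_num, tlp) in transmit order
        self.next_seq = 0

    def send(self, tlp):
        """Store a copy; return its Sequence Number, or None if full (stall)."""
        if len(self.entries) >= self.capacity:
            return None
        seq = self.next_seq
        self.entries.append((seq, tlp))
        self.next_seq = (seq + 1) % SEQ_MOD
        return seq

    def _purge_through(self, seq):
        """Drop every entry up to and including seq; return how many."""
        if not self.entries:
            return 0
        oldest = self.entries[0][0]
        count = (seq - oldest) % SEQ_MOD + 1  # wrap-safe distance
        if count > len(self.entries):
            return 0  # seq is not in the buffer (e.g. a duplicate Ack)
        for _ in range(count):
            self.entries.popleft()
        return count

    def on_ack(self, seq):
        """Ack: purge everything at or before seq."""
        return self._purge_through(seq)

    def on_nak(self, seq):
        """Nak: purge delivered TLPs, then return the rest for replay.

        Replayed TLPs stay in the buffer until a later Ack covers them.
        """
        self._purge_through(seq)
        return [tlp for _, tlp in self.entries]


buf = ReplayBuffer(capacity=4)
for name in ("A", "B", "C", "D"):
    buf.send(name)            # assigned Sequence Numbers 0..3
print(buf.send("E"))          # None: buffer full, new TLP stalls
print(buf.on_ack(1))          # 2: one Ack purges A and B together
print(buf.on_nak(2))          # ['D']: C is purged, D must be replayed
```

Keeping the entries in a queue ordered by transmission makes both operations simple: an Ack only ever removes from the front, and a replay is just a walk over whatever is left.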

Summary

The Replay Buffer is the vital holding pen for PCIe data in transit. By carefully sizing it to cover link delays without overspending on silicon, designers ensure the transmitter can recover from transient link errors by replaying the affected TLPs rather than losing them.
