PCIe Series — PCIe-04: PCIe Generations Gen 1 to Gen 6 — VLSI Trainers

PCIe Generations — Gen 1 to Gen 6

The bandwidth maths behind every generation, why each speed step chose its encoding scheme, what physical constraints drove each decision, and a deep dive into Gen 6’s PAM4 and flit architecture with worked numbers.

📈 All Six Generations at a Glance

Per-lane bandwidth by generation (each direction):

Generation | Line rate | Encoding | Per-lane bandwidth
Gen 1 | 2.5 GT/s | 8b/10b | 250 MB/s
Gen 2 | 5.0 GT/s | 8b/10b | 500 MB/s
Gen 3 | 8 GT/s | 128b/130b | ~1 GB/s
Gen 4 | 16 GT/s | 128b/130b | ~2 GB/s
Gen 5 | 32 GT/s | 128b/130b | ~4 GB/s
Gen 6 | 64 GT/s | PAM4 + FEC | ~8 GB/s (2× Gen 5)

Key insight: each generation roughly doubles per-lane bandwidth, but the mechanism changes.
Gen 1→2: 2× by doubling the line rate.
Gen 2→3: 2× by raising the line rate only 1.6× (5 → 8 GT/s) and recovering the rest from the more efficient encoding.
Gen 3→4→5: 2× per step by doubling the line rate again.
Gen 5→6: 2× by changing modulation to PAM4, not by raising the symbol rate. NRZ is impractical beyond ~32 GT/s.
Figure 1 — Per-lane bandwidth by generation (each direction). The doubling mechanism changes: Gen 1→2 doubles the line rate, Gen 2→3 combines a 1.6× rate increase with a more efficient encoding, Gen 3–5 double the line rate again, and Gen 5→6 changes modulation to PAM4. The bar chart is not to scale above Gen 5; Gen 6 would be twice the height of Gen 5.

📋 How Bandwidth Is Calculated

The raw line rate is quoted in GT/s (giga-transfers per second): the number of signalling intervals on each lane per second. With NRZ signalling, every transfer carries one line bit. That is not the same as data throughput, because the encoding overhead must be subtracted first.

Bandwidth calculation, step by step

Gen 1: 8b/10b at 2.5 GT/s (x1 lane)
Step 1: Raw line rate = 2.5 GT/s = 2.5 × 10⁹ line bits/sec (NRZ = 1 bit per transfer)
Step 2: 8b/10b → 8 data bits per 10 line bits → efficiency = 80%
Step 3: 2.5 Gb/s × 80% ÷ 8 bits/byte = 250 MB/s per lane (dividing by 8 converts bits to bytes)
x16 link: 250 MB/s × 16 lanes × 2 directions = 8 GB/s aggregate

Gen 3: 128b/130b at 8 GT/s (x1 lane)
Step 1: Raw line rate = 8 GT/s = 8 × 10⁹ bits/sec (NRZ = 1 bit/symbol)
Step 2: 128b/130b → efficiency = 128/130 = 98.46% ≈ 98.5%
Step 3: 8 × 10⁹ × (128/130) ÷ 8 ≈ 985 MB/s ≈ 1 GB/s per lane
Note: Gen 3 at 8 GT/s gives almost the same throughput as a hypothetical extended Gen 2 at 10 GT/s with 8b/10b (10 × 0.8 ÷ 8 = 1 GB/s), but at a lower line rate.

Gen 6: PAM4 at 32 GBaud (x1 lane)
32 GBaud × 2 bits/symbol = 64 GT/s effective = 8 GB/s raw per lane. FEC, CRC and Data Link bytes inside each flit take a few percent of that, so usable throughput is somewhat lower; tables usually quote the rounded 8 GB/s.
Figure 2 — Bandwidth calculation for 8b/10b (Gen 1/2) and 128b/130b (Gen 3–5). For Gen 6, PAM4 contributes 2 bits per symbol at 32 GBaud, giving 64 GT/s effective; per-flit overhead then brings usable throughput slightly below the nominal 8 GB/s per lane.
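The steps above collapse into a single formula. The following is a minimal Python sketch of that arithmetic for NRZ generations; the function name `lane_throughput_mb_s` is made up for illustration and is not taken from any PCIe tool.

```python
def lane_throughput_mb_s(gt_per_s: float, payload_bits: int, line_bits: int) -> float:
    """Payload throughput of one lane, one direction, in MB/s (10^6 bytes/s).

    Assumes NRZ signalling, i.e. one line bit per transfer.
    """
    line_rate_bits = gt_per_s * 1e9                       # Step 1: raw line rate
    payload_rate_bits = line_rate_bits * payload_bits / line_bits  # Step 2: encoding
    return payload_rate_bits / 8 / 1e6                    # Step 3: bits -> bytes -> MB


gen1 = lane_throughput_mb_s(2.5, 8, 10)      # 8b/10b
gen3 = lane_throughput_mb_s(8.0, 128, 130)   # 128b/130b

print(f"Gen 1 x1: {gen1:.0f} MB/s")
print(f"Gen 3 x1: {gen3:.1f} MB/s")
print(f"Gen 1 x16, both directions: {gen1 * 16 * 2 / 1000:.0f} GB/s")
```

Multiplying the x1 result by the lane count (and by 2 for both directions) gives the aggregate figures used in the tables.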

📋 Gen 1 — 2.5 GT/s and 8b/10b

The first PCIe generation launched in 2003. The target was simple: be software-compatible with PCI while beating its bandwidth on a modern serial link.

Why 2.5 GT/s?

The 2.5 GT/s line rate was chosen as the highest speed achievable with 2001-era CMOS process technology and standard PCB materials (FR4) at acceptable signal integrity margins. The transmitter spec required a differential swing of approximately 800 mV peak-to-peak, which is achievable without exotic processes.

Why 8b/10b?

8b/10b encoding was already proven in other serial links such as Fibre Channel and Gigabit Ethernet (SATA and, later, USB 3.0 adopted it as well). It delivers DC balance (essential for AC-coupled links), guaranteed transition density for CDR, and clear packet framing using K-characters. The 20% overhead was an acceptable cost in 2003 when the alternative was a parallel bus with 20× less bandwidth.

Special symbols (K-characters)

8b/10b reserves specific 10-bit patterns as control symbols with no 8-bit equivalent. PCIe uses these for framing and link management:

Symbol | Name | Purpose
K28.5 | COM | Start of an Ordered Set; tells the receiver to start the sync/framing process. Used in TS1, TS2 and SKIP ordered sets.
K27.7 | STP | Start of TLP; marks the beginning of a Transaction Layer Packet on the wire
K28.2 | SDP | Start of DLLP; marks the beginning of a Data Link Layer Packet
K29.7 | END | End of packet; marks the end of a TLP or DLLP
K30.7 | EDB | End Bad; marks the end of a packet that was intentionally nullified by the transmitter
K28.0 | SKP | Skip; used in the SKIP Ordered Set for elastic buffer clock compensation
The SKIP Ordered Set (K28.5 · K28.0 · K28.0 · K28.0) is transmitted every 1180–1538 symbol times. The transmitter and receiver run from independent clocks, each within ±300 ppm of the target frequency, so the receiver’s elastic buffer adds or removes SKP symbols as needed to prevent overflow or underflow.

Gen 1 numbers

Parameter | Value
Line rate | 2.5 GT/s
Encoding | 8b/10b (10-bit symbols, 8 data bits)
Encoding efficiency | 80%
x1 lane throughput (each dir) | 250 MB/s
x16 aggregate (both dirs) | 8 GB/s
Symbol time | 4 ns
SKIP ordered set interval | 1180–1538 symbol times
Clock tolerance | ±300 ppm (max 600 ppm between TX and RX)

📋 Gen 2 — 5.0 GT/s, Same Encoding

Gen 2 (PCIe 2.0, 2007) doubled the line rate from 2.5 to 5.0 GT/s while keeping 8b/10b encoding. This was the simplest possible speed step — if the physical layer can clock twice as fast, you get twice the bandwidth. No encoding change required.

Why keep 8b/10b at 5 GT/s?

At 5 GT/s on FR4 PCB material, the channel loss is moderate but manageable with simple pre-emphasis at the transmitter. 8b/10b still works. The encoded symbols are shorter in time (2 ns each, versus 4 ns in Gen 1), but the same framing and K-characters apply without modification. Keeping the encoding also meant a Gen 2 device could fall back to Gen 1 speed with no change to its framing logic.

Backward compatibility rule. A Gen 2 transmitter must fall back to Gen 1 speed when connected to a Gen 1 receiver — the LTSSM negotiates the highest common speed during link training. This works cleanly because the upper layers (TL, DLL) and software model are completely unchanged between Gen 1 and Gen 2.
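The outcome of that negotiation can be modelled in a few lines. This is only a sketch of the result, not of the LTSSM itself: the real state machine advertises supported rates in training sets and performs a speed change, none of which is modelled here. The names `GEN_RATES` and `negotiated_gen` are hypothetical, and the sketch assumes each device supports every generation up to its maximum.

```python
# Line rates in GT/s for Gen 1..Gen 6, as listed in this article.
GEN_RATES = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0, 6: 64.0}


def negotiated_gen(local_max_gen: int, partner_max_gen: int) -> int:
    """Highest generation that both link partners support.

    Assumption: a device that supports Gen N also supports Gen 1..N-1.
    """
    if local_max_gen not in GEN_RATES or partner_max_gen not in GEN_RATES:
        raise ValueError("generation must be between 1 and 6")
    return min(local_max_gen, partner_max_gen)


gen = negotiated_gen(2, 1)
print(f"Gen 2 device + Gen 1 device -> Gen {gen} at {GEN_RATES[gen]} GT/s")
```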

What actually changed in Gen 2

📋 Gen 3 — Why 8 GT/s, Not 10

This is the question everyone asks. If Gen 1 is 2.5 GT/s and Gen 2 is 5.0 GT/s — why is Gen 3 at 8 GT/s instead of 10 GT/s? The answer is in the encoding change.

Gen 2 → Gen 3: why change encoding instead of just doubling the frequency?

Option A: 10 GT/s with 8b/10b (rejected)
Throughput = 10 GT/s × 80% ÷ 8 = 1.0 GB/s per lane
Problem: 10 GT/s on FR4 PCB means massive signal loss. Channel insertion loss at 5 GHz (Nyquist for 10 GT/s NRZ) is typically 10–15 dB on a 10-inch FR4 trace. It would require expensive low-loss PCB materials and would not work on existing motherboards. The PCI-SIG’s design goal was FR4 compatibility.

Option B: 8 GT/s with 128b/130b (chosen ✓)
Throughput = 8 GT/s × (128/130) ÷ 8 ≈ 985 MB/s per lane
The result is almost identical to Option A (985 vs 1000 MB/s), but the Nyquist frequency is only 4 GHz, 20% lower. Channel loss at 4 GHz on FR4 is 3–5 dB better. It works on standard FR4 motherboards with equalisation: same goal, lower cost. Dropping the 20% 8b/10b overhead makes up for the lower line rate.
Figure 3 — Why Gen 3 chose 8 GT/s with 128b/130b instead of 10 GT/s with 8b/10b. Both deliver approximately 1 GB/s per lane, but the 8 GT/s option operates at a 20% lower Nyquist frequency, giving 3–5 dB better channel loss margin — enough to maintain FR4 compatibility.
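The trade-off between the two options is easy to check numerically. A small Python sketch, using the figures quoted above (the helper name `nrz_link` is illustrative only):

```python
def nrz_link(gt_per_s: float, payload_bits: int, line_bits: int) -> dict:
    """Throughput and Nyquist frequency for one NRZ lane."""
    return {
        "throughput_mb_s": gt_per_s * 1e9 * payload_bits / line_bits / 8 / 1e6,
        # NRZ: the fastest pattern (1010...) repeats every 2 bits,
        # so the Nyquist frequency is half the line rate.
        "nyquist_ghz": gt_per_s / 2,
    }


option_a = nrz_link(10.0, 8, 10)      # rejected: 10 GT/s with 8b/10b
option_b = nrz_link(8.0, 128, 130)    # chosen: 8 GT/s with 128b/130b

throughput_loss = 1 - option_b["throughput_mb_s"] / option_a["throughput_mb_s"]
nyquist_saving = 1 - option_b["nyquist_ghz"] / option_a["nyquist_ghz"]

print(f"Option A: {option_a['throughput_mb_s']:.0f} MB/s at {option_a['nyquist_ghz']:.0f} GHz Nyquist")
print(f"Option B: {option_b['throughput_mb_s']:.0f} MB/s at {option_b['nyquist_ghz']:.0f} GHz Nyquist")
print(f"Throughput given up: {throughput_loss:.1%}, Nyquist frequency saved: {nyquist_saving:.0%}")
```

Option B gives up about 1.5% of throughput in exchange for a 20% lower Nyquist frequency.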

128b/130b — what it buys

The transition from 8b/10b to 128b/130b in Gen 3 changes three things:

Gen 3 also added link equalisation

At 8 GT/s, the channel is lossy enough that simple pre-emphasis no longer suffices. Gen 3 introduced a formal link equalisation phase in the LTSSM — during link training, the two devices negotiate multi-tap FIR (Finite Impulse Response) filter coefficients. The transmitter and receiver exchange proposals and the best filter settings are selected before the link enters L0. This is entirely absent in Gen 1/2.

📋 Gen 4 — 16 GT/s, Pushing NRZ

Gen 4 (PCIe 4.0, 2017) doubled Gen 3’s line rate from 8 to 16 GT/s while keeping 128b/130b encoding. It is the same kind of step as Gen 1 to Gen 2: leave the encoding alone and double the baud rate.

The Nyquist frequency jumps from 4 GHz to 8 GHz. This is still achievable on FR4 with aggressive equalization: multi-tap TX FIR coefficients plus a more powerful RX CTLE and DFE (Decision Feedback Equalizer). Traces need to be shorter, or the board needs a better PCB laminate.

📋 Gen 5 — 32 GT/s, the NRZ Ceiling

Gen 5 (PCIe 5.0, 2019) doubled again to 32 GT/s, still using 128b/130b. Nyquist frequency is 16 GHz. At this speed, NRZ signalling on any realistic PCB trace length is at its practical limit.

NRZ eye diagram: why the eye closes at high speeds

Gen 1/2 (open eye): the eye is wide open, with good voltage margin and good timing margin.
Gen 5 (closing eye): ISI closes the eye; DFE/FFE is required to reopen it.
Gen 6 (PAM4, three eyes): four levels, +3 (10), +1 (11), -1 (01), -3 (00), give three stacked eyes, each with 1/3 the voltage margin of NRZ → FEC required.
Figure 4 — Eye diagram comparison. Gen 1/2 NRZ has one wide-open eye. Gen 5 NRZ has a closing eye requiring heavy equalization. Gen 6 PAM4 has three stacked eyes, each with only 1/3 the voltage margin of NRZ — which is why FEC is mandatory in Gen 6.

Gen 5 pushed equalisation to its limits. Retimers (active repeaters) are effectively mandatory on server PCB designs for runs longer than 4–6 inches. Despite this, the protocol is unchanged — same 128b/130b, same TLPs and DLLPs, same software model.

📋 Gen 6 — 64 GT/s and Why Everything Changed

Gen 6 (PCIe 6.0, 2022) faces a hard constraint: NRZ signalling at 64 GT/s is physically impractical. Doubling to 64 GT/s with NRZ would mean a Nyquist frequency of 32 GHz, where insertion loss on FR4 is so high that even a short trace would consume the whole channel loss budget. The signal would be unrecoverable without exotic and expensive channel materials.

The solution is a different modulation — PAM4 — combined with FEC and a new framing model. Gen 6 achieves 64 GT/s effective bit rate at only 32 GBaud (the same baud rate as Gen 5 at 32 GT/s NRZ), because PAM4 carries 2 bits per symbol.

Why PAM4 at 32 GBaud = 64 GT/s: NRZ vs PAM4 comparison

Parameter | Gen 5 NRZ (32 GT/s) | Hypothetical NRZ (64 GT/s) | Gen 6 PAM4 (32 GBaud)
Baud rate | 32 GBaud | 64 GBaud (the problem) | 32 GBaud ✓
Nyquist frequency | 16 GHz | 32 GHz (impractical) | 16 GHz ✓ (same as Gen 5)
Bits/symbol | 1 | 1 | 2 (PAM4)
Effective rate | 32 GT/s | 64 GT/s | 64 GT/s over the same channel ✓
Figure 5 — PAM4 achieves 64 GT/s effective bit rate at only 32 GBaud, keeping the Nyquist frequency at 16 GHz — identical to Gen 5. A hypothetical 64 GT/s NRZ would require 32 GHz Nyquist, which is impractical on standard PCB materials.
The tradeoff Gen 6 makes. PAM4 gets twice the bits per symbol at the same baud rate, but at a significant cost: the voltage gap between adjacent levels is only 1/3 of the NRZ swing. If NRZ has a 600 mV peak-to-peak eye, PAM4’s three eyes are each only about 200 mV tall, so the same noise eats a much larger share of the margin (roughly a 9.5 dB SNR penalty). The raw bit error rate (BER) of PAM4 over the same channel is therefore orders of magnitude worse than NRZ. The spec target for PCIe has always been BER < 10⁻¹² (1 error per trillion bits), and PAM4 without FEC cannot meet this. That is why FEC is mandatory in Gen 6: it is not optional, it is architecturally required.
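The baud rate, Nyquist frequency and eye height in this comparison all follow from the number of signal levels. A short Python sketch of that arithmetic (the function name `signalling` and the 600 mV swing are illustrative values taken from the example above, not spec parameters):

```python
import math


def signalling(bit_rate_gbps: float, levels: int, swing_mv: float) -> dict:
    """Basic properties of an N-level PAM link at a given bit rate."""
    bits_per_symbol = math.log2(levels)
    baud = bit_rate_gbps / bits_per_symbol
    return {
        "baud_gbaud": baud,
        "nyquist_ghz": baud / 2,
        # N levels share the swing, leaving N-1 stacked eyes.
        "eye_height_mv": swing_mv / (levels - 1),
        # Amplitude ratio between one NRZ eye and one of the N-1 eyes.
        "snr_penalty_db": 20 * math.log10(levels - 1),
    }


nrz_64 = signalling(64, levels=2, swing_mv=600)   # hypothetical 64 GT/s NRZ
pam4_64 = signalling(64, levels=4, swing_mv=600)  # Gen 6 PAM4

for name, link in (("NRZ ", nrz_64), ("PAM4", pam4_64)):
    print(f"{name}: {link['baud_gbaud']:.0f} GBaud, "
          f"Nyquist {link['nyquist_ghz']:.0f} GHz, "
          f"eye {link['eye_height_mv']:.0f} mV, "
          f"penalty {link['snr_penalty_db']:.2f} dB")
```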

📋 PAM4 Deep Dive — Symbols, Eyes, and BER

PAM4 (Pulse Amplitude Modulation with 4 levels) encodes 2 bits into each symbol by using four distinct voltage levels. The mapping is Gray-coded, so adjacent levels differ in exactly one bit and a mistake between neighbouring levels costs a single bit error:

PAM4 symbol encoding: Gray coding and eye structure

PAM4 symbol mapping (Gray-coded to minimise adjacent-level errors)

Level | Bits
+3 (highest) | 10
+1 | 11
-1 | 01
-3 (lowest) | 00

Adjacent levels differ in exactly 1 bit position (Gray code), so an adjacent-level error = 1 bit error, not 2.

PAM4 waveform example
Transmitting: 11 10 00 01 11 10 01 00
Levels: +1 +3 -3 -1 +1 +3 -1 -3
Each symbol = one baud period = 1/(32 GBaud) = 31.25 ps and carries 2 bits → 64 Gb/s effective at 32 GBaud. ΔV between levels = 1/3 of the NRZ swing → higher BER.

BER impact
NRZ on a Gen 5 channel: BER < 10⁻¹² (spec target)
PAM4 on the same channel (no FEC): BER ≈ 10⁻⁶ to 10⁻⁸ (10⁴–10⁶× worse)
PAM4 with FEC, CRC and flit replay (Gen 6): link reliability restored ✓

Why Gray coding matters
If the receiver mistakes -1 for +1, Gray coding gives 1 bit error (01→11).
Natural binary coding (-1 = 01, +1 = 10) would give 2 bit errors (01→10).
Figure 6 — PAM4 symbol mapping, waveform example, and BER impact. Gray coding ensures adjacent-level errors cost only 1 bit instead of 2. Without FEC, PAM4’s raw BER is 10⁴–10⁶× worse than the NRZ target of 10⁻¹². FEC, backed by CRC and replay, brings the link back to the required reliability.
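The benefit of Gray coding can be verified mechanically. The sketch below assumes the conventional Gray assignment of 00, 01, 11, 10 from the lowest to the highest level and compares it with natural binary ordering; the names `GRAY`, `NATURAL`, `encode` and `worst_adjacent_bit_errors` are made up for this illustration.

```python
# 2-bit value -> PAM4 level.
# GRAY assumes the conventional 00, 01, 11, 10 ordering from lowest to highest level.
GRAY = {0b00: -3, 0b01: -1, 0b11: +1, 0b10: +3}
# NATURAL is plain binary counting, shown only for comparison.
NATURAL = {0b00: -3, 0b01: -1, 0b10: +1, 0b11: +3}


def encode(bits: str, mapping: dict) -> list:
    """Split a bit string into 2-bit pairs and map each pair to a level."""
    if len(bits) % 2:
        raise ValueError("need an even number of bits")
    return [mapping[int(bits[i:i + 2], 2)] for i in range(0, len(bits), 2)]


def worst_adjacent_bit_errors(mapping: dict) -> int:
    """Most bits that flip when a level is mistaken for its neighbour."""
    by_level = sorted(mapping, key=mapping.get)  # bit patterns, lowest level first
    return max(bin(a ^ b).count("1") for a, b in zip(by_level, by_level[1:]))


print(encode("1110000111100100", GRAY))
print("Gray worst case:   ", worst_adjacent_bit_errors(GRAY), "bit")
print("Natural worst case:", worst_adjacent_bit_errors(NATURAL), "bits")
```

With natural binary, the middle transition (01 ↔ 10) flips both bits; Gray coding removes that case.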

📋 FEC Deep Dive — Lightweight FEC and Correction Capability

Gen 6 protects every flit with a lightweight, low-latency FEC. It is not RS(544, 514): that Reed-Solomon code (544 ten-bit symbols per codeword, 514 data and 30 parity) is the “KP4” FEC used by high-speed Ethernet, and its 5,440-bit codeword would not even fit in a 256-byte flit. PCIe 6.0 instead reserves 6 bytes of FEC and 8 bytes of CRC in each flit, trading correction strength for very low decode latency.

Flit FEC structure and error correction
Flit = 256 bytes = 2048 bits total
TLP bytes: 236 · Data Link payload (DLP): 6 · CRC: 8 · FEC: 6
FEC overhead = 6/256 ≈ 2.3% · CRC overhead = 8/256 ≈ 3.1% · TLP efficiency = 236/256 ≈ 92.2%
Net x1 TLP throughput ≈ 64 Gb/s × 92.2% ÷ 8 ≈ 7.4 GB/s, quoted as 8 GB/s (raw) in spec tables
Correction capability: the FEC is three-way interleaved. Each of the three groups has 2 FEC bytes and can correct one byte in error, so a burst confined to three consecutive bytes is fully corrected.
Where correction happens: the FEC decoder sits inside the Physical Layer, above the Electrical sub-block.
After correction: the 8-byte flit CRC catches anything the FEC could not fix, and a flit that fails the CRC check is replayed.
Figure 7 — Flit FEC structure. Each 256-byte flit carries 6 FEC bytes in three interleaved groups, each able to correct a single byte error, plus an 8-byte CRC. The FEC decoder operates in the Physical Layer; errors it cannot correct are caught by the CRC and handled by flit replay.
Why FEC is at the Physical Layer, not the Data Link Layer. FEC works on fixed-size flits and PAM4 symbol errors, which are Physical Layer concerns. Correcting there preserves the clean layer separation: the layers above receive flits that have either passed the CRC check or will be replayed, and the Transaction Layer and software model stay the same as in earlier generations. This is the right architecture choice.
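Interleaving is what lets a small FEC cope with burst errors. The sketch below is a simplified model rather than the decoder itself: it assumes a three-way round-robin byte interleave in which each group can correct one byte in error. Both parameters (`WAYS`, `T_PER_GROUP`) and the function names are assumptions of this illustration.

```python
from collections import Counter

WAYS = 3          # assumed number of interleaved FEC groups
T_PER_GROUP = 1   # assumed: each group can correct one byte in error


def group_of(byte_index: int) -> int:
    """Round-robin interleave: consecutive bytes go to different groups."""
    return byte_index % WAYS


def correctable(error_bytes) -> bool:
    """True if no FEC group sees more byte errors than it can correct."""
    hits = Counter(group_of(i) for i in set(error_bytes))
    return all(count <= T_PER_GROUP for count in hits.values())


print(correctable([100, 101, 102]))       # 3-byte burst, one error per group
print(correctable([100, 103]))            # two errors land in the same group
print(correctable([100, 101, 102, 103]))  # 4-byte burst exceeds the interleave
```

A flit for which `correctable` would be false is the case the CRC and replay mechanism exists for.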

📋 Flit Deep Dive — 256-byte Format and Protocol Impact

Gen 6 replaces start/end framing symbols with a flit-based framing model. A flit (flow control unit) is 256 bytes. Every flit is sent as a fixed-size container regardless of how many TLPs or DLLPs it carries.

Gen 6 flit structure: 256 bytes per flit

Flit layout (sizes in bytes):
TLP bytes (236): one or more TLPs packed back to back, for example TLP-A (MWr with a 16 B header + 64 B payload = 80 B) followed by the start of TLP-B (CplD with a large payload, which continues in the next flit). Unused space is filled with padding.
DLP (6): Data Link Layer payload, carrying flit sequence and ACK/NAK information and DLLP content.
CRC (8): covers the TLP and DLP bytes.
FEC (6): three interleaved groups of 2 bytes each.

Key protocol differences vs Gen 1–5 framing
No STP/END framing: packet boundaries are located from the fixed flit layout and the TLP headers themselves. The physical link is always “receiving a flit”, so there is no inter-packet gap.
ACK/NAK at flit level: replay granularity = one flit. If a flit fails its CRC check, the whole flit is NAKed and replayed.
Efficiency gain: with STP/END framing, every TLP carries its own framing symbols. A flit amortises one fixed block of overhead over all the TLPs it carries, which lowers per-TLP overhead.
Figure 8 — Gen 6 flit structure. The 256-byte container carries one or more TLPs, Data Link bytes, padding to fill the flit, a CRC, and an FEC block. TLPs can span flit boundaries. ACK/NAK replay operates at flit granularity. Compared to start/end framing in Gen 1–5, flit-based framing reduces per-TLP overhead and enables clean FEC block boundaries.
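Because TLPs are packed back to back and may span flits, the number of flits needed for a burst of traffic is simple to estimate. The sketch below assumes the PCIe 6.0 flit layout of 236 TLP bytes per 256-byte flit (the remaining 20 bytes being Data Link payload, CRC and FEC) and ignores any packing restrictions; the constant and function names are illustrative.

```python
import math

FLIT_BYTES = 256
TLP_BYTES_PER_FLIT = 236   # assumed: the other 20 bytes are DLP, CRC and FEC


def flits_needed(tlp_sizes) -> int:
    """Flits required when TLPs are packed back to back and may span flits."""
    return math.ceil(sum(tlp_sizes) / TLP_BYTES_PER_FLIT)


def tlp_efficiency() -> float:
    """Fraction of each flit available for TLP bytes."""
    return TLP_BYTES_PER_FLIT / FLIT_BYTES


# Ten memory writes, each a 16-byte header plus 64-byte payload (80 bytes).
writes = [80] * 10
print(f"Flits needed: {flits_needed(writes)}")
print(f"TLP efficiency: {tlp_efficiency():.1%}")
print(f"x1 TLP throughput at 64 GT/s: {64 * tlp_efficiency() / 8:.3f} GB/s")
```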

📋 Clock Recovery Across Generations

Every PCIe generation embeds the clock in the data stream. There is no forwarded clock signal. The receiver uses a CDR (Clock and Data Recovery) circuit — typically a PLL — to lock onto the incoming bit transitions and extract the transmitter’s clock from them.

CDR clock recovery: three architectural options

Common Refclk: both TX and RX derive their clocks from the same 100 MHz reference.
Pros: SSC is straightforward, fast L0s recovery, lower jitter.
Typical in: desktop and server PCIe Gen 1–6 slots with a Refclk pin.

Data-Clocked RX: the receiver recovers its clock entirely from the incoming bit transitions, with no Refclk.
Pros: simplest implementation.
Cons: wider CDR bandwidth needed; SSC complicates things.
Typical in: embedded PCIe on SoCs.

Independent Refclk (SRIS): each side derives its clock from its own independent reference. Used in SRIS (Separate Refclk, Independent SSC).
Pros: enables SSC on each side.
Cons: more complex CDR; larger elastic buffer to handle SSC drift.
Figure 9 — Three CDR clock architecture options supported by PCIe. All three require the receiver to achieve and maintain bit lock on the incoming stream. In Gen 6, the CDR must lock on PAM4 symbols — a more complex process than NRZ, requiring DSP-based CDR in most implementations.

📋 Elastic Buffer — Handling ±300 ppm Clock Mismatch

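Every PCIe receiver has an elastic buffer between the CDR and the Data Link Layer. Its purpose is to handle the small clock frequency difference between the transmitter and the receiver: even though both must be within ±300 ppm of the target frequency, the worst case is 600 ppm apart. That is a slip of one symbol roughly every 1,667 symbols (1 ÷ (600 × 10⁻⁶)), which is why SKIP ordered sets are scheduled at most 1,538 symbol times apart: a correction opportunity always arrives before a full symbol of drift has built up.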

The elastic buffer absorbs this difference by adding or removing SKP symbols (Gen 1/2) or equivalent padding in Gen 3+ from the periodic SKIP ordered sets that arrive. Symbols are clocked into the buffer using the recovered clock (same rate as the transmitter) and clocked out using the local clock (which may be slightly faster or slower). Adding or removing a SKP symbol when the buffer level approaches overflow or underflow keeps the buffer within its safe operating range.

Clock scenario | Buffer behaviour | Correction
TX clock faster than RX local clock | Buffer filling up → overflow risk | Remove (discard) a SKP symbol from the SKIP ordered set to drain the buffer
RX local clock faster than TX clock | Buffer emptying → underflow risk | Insert an extra SKP symbol into the SKIP ordered set to fill the buffer
Clocks matched within tolerance | Buffer level stable | No modification to SKP ordered sets needed
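The behaviour in the table can be illustrated with a toy simulation. This is a simplified model, not a description of real receiver hardware: it tracks the buffer fill level as a fractional symbol count, and the buffer depth of 8 symbols, the half-symbol correction thresholds, and the function names are all assumptions chosen for the sketch.

```python
def symbols_per_slip(ppm_difference: float) -> float:
    """Symbols transmitted before the two clocks drift apart by one symbol."""
    return 1e6 / ppm_difference


def simulate_elastic_buffer(tx_ppm, rx_ppm, total_symbols=200_000,
                            skip_interval=1538, depth=8):
    """Return (lowest, highest) fill level seen, in symbols.

    The buffer starts half full. Each SKIP ordered set gives the receiver
    one chance to drop or insert a single SKP symbol.
    """
    centre = depth / 2
    level = centre
    lowest = highest = level
    drift_per_symbol = (tx_ppm - rx_ppm) / 1e6   # > 0 fills, < 0 drains
    for n in range(1, total_symbols + 1):
        level += drift_per_symbol
        lowest, highest = min(lowest, level), max(highest, level)
        if n % skip_interval == 0:               # a SKIP ordered set arrives
            if level >= centre + 0.5:
                level -= 1                       # drop one SKP: drain the buffer
            elif level <= centre - 0.5:
                level += 1                       # insert one SKP: fill the buffer
            lowest, highest = min(lowest, level), max(highest, level)
    return lowest, highest


print(f"One symbol of slip every {symbols_per_slip(600):.0f} symbols at 600 ppm")
print("TX fast, RX slow:", simulate_elastic_buffer(+300, -300))
print("TX slow, RX fast:", simulate_elastic_buffer(-300, +300))
print("Matched clocks:  ", simulate_elastic_buffer(0, 0))
```

Even at the worst-case 600 ppm difference and the longest SKIP interval, the fill level in this model stays within about a symbol and a half of the centre.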

📋 Equalization — Gen 3 Through Gen 6

At Gen 1/2 speeds, a simple fixed pre-emphasis on the TX side is sufficient to compensate for PCB trace loss. From Gen 3 onwards, the channel loss is too severe and too variable for a fixed setting — an adaptive equalization process is built into the LTSSM link training sequence.

Equalization evolution: Gen 1/2 to Gen 6

Gen 1/2 (fixed)
TX: simple 2-tap FIR pre-emphasis (fixed)
RX: CTLE boost (continuous-time linear equaliser)
No negotiation. Fixed parameters.

Gen 3–5 (adaptive)
TX: 3-tap FIR (pre, main, post) with negotiated coefficients
RX: CTLE + DFE (Decision Feedback EQ)
LTSSM equalization phase: TX and RX exchange FIR proposals and select the best.

Gen 6 (DSP-based)
TX: multi-tap FIR with more taps than Gen 3–5; more complex channel compensation is needed
RX: full DSP equalisation (FFE + DFE + ML detector)
FEC corrects residual errors after EQ.

Retimers (Gen 4–6)
Active devices that recover the data and retransmit a regenerated signal (unlike a redriver, which only amplifies). Transparent to software (no BDF address). Gen 6: typically 1–2 retimers per link in data centre boards.
Figure 10 — Equalization complexity by generation. Gen 1/2 uses fixed pre-emphasis. Gen 3+ introduces adaptive FIR negotiation during link training. Gen 6 requires DSP-based equalisation plus FEC. Active retimers become critical from Gen 4 onward.
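To make the 3-tap TX FIR concrete, the sketch below applies pre-cursor, main and post-cursor coefficients to a short NRZ sequence. The coefficient values are hypothetical and chosen only so that their magnitudes sum to 1; they are not taken from any spec preset, and the function name `tx_fir` is illustrative.

```python
def tx_fir(symbols, c_pre, c_main, c_post):
    """Apply a 3-tap FIR: y[n] = c_pre*x[n+1] + c_main*x[n] + c_post*x[n-1].

    At the ends of the sequence the edge symbol is held constant.
    """
    out = []
    for n, x in enumerate(symbols):
        prev_x = symbols[n - 1] if n > 0 else x
        next_x = symbols[n + 1] if n + 1 < len(symbols) else x
        out.append(c_pre * next_x + c_main * x + c_post * prev_x)
    return out


# NRZ symbols as +1 / -1; coefficients are hypothetical example values.
bits = [-1, -1, -1, +1, +1, +1, -1]
levels = tx_fir(bits, c_pre=-0.1, c_main=0.7, c_post=-0.2)
print([round(v, 2) for v in levels])
```

The first symbol after a transition is driven at full strength (±0.8), while symbols in the middle of a run are de-emphasised (±0.4). Boosting the high-frequency content in this way counteracts the low-pass behaviour of the channel.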

📋 Full Bandwidth Comparison — Every Width

Generation | Line rate | Encoding | x1 per dir | x4 per dir | x8 per dir | x16 per dir | x16 aggr.
Gen 1 | 2.5 GT/s | 8b/10b | 250 MB/s | 1 GB/s | 2 GB/s | 4 GB/s | 8 GB/s
Gen 2 | 5.0 GT/s | 8b/10b | 500 MB/s | 2 GB/s | 4 GB/s | 8 GB/s | 16 GB/s
Gen 3 | 8.0 GT/s | 128b/130b | ~985 MB/s | ~3.9 GB/s | ~7.9 GB/s | ~15.8 GB/s | ~31.5 GB/s
Gen 4 | 16.0 GT/s | 128b/130b | ~1.97 GB/s | ~7.9 GB/s | ~15.8 GB/s | ~31.5 GB/s | ~63 GB/s
Gen 5 | 32.0 GT/s | 128b/130b | ~3.94 GB/s | ~15.8 GB/s | ~31.5 GB/s | ~63 GB/s | ~126 GB/s
Gen 6 | 64.0 GT/s | PAM4 + FEC | ~7.4 GB/s | ~29.5 GB/s | ~59 GB/s | ~118 GB/s | ~236 GB/s
Why Gen 6 x16 is ~118 GB/s per direction, not exactly 2× Gen 5. Gen 5 uses 128b/130b (1.54% overhead). Gen 6 drops 128b/130b but spends 20 of every 256 flit bytes on FEC (6), CRC (8) and Data Link payload (6), so 236/256 ≈ 92.2% of the raw rate carries TLP bytes. That gives 64 GT/s × 92.2% ÷ 8 ≈ 7.4 GB/s per lane, against the raw 8 GB/s per lane (128 GB/s for x16) usually quoted in spec documentation. Real throughput also depends on payload mix and flit fill efficiency.
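The NRZ rows of the table can be regenerated from the line rate and encoding alone. The sketch below covers Gen 1–5 only; Gen 6 is left out because its usable figure depends on how flit overhead is counted. The names `ENCODINGS`, `WIDTHS` and `per_direction_gb_s` are illustrative.

```python
# generation -> (line rate in GT/s, payload bits, line bits)
ENCODINGS = {
    "Gen 1": (2.5, 8, 10),
    "Gen 2": (5.0, 8, 10),
    "Gen 3": (8.0, 128, 130),
    "Gen 4": (16.0, 128, 130),
    "Gen 5": (32.0, 128, 130),
}
WIDTHS = (1, 4, 8, 16)


def per_direction_gb_s(gen: str, width: int) -> float:
    """Throughput in GB/s for one direction of a link of the given width."""
    gt_per_s, payload_bits, line_bits = ENCODINGS[gen]
    return gt_per_s * payload_bits / line_bits / 8 * width


for gen in ENCODINGS:
    row = "  ".join(f"x{w}: {per_direction_gb_s(gen, w):7.3f}" for w in WIDTHS)
    both = 2 * per_direction_gb_s(gen, 16)
    print(f"{gen}  {row}  x16 both dirs: {both:7.2f} GB/s")
```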

📋 Quick Reference

Concept | Key point
BW formula (8b/10b) | GT/s × 0.8 ÷ 8 = GB/s per lane per direction
BW formula (128b/130b) | GT/s × (128/130) ÷ 8 ≈ GT/s × 0.123 GB/s per lane
Gen 1 (2.5 GT/s) | 8b/10b · 250 MB/s · Symbol time 4 ns · SKIP every 1180–1538 symbols
Gen 2 (5.0 GT/s) | 8b/10b unchanged · 2× Gen 1 frequency · LTSSM negotiates fallback to Gen 1 with Gen 1 devices
Gen 3 (8.0 GT/s) | 128b/130b (1.54% overhead) · 8 GT/s chosen over 10 GT/s because lower Nyquist keeps FR4 compatibility · Added adaptive link equalisation
Gen 4 (16.0 GT/s) | 128b/130b · ~2 GB/s per lane · Retimers common
Gen 5 (32.0 GT/s) | 128b/130b · ~4 GB/s · NRZ practical ceiling · Heavy EQ required · CXL 1.x/2.0 PHY
Gen 6 (64.0 GT/s) | PAM4 at 32 GBaud · 2 bits/symbol · Mandatory FEC · Flit framing · ~8 GB/s per lane raw · Same 16 GHz Nyquist as Gen 5
PAM4 Gray coding | Adjacent level error = 1 bit error only. Levels: +3=10, +1=11, -1=01, -3=00
Flit FEC | 6 FEC bytes per 256-byte flit · Three interleaved groups, each correcting one byte · Backed by 8-byte CRC and flit replay · Physical Layer
Flit | 256-byte fixed container · 236 TLP + 6 DLP + 8 CRC + 6 FEC bytes · No STP/END · Replay at flit level
Elastic buffer | Absorbs ±300 ppm TX/RX clock mismatch by adding/removing SKP symbols from periodic SKIP ordered sets
Link equalisation Gen 3+ | LTSSM equalization phase negotiates multi-tap FIR coefficients between TX and RX · Mandatory from Gen 3
Retimers | Active signal regenerators · Transparent to software (no BDF) · Often needed for long traces at Gen 5/6
Coming next: PCIe-05 covers the Transaction Layer in depth — TLP header formats for every TLP type, byte enables, the Tag field, and worked packet diagrams for every request-completion pair.