The full PCIe generation trajectory from 2.5 GT/s NRZ to 64 GT/s PAM4: why each speed increase required different physical layer engineering, how NRZ was pushed to 32 GT/s in Gen 5 purely through aggressive equalization, why the NRZ wall finally arrived at 64 GT/s and forced PAM4 in Gen 6, how Reed-Solomon FEC compensates for PAM4’s eye closure, and what the flit-based framing introduced in Gen 6 means for the stack.
Every PCIe generation has approximately doubled the per-lane bandwidth of its predecessor. Across the six generations covered here, spanning roughly two decades (2003 to 2022), this has produced a 32× increase in per-lane throughput (0.25 GB/s to about 8 GB/s) over the same physical connector and a backward-compatible protocol stack. The challenge is that each doubling requires increasingly sophisticated signal integrity engineering at the Physical Layer.
Gen 1 (2.5 GT/s) and Gen 2 (5 GT/s) use NRZ (Non-Return-to-Zero) signalling: the signal has exactly two levels, a logic-high voltage and a logic-low voltage, and each symbol carries one bit. The encoding is 8b/10b: every 8-bit data byte is mapped to a 10-bit code word before transmission. This overhead (10 bits sent for every 8 bits of data, which is 25% extra relative to the payload, or 20% of the line rate) exists for three reasons:
The effective data rate: Gen 1 at 2.5 GT/s with 8b/10b → 2.5 × 0.8 / 8 = 250 MB/s per lane per direction. Gen 2 doubles to 5 GT/s → 500 MB/s per lane per direction.
Gen 3 raises the bit rate to 8 GT/s (1.6× Gen 2 rather than a full doubling) and replaces 8b/10b with 128b/130b encoding. The motivation: at 8 GT/s, the 8b/10b overhead (10 bits on the wire for every 8 bits of data) wastes too much bandwidth. 128b/130b adds only 2 bits per 128-bit block (1.54% overhead), recovering almost all of that waste, so the data throughput still doubles even though the raw bit rate does not.
128b/130b works differently from 8b/10b:
Gen 3 also introduces 3-tap transmitter equalization (pre-cursor, cursor, post-cursor coefficients) and more aggressive receiver equalization (CTLE — Continuous Time Linear Equalizer) to compensate for channel loss at 8 GT/s. Training sequences exchange equalization presets using new TS1/TS2 fields during link training.
Effective data rate: 8 GT/s with 128b/130b → 8 × (128/130) / 8 ≈ 984 MB/s ≈ 1 GB/s per lane per direction.
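The per-lane figures quoted so far all follow from one formula: line rate times encoding efficiency, divided by 8 bits per byte. A minimal Python sketch of that arithmetic (the function name `lane_throughput_mb_s` is illustrative and not taken from any PCIe tool):

```python
def lane_throughput_mb_s(gt_per_s: float, payload_bits: int, coded_bits: int) -> float:
    """Payload throughput of one lane in one direction, in MB/s (1 MB = 10**6 bytes).

    payload_bits / coded_bits is the encoding efficiency,
    e.g. 8/10 for 8b/10b or 128/130 for 128b/130b.
    """
    payload_bits_per_second = gt_per_s * 1e9 * payload_bits / coded_bits
    return payload_bits_per_second / 8 / 1e6


gen1 = lane_throughput_mb_s(2.5, 8, 10)      # 8b/10b
gen2 = lane_throughput_mb_s(5.0, 8, 10)      # 8b/10b
gen3 = lane_throughput_mb_s(8.0, 128, 130)   # 128b/130b

print(f"Gen 1: {gen1:.1f} MB/s, Gen 2: {gen2:.1f} MB/s, Gen 3: {gen3:.1f} MB/s")
```

Gen 3 lands just under 1 GB/s because 128b/130b still spends 2 of every 130 bits on the sync header.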
Gen 4 doubles to 16 GT/s while keeping the same NRZ modulation and 128b/130b encoding as Gen 3. The doubling required significant signal integrity improvements:
Effective data rate: 16 GT/s × (128/130) / 8 ≈ 2 GB/s per lane per direction. A x16 Gen 4 link delivers about 32 GB/s in each direction (64 GB/s bidirectional), sufficient for NVMe SSDs and NVIDIA A100-class GPUs.
Gen 4 reached 16 GT/s NRZ and was widely adopted. However, a further direct NRZ doubling to 32 GT/s without significantly more aggressive equalization would be impractical on standard add-in card trace lengths. Gen 5 achieved 32 GT/s NRZ by applying significantly more aggressive equalization (no RS-FEC, no PAM4). The true NRZ wall arrived at 64 GT/s — the point where even the best equalization cannot open the NRZ eye — which is why Gen 6 was the first generation to switch to PAM4 and to introduce RS-FEC.
Gen 5 achieved 32 GT/s NRZ with aggressive equalization. But doubling again to 64 GT/s with NRZ is where the physics becomes prohibitive — two compounding effects make it impractical at standard PCB trace lengths:
The PAM4 solution for Gen 6: instead of doubling the symbol rate to 64 GBaud (which would push the Nyquist frequency to 32 GHz and close the NRZ eye), keep the symbol rate at 32 GBaud and carry 2 bits per symbol. PAM4 uses four voltage levels, so each symbol encodes 2 bits. The Nyquist frequency stays at 16 GHz (the same as a 32 GT/s Gen 5 NRZ link, and half of what a 64 GT/s NRZ link would require), while the bit rate reaches 64 GT/s within a manageable channel bandwidth. PAM4 was the practical path to 64 GT/s on standard PCB channels.
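The relationship between bit rate, bits per symbol, and Nyquist frequency can be checked with a few lines of Python. This is only a sketch of the arithmetic; `nyquist_ghz` is a hypothetical helper name:

```python
def nyquist_ghz(bit_rate_gbps: float, bits_per_symbol: int) -> float:
    """Nyquist frequency in GHz: half the symbol rate."""
    symbol_rate_gbaud = bit_rate_gbps / bits_per_symbol
    return symbol_rate_gbaud / 2


gen5_nrz = nyquist_ghz(32, 1)               # Gen 5: NRZ, 1 bit per symbol
gen6_if_nrz = nyquist_ghz(64, 1)            # hypothetical 64 GT/s NRZ link
gen6_pam4 = nyquist_ghz(64, 2)              # Gen 6: PAM4, 2 bits per symbol

print(gen5_nrz, gen6_if_nrz, gen6_pam4)
```

The PAM4 case matches the Gen 5 NRZ case, while a 64 GT/s NRZ link would need twice the channel bandwidth.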
PAM4 (Pulse Amplitude Modulation with 4 levels) uses four distinct voltage levels to encode 2 bits per symbol. The four levels (named by their Gray coding): 00b, 01b, 11b, 10b from most negative to most positive. Because each transmitted symbol now carries 2 bits instead of 1, twice the data rate can be achieved at the same symbol rate (same Nyquist frequency).
PAM4 uses Gray coding to map 2-bit values to voltage levels. Adjacent voltage levels differ by only one bit: 10→11→01→00 (reading from the top voltage level to the bottom, matching the 00b, 01b, 11b, 10b bottom-to-top order). Gray coding minimises the number of bit errors when the signal amplitude is misinterpreted as an adjacent level (the most likely error), because a one-level slip then causes only a 1-bit error rather than a 2-bit error.
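The benefit of Gray coding can be illustrated by counting bit errors for a one-level slip under two mappings. The sketch below assumes the bottom-to-top level order 00, 01, 11, 10 and compares it with a plain binary order; the names `GRAY_LEVELS`, `BINARY_LEVELS`, and `bit_errors` are illustrative:

```python
# Index = voltage level, from most negative (0) to most positive (3).
GRAY_LEVELS = [0b00, 0b01, 0b11, 0b10]
BINARY_LEVELS = [0b00, 0b01, 0b10, 0b11]   # naive mapping, for comparison only


def bit_errors(sent_level: int, received_level: int, mapping: list) -> int:
    """Number of wrong bits when sent_level is decoded as received_level."""
    return bin(mapping[sent_level] ^ mapping[received_level]).count("1")


# Bit errors caused by slipping to the next level up, for each adjacent pair.
gray_adjacent = [bit_errors(i, i + 1, GRAY_LEVELS) for i in range(3)]
binary_adjacent = [bit_errors(i, i + 1, BINARY_LEVELS) for i in range(3)]

print("Gray:  ", gray_adjacent)
print("Binary:", binary_adjacent)
```

With the Gray mapping every adjacent-level slip costs exactly one bit; with the binary mapping the middle transition (01 to 10) costs two.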
A PAM4 eye diagram shows three eye openings (between the four voltage levels) rather than the single eye opening of NRZ. Each eye’s height is roughly one-third of the full voltage swing, making each eye significantly smaller than the single eye of an NRZ signal with the same swing. This is the fundamental trade-off of PAM4: twice the bit rate at the cost of a more demanding signal integrity requirement.
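The one-third eye height can be expressed as a signal-to-noise penalty. The sketch below is an idealised estimate, assuming equal peak-to-peak swing, evenly spaced levels, and no extra timing or nonlinearity penalty; `pam_eye_penalty_db` is a hypothetical helper name:

```python
import math


def pam_eye_penalty_db(levels: int) -> float:
    """Vertical eye penalty in dB of PAM-N versus NRZ at the same swing.

    N evenly spaced levels split the swing into N - 1 eyes, so each eye
    is 1 / (N - 1) of the NRZ eye height.
    """
    eyes = levels - 1
    return 20 * math.log10(eyes)


pam4_penalty = pam_eye_penalty_db(4)
print(f"PAM4 eye penalty: {pam4_penalty:.2f} dB")
```

Under these assumptions PAM4 gives up roughly 9.5 dB of vertical margin, which is the gap that stronger equalization and FEC have to close.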
Forward Error Correction (FEC) is a technique where the transmitter adds redundant check symbols to the data stream. If some symbols are received incorrectly (due to noise, ISI, or crosstalk), the receiver uses the redundant symbols to detect and correct the errors — without any retransmission. “Forward” means the correction uses only the received data; no feedback to the transmitter is needed.
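To make the idea of forward correction concrete, here is a toy example using a Hamming(7,4) code. This is deliberately not Reed-Solomon and not the code PCIe uses; it is swapped in only because it fits in a few lines while showing the same principle: the receiver repairs an error from redundancy alone, with no retransmission.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4   # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1          # flip the bad bit back
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
sent = hamming74_encode(data)
received = list(sent)
received[4] ^= 1                            # channel flips one bit
recovered = hamming74_decode(received)
print(sent, received, recovered)
```

Reed-Solomon applies the same idea to multi-bit symbols instead of single bits, which is what makes it suited to the burst errors discussed below.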
Why RS-FEC is necessary for Gen 6 PAM4 but not for Gen 1–5 NRZ:
PCIe 6.0 uses Reed-Solomon (RS) FEC. Gen 5 does not: RS-FEC is specific to Gen 6’s PAM4 modulation, where the smaller eye openings produce a raw BER far higher than an NRZ link exhibits and far higher than the link can tolerate uncorrected. Reed-Solomon operates on symbols (bytes or multi-bit groups) rather than individual bits, which makes it efficient at correcting burst errors. Bursts are the dominant error pattern in PAM4 links, where a momentary amplitude disturbance tends to corrupt several consecutive symbols.
RS-FEC adds latency at both the transmitter (must accumulate a full codeword before sending) and receiver (must receive and decode the full codeword before forwarding to the Data Link Layer). For PCIe Gen 6, RS-FEC latency is approximately 4–8 ns per link hop — a new source of latency that does not exist in Gen 1–5. For multi-hop topologies with switches, RS-FEC latency accumulates at each hop. This is why L0s exit latency is higher at Gen 6 compared to Gen 1–5 — each L0s exit must re-establish RS-FEC sync in addition to re-locking the CDR.
PCIe 5.0 (ratified 2019) doubles Gen 4 bandwidth by pushing NRZ to 32 GT/s. Gen 5 does not use PAM4 and does not use Reed-Solomon FEC — it remains NRZ (two voltage levels, one bit per symbol), exactly like Gen 1–4. What makes 32 GT/s NRZ viable is significantly more aggressive equalization (additional DFE taps, stricter Tx pre-emphasis, tighter jitter budget) and 128b/130b encoding unchanged from Gen 3/4. Gen 5 is the last NRZ generation.
Key Gen 5 changes relative to Gen 4:
Effective bandwidth: 32 GT/s NRZ × 1 bit/symbol × 128/130 (encoding overhead) / 8 = approximately 3.94 GB/s ≈ 4 GB/s per lane per direction. For a x16 Gen 5 link: 16 lanes × 2 directions × 4 GB/s = 128 GB/s bidirectional.
PCIe 6.0 (ratified 2022) is the first generation to use PAM4. It doubles the bandwidth of Gen 5 NRZ by switching modulation: the symbol rate stays at 32 GBaud (same Nyquist frequency as Gen 5), but each symbol carries 2 bits instead of 1. The effective bit rate is 64 GT/s. Physical Layer changes relative to Gen 5:
Effective bandwidth: 64 GT/s / 8 = 8 GB/s raw per lane per direction, so a x16 Gen 6 link carries 16 lanes × 2 directions × 8 GB/s = 256 GB/s bidirectional. These are raw figures; after FEC and flit overhead, approximately 93% of that is available for TLP traffic.
Gen 6 introduces the most significant change to the PCIe protocol stack since Gen 3: flit-based framing (a flit is a flow control unit), which replaces the variable-length, per-TLP framing used in Gen 1–5. This change is new in Gen 6 and is designed specifically to improve efficiency at 64 GT/s.
In Gen 1–5, the Physical Layer sends SKIP ordered sets (to compensate for clock differences between the link partners), per-TLP framing (STP and END symbols in Gen 1–2, STP framing tokens in Gen 3–5), and the TLP itself: sequence number, header, data, and LCRC. Each TLP is individually framed and CRC-protected. The per-TLP overhead from framing and LCRC, plus the periodic SKIP ordered sets, is small but meaningful at high symbol rates.
In Gen 6, the Physical Layer sends a continuous stream of fixed-size 256-byte flits. TLPs are packed into flits: a single flit may contain one TLP, several small TLPs, or fragments of TLPs that continue in the next flit. Error protection is applied per flit rather than per TLP: the flit’s CRC and FEC check bytes cover the entire flit, so there is no per-TLP LCRC in flit mode.
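The packing behaviour described above can be sketched as a simple greedy fill. The example assumes 236 bytes of each 256-byte flit are available for TLP bytes, with the remainder reserved for data link, CRC, and FEC fields; that split, the constant `FLIT_TLP_BYTES`, and the function `pack_tlps` are assumptions for illustration, not a model of a real controller.

```python
FLIT_TLP_BYTES = 236   # assumed TLP capacity of one 256-byte flit


def pack_tlps(tlp_sizes, capacity=FLIT_TLP_BYTES):
    """Greedily pack TLPs (given as byte counts) into fixed-capacity flits.

    Returns a list of flits; each flit is a list of (tlp_index, byte_count)
    fragments. A TLP that does not fit continues in the next flit.
    """
    flits = []
    free = 0                       # bytes left in the current flit
    for index, size in enumerate(tlp_sizes):
        remaining = size
        while remaining > 0:
            if free == 0:          # current flit is full (or none started yet)
                flits.append([])
                free = capacity
            chunk = min(remaining, free)
            flits[-1].append((index, chunk))
            remaining -= chunk
            free -= chunk
    return flits


layout = pack_tlps([16, 80, 300])
for number, flit in enumerate(layout):
    print(f"flit {number}: {flit}")
```

Here two small TLPs share the first flit with the start of a 300-byte TLP, whose remaining 160 bytes spill into the second flit.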
| Property | Standard TLP Mode (Gen 1–5) | Flit Mode (Gen 6) |
|---|---|---|
| Frame unit | Variable-length TLP (12 B–4112 B) | Fixed 256-byte flit |
| Framing | STP/END symbols (Gen 1–2) or STP framing tokens (Gen 3–5) per TLP | Flit header (2 B) per 256-byte flit |
| CRC per TLP | Yes — 4-byte LCRC per TLP | No — eliminated |
| Error correction | LCRC detects errors → ACK/NAK replay | RS-FEC corrects errors → no replay needed for FEC-correctable errors |
| Skip ordered sets | Inserted periodically for elastic buffer management | Flit header contains elastic buffer management fields — no separate SKIP symbols |
| ACK/NAK protocol | Required — Data Link Layer ack/nak every TLP (via DLLP) | Simplified — ACK/NAK still present but at flit granularity, not TLP granularity |
| Encoding overhead | 128b/130b: 1.54% overhead | FEC: 5.1% overhead, but flit packing recovers more than 1.54% |
| Backward compatible | All Gen 1–5 links use this | Gen 6 links only — negotiated during link training |
| Software visible | No (Physical Layer detail) | No — TLPs still look identical to software; flit/no-flit is invisible above Physical Layer |
At Gen 5 and Gen 6 speeds, the channel (PCB trace + package trace + connector) causes severe inter-symbol interference (ISI) — a transmitted symbol’s energy spreads into adjacent symbol slots, distorting them. Equalization compensates for this distortion:
| Technique | Where | How it works | Gen 5/6 usage |
|---|---|---|---|
| Tx Pre-Emphasis (FFE) | Transmitter | Feed-forward equalizer. Boosts high-frequency components of the transmitted signal before the channel attenuates them. Controlled by C-1 (pre-cursor), C0 (cursor), C1 (post-cursor) tap coefficients. | 3–5 taps at Gen 5 NRZ. Extended range and more complex coefficient space at Gen 6 PAM4 (must manage four voltage level transitions independently). |
| CTLE | Receiver | Continuous Time Linear Equalizer. Analog filter that boosts high frequencies at the receiver input — inverse of channel frequency response. Passive compensation always on. | Required at Gen 5/6. Higher gain needed at 32/64 GT/s. |
| DFE | Receiver | Decision Feedback Equalizer. Uses previous decoded symbols to subtract their ISI contribution from the current symbol. More powerful than CTLE for severe ISI but adds latency. | Strongly recommended at Gen 5/6. More taps needed than Gen 4. |
| PAM4 Receiver DSP | Receiver | Digital Signal Processing for PAM4 level detection, per-level threshold calibration, and multi-level eye monitoring. | Gen 6 only. Not present in Gen 5 NRZ designs. Gen 5 uses standard NRZ 2-level eye monitoring. |
Equalization coefficients are negotiated during link training (the Gen 3+ equalization procedure, Phases 0 to 3, run in the LTSSM Recovery.Equalization substate). The link partner communicates its receiver’s preferred transmitter coefficients using TS1/TS2 fields. At Gen 5/6 the search space is wider, and finding the optimal equalization operating point takes more iterations.
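The transmitter FFE in the table above is a short FIR filter. A minimal sketch of a 3-tap version applied to an NRZ symbol stream follows; the coefficient values are illustrative only and are not PCIe presets, and `tx_ffe` is a hypothetical name.

```python
def tx_ffe(symbols, c_pre, c_main, c_post):
    """Apply a 3-tap feed-forward equalizer.

    y[n] = c_pre * x[n+1] + c_main * x[n] + c_post * x[n-1]
    Symbols outside the sequence are treated as 0.
    """
    out = []
    for n, current in enumerate(symbols):
        following = symbols[n + 1] if n + 1 < len(symbols) else 0.0
        previous = symbols[n - 1] if n > 0 else 0.0
        out.append(c_pre * following + c_main * current + c_post * previous)
    return out


nrz = [-1, -1, -1, 1, 1, 1, -1]
shaped = tx_ffe(nrz, c_pre=-0.1, c_main=0.7, c_post=-0.2)
print([round(v, 2) for v in shaped])
```

The first symbol after a transition (index 3) is driven at 0.8 of full swing, while a symbol in the middle of a run (index 4) is de-emphasised to 0.4. That boost of transitions relative to runs is how the FFE pre-compensates for the channel's high-frequency loss.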
At Gen 5 and Gen 6 speeds, the PCIe channel insertion loss budget may not accommodate standard add-in card trace lengths (typically 12–20 cm on a motherboard plus a PCIe cable or riser). Retimers and Redrivers extend the channel reach:
| Device type | How it works | Transparent to protocol? | Gen 5/6 requirement |
|---|---|---|---|
| Redriver | Linear amplifier. Boosts the analog signal without regenerating it. Adds gain but does not recover clock or data — still subject to accumulated jitter. Simpler and lower latency. | Fully transparent | May be sufficient for Gen 5 in short reach applications (<30 cm total) |
| Retimer | Full CDR — Clock and Data Recovery. Recovers the clock and data from the incoming signal, regenerates a clean signal from scratch. Eliminates accumulated jitter. PCIe-spec compliant retimers participate in link training and equalization negotiation. | Spec-defined transparent: appears as extending the channel but does not affect BDF addressing or topology | Often required for Gen 5 beyond 30 cm, and for most Gen 6 channel lengths |
PCIe-spec retimers (defined from Gen 3 onwards) are specification-compliant active devices that participate in the LTSSM training. They pass TS1/TS2 ordered sets and handle equalization negotiations on each segment independently, allowing the overall channel to be split into shorter segments each with manageable insertion loss.
PCIe’s most powerful feature across all generations is backward compatibility. A Gen 6 device connected to a Gen 3 system will train the link to Gen 3 speeds (8 GT/s, 128b/130b, no FEC, no flit mode) and operate fully. A Gen 3 device in a Gen 6 system trains the link to Gen 3 speeds. Neither device loses functionality — only bandwidth is limited to the common generation.
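The outcome of speed negotiation can be summarised as "highest generation both partners support". The sketch below models only that outcome, not the LTSSM itself; `GEN_RATES_GT_S` and `negotiated_generation` are hypothetical names.

```python
GEN_RATES_GT_S = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0, 6: 64.0}


def negotiated_generation(supported_a, supported_b):
    """Return the highest generation present in both partners' supported sets."""
    common = set(supported_a) & set(supported_b)
    if 1 not in common:
        # Every PCIe device supports 2.5 GT/s; anything else is a modelling error.
        raise ValueError("both partners must support Gen 1")
    return max(common)


gen6_device = range(1, 7)      # supports Gen 1 through Gen 6
gen3_host = range(1, 4)        # supports Gen 1 through Gen 3

link_gen = negotiated_generation(gen6_device, gen3_host)
print(f"Link runs at Gen {link_gen}: {GEN_RATES_GT_S[link_gen]} GT/s")
```

The same function gives the same answer with the roles reversed, which is the symmetry the paragraph above describes.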
| Generation | GT/s | Modulation | Encoding | FEC | GB/s / lane / dir | x16 BW (bidir) | Year |
|---|---|---|---|---|---|---|---|
| Gen 1 | 2.5 | NRZ | 8b/10b | None | 0.25 | 8 GB/s | 2003 |
| Gen 2 | 5 | NRZ | 8b/10b | None | 0.5 | 16 GB/s | 2007 |
| Gen 3 | 8 | NRZ | 128b/130b | None | ~1 | 32 GB/s | 2010 |
| Gen 4 | 16 | NRZ | 128b/130b | None | ~2 | 64 GB/s | 2017 |
| Gen 5 | 32 | NRZ | 128b/130b | None | ~4 | 128 GB/s | 2019 |
| Gen 6 | 64 | PAM4 | Flit mode | RS-FEC | ~8 | 256 GB/s | 2022 |
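
The x16 column of the table can be reproduced from the line rate and encoding efficiency. In this sketch Gen 6 is entered at its raw rate (efficiency 1), so FEC and flit overhead are deliberately ignored; the table name `GENERATIONS` and the function name are illustrative.

```python
# (generation, GT/s, payload bits, coded bits)
GENERATIONS = [
    ("Gen 1", 2.5, 8, 10),
    ("Gen 2", 5.0, 8, 10),
    ("Gen 3", 8.0, 128, 130),
    ("Gen 4", 16.0, 128, 130),
    ("Gen 5", 32.0, 128, 130),
    ("Gen 6", 64.0, 1, 1),      # raw rate only: flit and FEC overhead not modelled
]


def x16_bidirectional_gb_s(gt_per_s, payload_bits, coded_bits, lanes=16):
    """Aggregate bandwidth of a link in GB/s, summed over both directions."""
    per_lane_per_direction = gt_per_s * payload_bits / coded_bits / 8
    return per_lane_per_direction * lanes * 2


x16_table = {
    name: x16_bidirectional_gb_s(rate, payload, coded)
    for name, rate, payload, coded in GENERATIONS
}
for name, bandwidth in x16_table.items():
    print(f"{name}: {bandwidth:.1f} GB/s")
```

The unrounded Gen 3 to Gen 5 results come out slightly below the table's 32, 64, and 128 GB/s because the table rounds the per-lane rate up to 1, 2, and 4 GB/s.
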
| Item | Value / Rule |
|---|---|
| 8b/10b overhead | 25% — 10 bits sent per 8 data bits. Used in Gen 1 and Gen 2 only. |
| 128b/130b overhead | 1.54% — 2 sync header bits per 128 data bits. Used in Gen 3, Gen 4, and Gen 5 (all NRZ). |
| NRZ definition | Non-Return-to-Zero: 2 voltage levels, 1 bit per symbol. Used in Gen 1–5. |
| PAM4 definition | Pulse Amplitude Modulation-4: 4 voltage levels, 2 bits per symbol (Gray coded). Used in Gen 6 only. Gen 5 uses NRZ. |
| PAM4 Nyquist frequency | Equal to half the symbol rate. Gen 6 PAM4 at 64 GT/s carries 2 bits per symbol, so the symbol rate is 32 GBaud and Nyquist = 16 GHz. Gen 5 NRZ at 32 GT/s also has Nyquist = 16 GHz. This is the elegance of PAM4: Gen 6 doubles the bit rate while keeping the same Nyquist frequency as Gen 5. |
| PAM4 eye penalty | Each of the three eyes is ≈1/3 the height of an NRZ eye at the same symbol rate. Raw BER ≈10⁻⁵ to 10⁻⁶ vs NRZ 10⁻¹² before FEC. |
| RS-FEC purpose (Gen 6) | Correct raw symbol errors from PAM4’s BER floor (~10⁻⁵–10⁻⁶) before they reach the Data Link Layer. Not needed for Gen 5 NRZ which achieves ~10⁻¹² raw BER through equalization alone. |
| Reed-Solomon FEC | Symbol-based error correction. Gen 6 only. RS(272,258): 14 parity symbols, corrects up to 7 symbol errors per codeword. Not used in Gen 5 (NRZ achieves sufficient BER via equalization). |
| RS-FEC overhead (Gen 6 only) | 5.1% — 14 parity symbols per 272-symbol codeword. Gen 5 has no FEC overhead. |
| RS-FEC latency penalty | ~4–8 ns per link hop (Gen 6 only). Accumulates in multi-hop topologies. Increases L0s exit latency vs Gen 1–5. Not present in Gen 5 NRZ. |
| Flit definition | Fixed 256-byte unit of transfer introduced in Gen 6. Multiple TLPs packed per flit. No per-TLP LCRC: error protection (CRC and FEC) is applied per flit. |
| Flit mode visibility | Transparent to software and drivers. TLP format unchanged. Only Physical Layer and Data Link Layer implementation changes. |
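| Backward compatibility | Always maintained. Every link first trains at Gen 1 (2.5 GT/s); the partners then advertise their supported data rates and step up to the highest common speed via a speed change through Recovery. Gen 6 hardware therefore interoperates with any older link partner. |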
| Equalization (Gen 5) | Tx FFE (3–5 taps), CTLE, DFE. NRZ 2-level — more aggressive than Gen 4 but standard NRZ eye monitoring. No PAM4 DSP. |
| Equalization (Gen 6) | Tx FFE (extended range), CTLE, DFE, PAM4 multi-level DSP for 4-level threshold calibration. Three eyes must all be monitored independently. |
| Retimer recommendation | Often required for Gen 5 >30 cm trace, and for most Gen 6 channel lengths. Spec-defined retimers participate in LTSSM training. |
| Gen 6 Physical Layer Cap ID | 0031h: Physical Layer 64.0 GT/s Extended Capability. Reports 64.0 GT/s equalization status and per-lane equalization control. |
| Gen 5 Physical Layer Cap ID | 002Ah — Physical Layer 32.0 GT/s Capability. |
| Gen 7 status | In development at PCI-SIG as of 2024. Target 128 GT/s. PCI-SIG has announced that PCIe 7.0 retains PAM4 signalling and flit mode rather than moving to higher-order PAM levels. |
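
The Reed-Solomon rows in the table follow from standard RS arithmetic: an RS(n, k) code carries n − k parity symbols and corrects up to half that many symbol errors. The sketch below takes the RS(272,258) parameters quoted in the table at face value; `rs_properties` is a hypothetical helper, and the PCIe 6.0 specification remains the normative source for the actual FEC definition.

```python
def rs_properties(n: int, k: int) -> dict:
    """Basic properties of a Reed-Solomon RS(n, k) code.

    n = total symbols per codeword, k = data symbols per codeword.
    """
    parity = n - k
    return {
        "parity_symbols": parity,
        "correctable_symbols": parity // 2,   # t = floor((n - k) / 2)
        "overhead": parity / n,               # fraction of the codeword spent on parity
    }


props = rs_properties(272, 258)
print(props["parity_symbols"], props["correctable_symbols"], f"{props['overhead']:.1%}")
```

This reproduces the 14 parity symbols, the 7-symbol correction capability, and the 5.1% overhead listed above.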