Why 8b/10b was abandoned at Gen 3, how 128b/130b works — 16-byte data blocks, the 2-bit sync header, block types, framing tokens, the Gen 3 scrambler, block alignment, and how the encoding carries through Gen 4, Gen 5, and into Gen 6’s flit model.
Gen 3 was the first PCIe generation to double bandwidth without doubling frequency. Gen 1 ran at 2.5 GT/s, Gen 2 at 5 GT/s. Simply doubling to 10 GT/s for Gen 3 was considered impractical — the signal conditioning required at 5 GHz Nyquist frequency would demand expensive board materials and aggressive equalization that would price PCIe out of mainstream use.
The solution came from a different direction: keep the frequency increase modest (5 GT/s to 8 GT/s is only 60% more) and reclaim the 20% overhead that 8b/10b encoding had been wasting. The result is Gen 3 at 8 GT/s with ~98.5% efficiency — delivering approximately the same useful throughput as a hypothetical 10 GT/s system with 8b/10b.
Three specific problems with simply doubling frequency drove this decision: higher frequencies require far more expensive PCB laminate materials due to dielectric loss; signal conditioning logic (equalization) at 5 GHz+ is complex and power-hungry; and the existing PCIe connector and board infrastructure designed for 5 GT/s would have needed complete redesign for 10 GT/s signals.
The arithmetic behind 128b/130b is simple but important to understand precisely:
128b/130b groups 128 bits (16 bytes) of data into a single block and prepends a 2-bit sync header. This gives 130 bits on the wire for every 128 bits of payload — hence the name. Unlike 8b/10b which operates symbol-by-symbol, 128b/130b operates block-by-block.
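The overhead arithmetic can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the 10 GT/s 8b/10b case is the hypothetical system mentioned above, not a real PCIe generation.

```python
def encoding_efficiency(payload_bits, total_bits):
    """Fraction of transmitted bits that carry payload."""
    return payload_bits / total_bits


def useful_gbps(line_rate_gt_s, payload_bits, total_bits):
    """Payload rate per lane in Gb/s for a given raw line rate."""
    return line_rate_gt_s * encoding_efficiency(payload_bits, total_bits)


eff_8b10b = encoding_efficiency(8, 10)          # 0.80, i.e. 20% overhead
eff_128b130b = encoding_efficiency(128, 130)    # about 0.985, i.e. ~1.54% overhead

gen3_actual = useful_gbps(8.0, 128, 130)        # about 7.88 Gb/s per lane
gen3_hypothetical = useful_gbps(10.0, 8, 10)    # 8.0 Gb/s per lane

print(f"8b/10b efficiency:    {eff_8b10b:.4f}")
print(f"128b/130b efficiency: {eff_128b130b:.4f}")
print(f"8 GT/s + 128b/130b:   {gen3_actual:.3f} Gb/s")
print(f"10 GT/s + 8b/10b:     {gen3_hypothetical:.3f} Gb/s")
```

The two payload rates differ by under 2%, which is why the lower line rate was an acceptable trade.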
The block boundary is fundamental to how Gen 3 framing works. Unlike 8b/10b where any K-code can appear between data characters, 128b/130b has no K-codes in the middle of data. Instead, the block structure defines what kind of content is present. A receiver must first achieve block alignment — knowing exactly where each 130-bit block starts in the serial bitstream — before it can interpret any data.
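The sync header is always exactly 2 bits, transmitted first in the block. Only two values are defined: 01 marks a data block and 10 marks an ordered set block. The remaining values, 00 and 11, are illegal.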
The sync header divides blocks into two categories with fundamentally different behaviour:
| Property | Data Block (sync=01) | Ordered Set Block (sync=10) |
|---|---|---|
| Content | TLP bytes · DLLP bytes · STP/SDP framing tokens · IDL padding · EDB · EDS tokens | One of the defined ordered sets: TS1, TS2, SKIP, EIEOS, EIOS, SDS, FTS |
| Scrambling | Yes — all 128 payload bits are scrambled with the LFSR | No — sent in the clear, receiver must see exact patterns for alignment |
| Lane requirement | Content may differ per lane (each lane carries its stripe of the packet) | Same on all active lanes simultaneously — required for lane-to-lane deskew |
| Data Stream context | Part of the Data Stream — TLPs and DLLPs flow in data blocks | Interrupts the Data Stream when it is something other than SKIP; SKIP ordered sets may appear within a Data Stream without interrupting it |
| How framed | STP token marks TLP start; SDP marks DLLP start; the length field in the STP token determines where the TLP ends | Entire block is the ordered set, so no separate framing tokens are needed |
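The data/ordered-set split can be expressed as a small classifier. This is a sketch only: it represents a block as a 130-character string of `0`/`1` in wire order, uses the sync header values exactly as this article writes them (01 for data, 10 for ordered set), and the names `parse_block`, `DATA_BLOCK`, and `ORDERED_SET_BLOCK` are made up for illustration.

```python
DATA_BLOCK = "data"
ORDERED_SET_BLOCK = "ordered_set"


def parse_block(bits):
    """Split one 130-bit block into (block type, 128-bit payload).

    bits: string of 130 '0'/'1' characters, first-transmitted bit first.
    """
    if len(bits) != 130 or set(bits) - {"0", "1"}:
        raise ValueError("a block is exactly 130 bits")
    header, payload = bits[:2], bits[2:]
    if header == "01":
        return DATA_BLOCK, payload          # payload is scrambled
    if header == "10":
        return ORDERED_SET_BLOCK, payload   # payload is sent in the clear
    # 00 and 11 are not defined sync headers
    raise ValueError(f"illegal sync header {header!r}")
```

A receiver would only call something like this after block alignment has been achieved, since the first two bits are meaningful only at a true block boundary.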
In 8b/10b, K-codes (STP, SDP, END, EDB) are control characters that mark packet boundaries. In 128b/130b there are no K-codes — framing is done instead by special Framing Tokens embedded within data blocks. These tokens are specific byte patterns that the receiver recognises within the 128-bit payload of a data block.
In 8b/10b, the TLP framing was: STP K-code at start, END K-code at end. The receiver had to wait to see the END to know where the TLP finished. In 128b/130b the STP token includes the TLP’s complete DW count as an 11-bit length field. The receiver can calculate exactly where the TLP ends from the moment it sees the STP. This enables faster cut-through forwarding at switches and simpler receiver logic.
The STP length field also has a 4-bit Frame CRC and an additional parity bit to protect against errors in the length itself — an error in the length field would cause the receiver to misalign on all subsequent packet boundaries until recovery. The triple-bit-flip detection capability of this combined protection makes the length field very robust.
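To illustrate why a length in the STP token removes the need for an END marker, the sketch below computes where a TLP ends from the position of its STP token. It assumes the 11-bit length counts every DW from the start of the STP token to the end of the TLP; that counting convention, and the function name, are assumptions for illustration rather than something stated above.

```python
SYMBOLS_PER_DW = 4       # one DW is 4 bytes (symbols)
MAX_LENGTH_DW = 2**11 - 1  # largest value an 11-bit field can hold


def tlp_end_symbol(stp_symbol_index, length_dw):
    """Return the index of the first symbol after the TLP.

    stp_symbol_index: position of the first STP byte in the data stream
    length_dw: value of the 11-bit length field (assumed to include the token)
    """
    if not 0 < length_dw <= MAX_LENGTH_DW:
        raise ValueError("length must fit in 11 bits and be non-zero")
    return stp_symbol_index + length_dw * SYMBOLS_PER_DW


# A TLP whose STP token starts at symbol 32 and whose length field is 5 DW
# ends at symbol 52, so the receiver expects the next token there.
next_token_at = tlp_end_symbol(32, 5)
```

Because the end position is known as soon as the token is decoded, a corrupted length would shift every later token boundary, which is the failure the Frame CRC and parity bit guard against.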
Every K-code from 8b/10b that had a function in PCIe Gen 1/2 is replaced in Gen 3 by either a framing token or an ordered set mechanism:
| 8b/10b K-code | Function | Gen 3 replacement |
|---|---|---|
| COM (K28.5) | Ordered set start, symbol lock | Block alignment via sync header — receiver finds block boundaries by searching for the valid 01/10 sync header pattern |
| STP (K27.7) | Start of TLP | STP framing token (4 bytes in data block, carrying the embedded length, Frame CRC, parity, and TLP sequence number) |
| SDP (K28.2) | Start of DLLP | SDP framing token (2 bytes in data block) |
| END (K29.7) | End of good packet | Not needed — TLP end is calculated from STP length field. Absence of EDB means packet is good. |
| EDB (K30.7) | End of bad (nullified) packet | EDB framing token (4 bytes = four EDB bytes) appended to nullified TLPs |
| SKP (K28.0) | Clock compensation | SKIP ordered set — still exists but now an ordered set block (sync=10) instead of K-code characters; SKP bytes may be added/removed |
| FTS (K28.1) | L0s exit training | FTS ordered set block — same purpose, now sync=10 block type |
| IDL (K28.3) | Logical idle, electrical idle entry | IDL framing token for logical idle; EIOS ordered set block for electrical idle entry |
| PAD (K23.7) | Lane padding on wide links | IDL framing tokens fill unused space in data blocks |
| EIE (K28.7) | Electrical idle exit | EIEOS ordered set block (same purpose, now sync=10) |
In 8b/10b, scrambling was optional (a bit in the training sequence could disable it). In Gen 3, scrambling is mandatory and cannot be disabled. This is because 128b/130b encoding no longer guarantees transition density or DC balance by itself — it only provides the 2-bit sync header. The scrambler is the only mechanism that prevents long runs of the same bit value in the 128-bit payload, ensures sufficient transitions for CDR (Clock and Data Recovery), and maintains DC balance across the link.
The Gen 3 scrambler uses a 23-bit Linear Feedback Shift Register (LFSR), significantly more complex than the 16-bit LFSR used in Gen 1/2. The generator polynomial is x²³ + x²¹ + x¹⁶ + x⁸ + x⁵ + x² + 1. Each lane has its own independent LFSR, seeded with a different initial value per lane. This ensures that even if all lanes carry identical data bytes, the scrambled bitstreams on adjacent lanes are different, preventing crosstalk from becoming coherent.
| Property | Gen 1/2 scrambler | Gen 3+ scrambler |
|---|---|---|
| LFSR length | 16 bits | 23 bits |
| Generator polynomial | x¹⁶ + x⁵ + x⁴ + x³ + 1 | x²³ + x²¹ + x¹⁶ + x⁸ + x⁵ + x² + 1 |
| Reset/resync trigger | Every COM K-code resets all lanes’ LFSRs simultaneously | Every EIEOS ordered set resets all lanes to defined per-lane seeds |
| Per-lane seeding | Same seed on all lanes | Different seed per lane — intentional scrambling diversity |
| Disable option | Yes — “disable scrambling” bit in TS1/TS2 | No — cannot be disabled at 8 GT/s or higher |
| What is scrambled | Data bytes before 8b/10b encoding · K-codes not scrambled | All data block payload bytes · ordered set blocks not scrambled |
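The per-lane scrambling idea can be sketched as an additive scrambler: an LFSR produces a keystream that is XORed with the payload, and the receiver runs the same LFSR from the same seed to undo it. The code below is a generic illustration, not the specification's reference implementation. The tap exponents are passed in by the caller, the bit ordering (MSB of the register as output, LSB-first within each byte) is an assumption, and real per-lane seed values should be taken from the PCIe specification.

```python
def lfsr_step(state, taps, width=23):
    """Advance a Fibonacci LFSR by one bit.

    taps: exponents of the generator polynomial's non-constant terms,
          e.g. (3, 2) for x^3 + x^2 + 1.
    """
    feedback = 0
    for t in taps:
        feedback ^= (state >> (t - 1)) & 1
    return ((state << 1) | feedback) & ((1 << width) - 1)


def scramble(data, seed, taps, width=23):
    """XOR data bytes with the LFSR keystream; the same call descrambles."""
    state = seed & ((1 << width) - 1)
    out = bytearray()
    for byte in data:
        key = 0
        for bit in range(8):
            # The register's top bit is the keystream output (assumed ordering).
            key |= ((state >> (width - 1)) & 1) << bit
            state = lfsr_step(state, taps, width)
        out.append(byte ^ key)
    return bytes(out)
```

Calling `scramble` twice with the same seed and taps returns the original bytes, and calling it with two different seeds on identical payloads yields different outputs, which is the lane diversity property described above. Resetting `state` to the seed corresponds to the EIEOS resynchronization in the table.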
Before a Gen 3 receiver can decode any data, it must achieve block alignment — determining exactly which bit in the incoming serial stream is the first bit of each 130-bit block. This replaces the symbol lock that 8b/10b achieved using the COM character’s unique pattern.
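The procedure for achieving block alignment is a search: the receiver looks for valid 01/10 sync headers recurring at 130-bit intervals in the incoming bitstream and treats the link as aligned once they appear consistently.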
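A sync-header search of the kind described in this article can be sketched as follows, in the style of the block-lock search used by 64b/66b receivers. This is a simplified illustration under stated assumptions: the bitstream is a string of `0`/`1`, the function name and the `blocks_to_check` threshold are invented here, and a real PCIe receiver also relies on recognising the unscrambled EIEOS pattern rather than sync headers alone.

```python
VALID_HEADERS = ("01", "10")


def find_block_alignment(bits, blocks_to_check=32):
    """Return the bit offset (0..129) at which blocks start, or None.

    An offset is accepted only if every one of `blocks_to_check`
    consecutive 130-bit blocks begins with a valid sync header.
    """
    for offset in range(130):
        aligned = True
        for n in range(blocks_to_check):
            start = offset + n * 130
            if bits[start:start + 2] not in VALID_HEADERS:
                aligned = False
                break
        if aligned:
            return offset
    return None
```

Scrambled payload looks random, so a wrong offset sees a valid header only about half the time per block and is rejected quickly as more blocks are checked. Unscrambled, repetitive payload could imitate a header at a fixed position, which is one more reason scrambling is mandatory.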
On a multi-lane link, the parallel bitstreams on different lanes travel slightly different path lengths and through slightly different electrical characteristics. They arrive at the receiver at slightly different times — lane-to-lane skew. The receiver must re-align all lanes before reassembling the packet byte stream.
In 8b/10b, the COM K-code served as the deskew reference — it appeared on all lanes simultaneously. In Gen 3, COM no longer exists. Deskew is performed using ordered set blocks, which must be transmitted simultaneously on all lanes. Any ordered set can serve as the deskew marker, but SKIP (SOS), SDS (Start Data Stream), and EIEOS are most commonly used because they appear regularly.
| Property | Gen 1/2 | Gen 3 |
|---|---|---|
| Deskew reference | COM K-code (K28.5) detected on all lanes simultaneously | Ordered set blocks (SKIP/SDS/EIEOS) appearing simultaneously on all lanes |
| Max receivable skew | 20 ns (Gen 1) = 5 symbol times at 4 ns per symbol; 8 ns (Gen 2) = 4 symbol times at 2 ns per symbol | 6 ns = 6 symbol times at 1 ns per symbol |
| Mechanism | Delay early-arriving COM characters until all lanes are in sync | Delay early-arriving ordered set blocks until all lanes show the ordered set simultaneously |
| Ongoing deskew | Every COM character provides an opportunity for adjustment | SKIP ordered sets (SOS) sent every 370–375 blocks provide regular adjustment opportunities |
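The deskew mechanism in the table amounts to delaying every lane until it lines up with the slowest one. The sketch below assumes the receiver has already timestamped the arrival of the same ordered set block on each lane; the function name and the use of nanosecond timestamps are illustrative, and the 6 ns default is the Gen 3 limit from the table.

```python
def deskew_delays(arrival_times_ns, max_skew_ns=6.0):
    """Return the delay to apply to each lane so all lanes align.

    arrival_times_ns: arrival time of the same ordered set on each lane.
    Raises ValueError if the lane-to-lane skew exceeds the supported limit.
    """
    latest = max(arrival_times_ns)
    skew = latest - min(arrival_times_ns)
    if skew > max_skew_ns:
        raise ValueError(f"skew of {skew} ns exceeds {max_skew_ns} ns")
    # Early lanes wait for the latest lane; the latest lane gets zero delay.
    return [latest - t for t in arrival_times_ns]


# Four lanes seeing the same ordered set at slightly different times.
delays = deskew_delays([0.0, 1.5, 3.0, 0.5])
```

In hardware the delays would be whole symbol times implemented with per-lane FIFOs rather than floating-point values.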
Clock tolerance compensation still works in Gen 3 via SKIP ordered sets (SOS), but the mechanism differs from 8b/10b. In Gen 1/2, the transmitter could insert SKP K-codes at fairly fine granularity within the bitstream. In Gen 3, insertion and deletion happen at block boundaries in multiples of 4 SKP symbols (bytes) per SOS.
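A short calculation shows why a 4-symbol adjustment per SOS is enough at the stated SOS interval. It assumes a worst-case clock difference of 600 ppm between the two link partners (each reference clock within ±300 ppm); that tolerance figure is an assumption brought in for the example and does not cover separate-clock spread-spectrum configurations.

```python
def worst_case_drift_symbols(blocks_between_sos, ppm_difference,
                             bits_per_block=130, bits_per_symbol=8):
    """Symbols of drift accumulated between two SKIP ordered sets."""
    unit_intervals = blocks_between_sos * bits_per_block
    drift_bits = unit_intervals * ppm_difference * 1e-6
    return drift_bits / bits_per_symbol


# Longest allowed gap between SKIP ordered sets, worst-case clock mismatch.
drift = worst_case_drift_symbols(blocks_between_sos=375, ppm_difference=600)
print(f"drift between SOS: {drift:.2f} symbols")  # stays below 4
```

Under these assumptions the drift stays just under 4 symbols, so adding or removing one group of 4 SKP symbols per SOS keeps the elastic buffer from overflowing or underflowing.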
Gen 4 (16 GT/s) and Gen 5 (32 GT/s) both continue to use 128b/130b encoding. The block structure, sync header values, data/ordered set distinction, framing tokens, and scrambler mechanism are all carried forward unchanged. What changes generation-to-generation is the raw bit rate and the equalization requirements.
| Generation | Data rate | Encoding | Useful BW per lane | Key additions vs Gen 3 |
|---|---|---|---|---|
| Gen 3 | 8 GT/s | 128b/130b | ~984 MB/s | Introduction of 128b/130b, mandatory scrambling, 3-tap Tx FIR |
| Gen 4 | 16 GT/s | 128b/130b | ~1.97 GB/s | Wider Tx FIR coefficient range, stricter eye mask, retimer support |
| Gen 5 | 32 GT/s | 128b/130b | ~3.94 GB/s | Tighter coefficient resolution, adaptive equalization, optional precoding |
| Gen 6 | 64 GT/s (PAM4) | Flit + FEC | ~7.5 GB/s | PAM4 modulation, mandatory FEC, 256-byte flit framing, no 128b/130b |
For Gen 4 and Gen 5, the encoding overhead and block structure are identical to Gen 3. The equalization training (Recovery.Equalization state in the LTSSM) runs more iterations with a larger coefficient search space, and the physical channel requirements become tighter — lower-loss laminates, tighter impedance control, and more aggressive equalization are all needed.
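Since the encoding is identical across these generations, the per-lane figures in the table follow from one formula. The sketch below reproduces them; it counts only the 128b/130b overhead and ignores packet-level overhead such as framing tokens, headers, and SKIP ordered sets, so real payload throughput is somewhat lower.

```python
LINE_RATES_GT_S = {"Gen 3": 8.0, "Gen 4": 16.0, "Gen 5": 32.0}


def lane_mb_per_s(line_rate_gt_s):
    """Per-lane rate in MB/s after 128b/130b encoding."""
    bits_per_s = line_rate_gt_s * 1e9 * 128 / 130
    return bits_per_s / 8 / 1e6


per_lane = {gen: lane_mb_per_s(rate) for gen, rate in LINE_RATES_GT_S.items()}
for gen, mbps in per_lane.items():
    print(f"{gen}: {mbps:8.1f} MB/s per lane, {mbps * 16 / 1000:6.2f} GB/s for x16")
```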
Gen 6 does not use 128b/130b encoding. The switch to PAM4 modulation at 32 Gbaud creates challenges that 128b/130b was not designed to handle. Instead, Gen 6 uses a completely different approach: flit-based framing with mandatory FEC.
PAM4’s reduced eye opening means bit errors occur far more frequently than with NRZ. At 10⁻⁶ raw BER, a Gen 6 x16 link would see roughly one bit error every microsecond without FEC, and each one would trigger an ACK/NAK replay that costs far more bandwidth than the FEC overhead it avoids. 128b/130b has no error correction: it can detect a bad sync header but cannot fix it. FEC is the only practical solution at PAM4 error rates.
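The "one error every microsecond" figure follows directly from the link's aggregate bit rate. A minimal check, using the x16 width, 64 Gb/s per lane, and 10⁻⁶ raw BER quoted above (the function name is illustrative):

```python
def mean_time_between_errors_s(lanes, bits_per_s_per_lane, raw_ber):
    """Average time between raw bit errors across the whole link."""
    errors_per_second = lanes * bits_per_s_per_lane * raw_ber
    return 1.0 / errors_per_second


interval_s = mean_time_between_errors_s(lanes=16,
                                        bits_per_s_per_lane=64e9,
                                        raw_ber=1e-6)
print(f"{interval_s * 1e6:.2f} microseconds between raw bit errors")
```

The result is just under one microsecond, so a replay-only error strategy would be invoked on the order of a million times per second.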
Additionally, 128b/130b’s 130-bit block size is too fine-grained for efficient FEC coding. Gen 6 protects each 256-byte (2048-bit) flit with a lightweight FEC and a CRC, and a flit is more than 15 times larger than a 130-bit block. Flit-based framing was designed so that the FEC and CRC always operate on the same fixed-size unit, which lets the error-protection overhead be spread over a large block.
| Item | Value / Rule |
|---|---|
| Reason for change | 8b/10b’s 20% overhead made doubling Gen 2’s bandwidth impossible at a reasonable frequency increase. Dropping 8b/10b and using 128b/130b delivers nearly the same useful throughput at 8 GT/s as 10 GT/s with 8b/10b would have. |
| Block structure | 130 bits total: 2-bit sync header + 128-bit payload (16 bytes) |
| Overhead | 2/130 = ~1.54% — vs 20% for 8b/10b |
| Sync header: 01 | Data block — payload contains TLPs, DLLPs, framing tokens, IDL. All bytes scrambled. |
| Sync header: 10 | Ordered set block — payload contains ordered set pattern. Not scrambled. Same on all lanes. |
| Sync header: 00 or 11 | Illegal — block alignment error. Link must retrain. |
| Framing tokens | STP (with length+CRC) · SDP · EDB (4 bytes) · EDS · IDL — embedded inside data blocks |
| STP improvement | Includes 11-bit TLP DW count + 4-bit Frame CRC + parity. Receiver knows where TLP ends from start, without waiting for END K-code. |
| No END K-code | TLP end determined by STP length field. If no EDB follows, packet is assumed good. |
| Scrambling | Mandatory at Gen 3 and above — cannot be disabled. Only mechanism for DC balance and transition density. |
| Gen 3 LFSR | 23-bit, polynomial x²³+x²¹+x¹⁶+x⁸+x⁵+x²+1. Different seed per lane. Reset on EIEOS ordered set. |
| Block alignment | Receiver finds 130-bit block boundaries by searching for valid 01/10 sync headers at 130-bit intervals. Replaces 8b/10b symbol lock via COM. |
| Deskew reference | Ordered set blocks (SKIP/SDS/EIEOS) appearing simultaneously on all lanes. Replaces COM-based deskew of Gen 1/2. |
| SKIP ordered set | Every 370–375 data blocks. Sync=10 block. Receiver adds/removes 4 SKP bytes for clock compensation. No consecutive SOS blocks allowed at Gen 3. |
| Gen 4 and Gen 5 | Same 128b/130b block structure, same tokens, same scrambler approach. Only line rate and equalization complexity change. |
| Gen 6 | Does not use 128b/130b. PAM4 with 256-byte flit framing and mandatory FEC replaces it. FEC, backed by CRC and replay, is designed to cope with a raw BER around 10⁻⁶. |
| Generations using 128b/130b | Gen 3 (8 GT/s), Gen 4 (16 GT/s), Gen 5 (32 GT/s) |