Before we touch a single register or draw a single TLP, let’s talk about what this series is, who it’s for, what you’re going to learn, and why PCIe is worth understanding deeply.
PCIe is everywhere. Your GPU, NVMe SSD, network card, AI accelerator: they all speak PCIe. It is the backbone of modern high-performance computing, and yet most engineers who work with it every day only understand their own small slice of it. They know how to write a driver, or how to implement a controller, but if you asked them to explain what happens between a CPU issuing a memory read and the data actually coming back, they’d struggle.
This series fixes that. We’re going to build a complete, layered understanding of PCIe from the physical layer all the way up to software enumeration, the same way the spec builds it, but without the dry formality of reading the spec itself.
RTL / VLSI engineers implementing PCIe controllers, endpoints, or root complexes who want the full protocol picture, not just their layer.
Firmware and driver engineers who configure PCIe but want to understand what the hardware is actually doing underneath.
Students and fresh engineers preparing for interviews or joining teams that work on PCIe-based SoCs, storage, or networking.
You should be comfortable with basic digital logic and have a rough idea of what a bus protocol is. You don’t need to have read the PCIe spec; that’s what this series is for.
PCIe has a clean layered architecture and we’ll follow it bottom-up, then flip to the software side at the end. Five phases:
Why PCIe replaced PCI, the key design goals, topology (root complex, switch, endpoint), and how the three layers stack.
TLP types (memory, I/O, config, message), packet structure, headers, addressing, ordering rules, and completion handling.
DLLPs, ACK/NAK reliability, sequence numbers, replay buffer, and flow control credit types: the link reliability engine.
Lanes, differential signalling, 8b/10b and 128b/130b encoding, link training, and the LTSSM state machine.
Type 0/1 headers, BARs, capability structures, extended config space, enumeration, interrupts, ASPM, and device power states.
AER, SR-IOV, DMA and IOMMU, and the big architectural changes in PCIe 5.0 and 6.0, including the PAM4 signalling, FEC, and flit-based framing introduced with 6.0.
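As a small taste of the physical-layer material listed above, here is a minimal sketch of what the two line encodings cost in bandwidth. The per-lane signalling rates are the standard ones for Gen 1 through Gen 5; the function and variable names are made up for illustration. The numbers account for encoding overhead only, so real throughput is lower once TLP headers, DLLPs, and framing are included.

```python
from fractions import Fraction

# (generation, raw rate per lane in GT/s, payload bits per block, total bits per block)
# Gen 1/2 use 8b/10b encoding; Gen 3/4/5 use 128b/130b.
GENERATIONS = [
    ("Gen1", Fraction(5, 2), 8, 10),
    ("Gen2", Fraction(5), 8, 10),
    ("Gen3", Fraction(8), 128, 130),
    ("Gen4", Fraction(16), 128, 130),
    ("Gen5", Fraction(32), 128, 130),
]


def lane_bandwidth_mb_per_s(raw_gt_per_s, payload_bits, block_bits):
    """Usable bandwidth of one lane in one direction, in MB/s.

    Only line-encoding overhead is removed; packet overhead is ignored.
    """
    usable_gbit_per_s = raw_gt_per_s * Fraction(payload_bits, block_bits)
    return float(usable_gbit_per_s * 1000 / 8)  # Gb/s -> MB/s


for name, rate, payload, block in GENERATIONS:
    mb = lane_bandwidth_mb_per_s(rate, payload, block)
    print(f"{name}: {mb:.1f} MB/s per lane")
```

The takeaway is that 8b/10b spends 20% of the raw bit rate on encoding, while 128b/130b spends about 1.5%, which is why Gen 3 nearly doubles usable bandwidth over Gen 2 without doubling the signalling rate.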
Same approach as the SystemVerilog series: every post covers one topic completely. No “see the spec for details.” If something matters enough to mention, it’s explained right there in the post.
Every post has real packet diagrams, state machine walkthroughs, field-by-field register breakdowns, and practical “what actually happens” explanations, not just definitions.
We also won’t pepper you with spec section numbers or version watermarks. If something changed between Gen 1 and Gen 5, we’ll say so clearly. Otherwise assume it applies broadly.
PCIe is a genuinely complex protocol. The LTSSM alone has 11 states and dozens of substates. The ordering rules for TLPs have enough edge cases to fill a whole post. The power management interactions are subtle.
We’re not going to pretend it’s simple. But we are going to make it clear, which is different. Clear means you understand why each piece exists, not just what it is. That understanding is what makes the hard parts stick.
The first post covers PCIe fundamentals: what problem it solves, how it replaced PCI, and the three-layer architecture that everything else builds on. See you there.
If something in the series is unclear, incomplete, or you want a topic covered that we haven’t planned yet, reach out. This is a living series and the best additions come from real questions from people working on real problems.
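To make the “11 states” figure concrete, here is a rough sketch that simply names the top-level LTSSM states. The class and list names are hypothetical, and the bring-up path shown is the simplified happy path only; it ignores substates, speed changes through Recovery, and every error branch.

```python
from enum import Enum


class LtssmState(Enum):
    """Top-level LTSSM states (substates deliberately omitted)."""
    DETECT = "Detect"
    POLLING = "Polling"
    CONFIGURATION = "Configuration"
    RECOVERY = "Recovery"
    L0 = "L0"
    L0S = "L0s"
    L1 = "L1"
    L2 = "L2"
    DISABLED = "Disabled"
    LOOPBACK = "Loopback"
    HOT_RESET = "Hot Reset"


# Simplified path a link follows from power-on to normal operation.
NORMAL_BRINGUP = [
    LtssmState.DETECT,
    LtssmState.POLLING,
    LtssmState.CONFIGURATION,
    LtssmState.L0,
]

print(len(LtssmState), "top-level states")
print(" -> ".join(state.value for state in NORMAL_BRINGUP))
```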
The VLSI Trainers Team