Up until now, we have discussed how packets are formed, prioritized, and delivered. But what happens when multiple transactions are moving through the PCIe fabric at the exact same time? How does the system decide which packet needs to stay in line, and which one is allowed to pass?
In Lecture 17, we are diving into Transaction Ordering Fundamentals to understand the strict rules governing PCIe traffic and the different ordering models that keep the system running efficiently.
Part 1: Why Impose Ordering Rules?
In PCI Express, ordering rules are strictly imposed on transactions that share the same Traffic Class (TC) as they move through the fabric. Transactions with different TCs have no ordering relationship with one another at all and may be freely reordered, as if they belonged to unrelated applications.
For packets within the same TC (and consequently the same Virtual Channel), PCIe enforces ordering rules for four critical reasons:
- Maintaining Legacy Compatibility: PCIe must remain backward compatible with legacy buses like PCI, PCI-X, and AGP.
- Ensuring Determinism: The system must guarantee that transactions complete sequentially, exactly in the order intended by the programmer.
- Avoiding Deadlocks: Without strict rules, certain traffic patterns could cause a complete system freeze (deadlock) where packets endlessly wait for one another.
- Maximizing Performance: Intelligent ordering rules help maximize throughput by minimizing read latencies and carefully managing read/write execution.
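The scope rule above can be captured in a few lines. The sketch below is a toy model, not spec-defined code: the `Tlp` class and `must_order` helper are illustrative names, and the only fact they encode is that ordering rules apply solely between packets carrying the same TC.

```python
from dataclasses import dataclass

@dataclass
class Tlp:
    seq: int            # arrival order at this point in the fabric
    traffic_class: int  # TC0..TC7

def must_order(a: Tlp, b: Tlp) -> bool:
    """Ordering rules apply only to packets in the same TC (hence same VC)."""
    return a.traffic_class == b.traffic_class

# Packets in different TCs may be freely reordered relative to each other.
write_tc0 = Tlp(seq=1, traffic_class=0)
read_tc0  = Tlp(seq=2, traffic_class=0)
isoch_tc7 = Tlp(seq=3, traffic_class=7)

assert must_order(write_tc0, read_tc0)       # same TC: rules apply
assert not must_order(write_tc0, isoch_tc7)  # different TC: unrelated
```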
Part 2: The Three Ordering Models
To balance strict adherence to the programmer’s intent with the need for high-speed performance, the PCIe specification defines three general models for ordering transactions:
1. Strong Ordering
PCI Express inherently requires Strong Ordering for transactions that share the same Traffic Class. Because transactions with the same TC are mapped to the same Virtual Channel (VC), they follow the same rules and are generally handled sequentially. In a purely strong-ordered model, packets strictly wait their turn.
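A purely strong-ordered queue behaves like a strict FIFO: nothing leaves until everything ahead of it has left. The toy model below (class and method names are my own, not from the spec) shows the key consequence described above: if the head of the queue is blocked, every packet behind it waits too.

```python
from collections import deque

class StrongOrderedQueue:
    """Toy model: packets leave strictly in arrival order."""

    def __init__(self):
        self._q = deque()

    def push(self, pkt):
        self._q.append(pkt)

    def pop_ready(self, can_send) -> list:
        """Drain from the head only while the head packet can be sent."""
        sent = []
        while self._q and can_send(self._q[0]):
            sent.append(self._q.popleft())
        return sent

q = StrongOrderedQueue()
for p in ("A", "B-blocked", "C"):
    q.push(p)

# "B-blocked" cannot be sent, so "C" must wait even though it is ready.
sent = q.pop_ready(lambda p: "blocked" not in p)
assert sent == ["A"]
```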
2. Weak Ordering
The problem with strictly maintaining strong ordering is that it can create massive traffic jams. If a transaction is blocked due to dependencies (such as a full receive buffer), every single transaction behind it is also blocked. Weak Ordering solves this by keeping transactions in sequence by default, but permitting reordering where it is safe: if unrelated transactions are stuck behind a blocked packet, they are allowed to bypass the bottleneck, preventing a total system stall.
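Extending the previous toy model, a weakly ordered arbitration pass might look like the sketch below. The `Pkt` fields and `drain_weak` function are illustrative assumptions, not spec terminology; the point is only that a packet with no dependency on earlier traffic may bypass a blocked head, while a dependent packet still waits.

```python
from dataclasses import dataclass

@dataclass
class Pkt:
    name: str
    blocked: bool = False    # e.g. destination buffer is full
    dependent: bool = False  # depends on earlier packets completing first

def drain_weak(queue: list) -> tuple:
    """One arbitration pass: unrelated packets may bypass a blocked head."""
    sent, waiting = [], []
    for p in queue:
        if p.blocked or (p.dependent and waiting):
            waiting.append(p)  # must wait its turn
        else:
            sent.append(p)     # safe to bypass the bottleneck
    return sent, waiting

queue = [Pkt("W1", blocked=True), Pkt("R1", dependent=True), Pkt("W2")]
sent, waiting = drain_weak(queue)
assert [p.name for p in sent] == ["W2"]          # unrelated W2 bypasses
assert [p.name for p in waiting] == ["W1", "R1"]  # dependent R1 waits
```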
3. Relaxed Ordering (RO)
Relaxed Ordering takes performance optimization a step further by allowing transactions to be completely reordered, but only under tightly controlled conditions. If a Requester knows for a fact that a specific transaction has no dependencies on previously sent transactions, software can set the Relaxed Ordering (RO) attribute bit in the packet’s header. When switches or the Root Complex see this bit, they have permission to safely route the packet ahead of older blocked traffic. The major benefit is significantly improved performance, though it adds software overhead, since the attribute must be explicitly enabled on a per-transaction basis.
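As a concrete illustration of setting the RO attribute: in the TLP header, the Attr[1] (Relaxed Ordering) bit sits in bit 5 of header byte 2, alongside TD, EP, No Snoop, and the upper Length bits. The sketch below assumes a plain byte-array view of the header; the constant names and helper function are my own, not from any driver API.

```python
# Byte 2 of the TLP header: TD | EP | Attr[1:0] | AT | Length[9:8]
TLP_ATTR_BYTE = 2
RO_BIT = 1 << 5  # Attr[1]: Relaxed Ordering

def set_relaxed_ordering(header: bytearray) -> bytearray:
    """Mark a TLP as safe to reorder past older traffic in the same TC/VC."""
    header[TLP_ATTR_BYTE] |= RO_BIT
    return header

hdr = bytearray(16)        # 4DW memory request header, zeroed for the sketch
set_relaxed_ordering(hdr)
assert hdr[TLP_ATTR_BYTE] & RO_BIT  # switches may now pass this packet ahead
```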
