To effectively communicate with hardware, system software needs a way to access a device’s internal registers and storage locations to control its behavior, check its status, or deliver data. To make this possible, these internal device locations must be assigned specific addresses from one of the address spaces supported by the system.
Here is a breakdown of how devices utilize legacy IO space, the shift to Memory-Mapped IO (MMIO), and the critical differences between Prefetchable and Non-Prefetchable memory.
The Shift from Legacy IO to Memory-Mapped IO (MMIO)
In the early days of personal computers, software accessed a device’s internal registers primarily through a dedicated IO address space defined by Intel. However, IO space has real limitations (on x86 it is only 64 KB in size and can be reached only through the dedicated IN and OUT instructions), so hardware and software vendors moved away from it.
Instead, designers began mapping these internal device registers directly into the system’s memory address space, a practice known as Memory-Mapped IO (MMIO).
To ease the transition, it became common practice for devices to map the same set of registers into both IO address space and MMIO. This dual mapping let legacy software keep using IO space while newer software used MMIO. Today, the PCI Express specification discourages the use of IO address space, noting that it is supported only for legacy compatibility and may be deprecated in a future revision. Modern PCIe endpoints designed without legacy constraints operate purely as MMIO devices.
Deep Dive: Prefetchable vs. Non-Prefetchable Memory
When a device requests MMIO space, it must specify whether that space should be Prefetchable (P-MMIO) or Non-Prefetchable (NP-MMIO). The distinction between the two rests primarily on how the device reacts when its memory is read.
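In PCI and PCIe, a device makes this request through the read-only low bits of a Base Address Register (BAR): bit 0 selects IO versus memory space, bits 2:1 give the memory decoder width, and bit 3 is the Prefetchable flag. The sketch below decodes those bits from a raw 32-bit BAR value. The names `BarInfo` and `decode_bar` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BarInfo:
    """Illustrative container for the decoded low bits of a BAR."""
    is_io: bool
    is_64bit: bool
    prefetchable: bool


def decode_bar(raw: int) -> BarInfo:
    """Decode the space-type bits of a raw 32-bit BAR value."""
    if raw & 0x1:
        # Bit 0 set: the device is requesting IO space.
        # The width and prefetchable bits do not apply to IO BARs.
        return BarInfo(is_io=True, is_64bit=False, prefetchable=False)

    mem_type = (raw >> 1) & 0x3        # 0b00 = 32-bit decoder, 0b10 = 64-bit decoder
    prefetchable = bool(raw & 0x8)     # bit 3: Prefetchable flag
    return BarInfo(is_io=False, is_64bit=(mem_type == 0b10), prefetchable=prefetchable)
```

For instance, `decode_bar(0xFE00000C)` reports a 64-bit prefetchable memory request, while `decode_bar(0xF0000000)` reports a 32-bit non-prefetchable one.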
Understanding Read Side-Effects
To understand prefetching, we must first understand read side-effects. A read side-effect occurs when the simple act of reading a memory location fundamentally changes the state of the target device.
For example, imagine a memory-mapped status register that is designed to automatically clear itself the moment it is read. This hardware trick saves the software programmer from having to execute a separate “clear” command. Because reading this location alters the data, it possesses a read side-effect and must be classified as Non-Prefetchable memory.
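The clear-on-read behavior described above can be modeled in a few lines. This is a software simulation rather than real register access, and the class and method names (`ReadToClearStatus`, `hardware_sets`) are made up for the example.

```python
class ReadToClearStatus:
    """Toy model of a status register that clears itself when read."""

    def __init__(self) -> None:
        self._bits = 0

    def hardware_sets(self, mask: int) -> None:
        # The device latches one or more event bits.
        self._bits |= mask

    def read(self) -> int:
        # Reading returns the pending bits AND clears them:
        # this state change is the read side-effect.
        value = self._bits
        self._bits = 0
        return value


status = ReadToClearStatus()
status.hardware_sets(0b101)
first = status.read()    # sees the pending bits
second = status.read()   # sees nothing: the first read consumed them
```

The second read returns zero even though no software ever issued an explicit clear, which is exactly why such a location cannot be read speculatively.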
The Rules of Prefetchable Space
By contrast, Prefetchable space has two strict attributes: reads do not have side effects, and write merging is allowed.
Because reading this memory does not alter its state, the system is allowed to speculatively “prefetch” data ahead of time. If a requester asks to read 128 bytes, the target might proactively gather the next 128 bytes as well, anticipating that the requester will want them soon.
- If the data is needed: The follow-up read completes sooner because the data is already waiting in a buffer.
- If the data is NOT needed: The system can safely discard the prefetched bytes to free up buffer space. Because the memory has no read side-effects, the original data remains perfectly intact at the source and can be fetched again later if necessary.
(Note: If a system accidentally prefetched data from a register with read side-effects, discarding the unused data would permanently destroy it, making recovery impossible.)
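The difference between the two cases can be shown with a small simulation. It is only a sketch under simplifying assumptions: memory is a list of integers, the "bridge" is a single function that fetches extra cells and throws the unused ones away, and all names (`PlainMemory`, `ClearOnReadMemory`, `bridge_read`) are hypothetical.

```python
class PlainMemory:
    """Prefetchable target: reading never changes the contents."""

    def __init__(self, data):
        self._cells = list(data)

    def read(self, addr, count):
        return self._cells[addr:addr + count]


class ClearOnReadMemory(PlainMemory):
    """Target with a read side-effect: every cell read is cleared."""

    def read(self, addr, count):
        values = self._cells[addr:addr + count]
        self._cells[addr:addr + count] = [0] * len(values)
        return values


def bridge_read(target, addr, count, prefetch=0):
    """Fetch `count` cells plus `prefetch` extra, then discard the extras."""
    fetched = target.read(addr, count + prefetch)
    return fetched[:count]  # the prefetched tail is simply dropped


safe = PlainMemory([10, 20, 30, 40])
bridge_read(safe, 0, 2, prefetch=2)        # prefetches cells 2..3, discards them
safe_later = bridge_read(safe, 2, 2)       # data is still intact at the source

unsafe = ClearOnReadMemory([10, 20, 30, 40])
bridge_read(unsafe, 0, 2, prefetch=2)      # prefetch silently consumes cells 2..3
unsafe_later = bridge_read(unsafe, 2, 2)   # the original values are gone
```

Here `safe_later` still holds `[30, 40]`, whereas `unsafe_later` holds `[0, 0]`: the discarded prefetch destroyed data that the requester never saw.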
Why Does This Distinction Still Exist?
The concept of prefetchable memory was vastly more important in legacy PCI architectures than it is in PCIe. In legacy PCI, read transactions did not provide a byte count in advance. When a read had to cross a bridge, the bridge had to blindly guess how much data to gather; guessing wrong introduced massive latency, making the ability to safely prefetch data crucial for bus performance.
Because a PCIe read request states its exact transfer size in the packet header, this guesswork is no longer necessary. The distinction between Prefetchable and Non-Prefetchable space is nevertheless carried forward into PCIe to maintain backward compatibility with legacy PCI software.
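As a rough illustration of how the size travels with the request: in a PCIe request header, the Length field occupies the low 10 bits of the first header doubleword (DW) and counts 4-byte doublewords, with a value of zero encoding 1024 DW. The sketch below assumes the first header DW has already been assembled into an integer with the Fmt field in its top bits, and it ignores the byte enables that can trim the first and last DW. The function name `tlp_length_bytes` is hypothetical.

```python
def tlp_length_bytes(dw0: int) -> int:
    """Return the nominal transfer size in bytes encoded in header DW0.

    Assumes `dw0` is the first header doubleword as a 32-bit integer,
    with the 10-bit Length field in bits 9:0. Byte enables are ignored.
    """
    length_dw = dw0 & 0x3FF      # Length[9:0], in doublewords
    if length_dw == 0:
        length_dw = 1024         # a zero Length encodes the maximum, 1024 DW
    return length_dw * 4         # one doubleword is 4 bytes


# A read with Length = 0x20 (32 DW) asks for exactly 128 bytes.
requested = tlp_length_bytes(0x00000020)
```

A bridge or completer that sees this header knows up front that 128 bytes are wanted, so it has no need to guess how far to read ahead.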
