P3. PCI Transaction Models: Programmed I/O, DMA, and Peer-to-Peer

In our previous lectures, we discussed the hardware foundations and bus cycles of the PCI standard. Now, it is time to look at how data actually moves across the bus. PCI utilizes three primary models for data transfer: Programmed I/O (PIO), Direct Memory Access (DMA), and Peer-to-Peer transfers.

Here is a breakdown of how each model operates, its historical context, and its practical efficiency.

1. Programmed I/O (PIO): The CPU Does the Heavy Lifting

Programmed I/O was highly common in the early days of the PC. At the time, device designers were reluctant to add the expense and complexity of dedicated transaction-management logic to their peripherals. Because the central processor was far faster than the peripherals it served, and systems of the day were generally single-tasking anyway, the CPU was assigned all of the work.

  • How it works: If a PCI device needs to move data to system memory, the CPU must read the data from the PCI device into one of its own internal registers, and then copy that register out to memory. Moving data in the other direction requires the exact reverse process.
  • The drawback: PIO is highly inefficient for two major reasons. First, the CPU must generate two complete bus cycles for every single data item transferred. Second, it ties up the processor with tedious data housekeeping rather than freeing it to perform more complex, interesting work. While PIO is no longer the primary method for moving large amounts of data, it remains a necessary model for system software's initial interactions with a device.
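The per-word cost described above can be sketched in C. This is only an illustration: the device's data register is simulated with an ordinary pointer, whereas on real hardware it would be a volatile MMIO address obtained from one of the device's BARs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hedged sketch of PIO: the CPU itself shuttles every word between
 * a device data register and a memory buffer. The register here is
 * simulated; the two-bus-cycle cost per word is the point. */
static size_t pio_read_block(volatile const uint32_t *dev_data_reg,
                             uint32_t *dst, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        /* Bus cycle 1: CPU reads the device register into one of
         * its own internal registers (the local variable tmp). */
        uint32_t tmp = *dev_data_reg;
        /* Bus cycle 2: CPU writes that register out to memory. */
        dst[i] = tmp;
    }
    return nwords;
}
```

Note that the CPU is busy for the entire loop; nothing else can be done with those cycles.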

2. Direct Memory Access (DMA): Offloading the Processor

As system demands grew, the inefficiencies of PIO were no longer acceptable, leading to the adoption of Direct Memory Access (DMA) as the preferred data transfer method.

  • How it works: In this model, the details of a memory transfer are handled by a DMA engine on behalf of the processor. The CPU simply programs a starting memory address and a byte count into the DMA engine, and the engine takes over the bus protocol and address sequencing entirely on its own.
  • The evolution to Bus Masters: Over time, improved integration allowed peripheral designers to build this DMA functionality directly into their devices. These intelligent peripherals, known as “Bus Master” devices, are capable of taking control of the bus and handling their own direct data transfers with system memory.
  • The benefit: Because the CPU is entirely removed from the data movement process, system efficiency improves drastically. Furthermore, an entire block of data can often be moved in a single burst transaction: one address phase followed by many back-to-back data phases.
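A minimal sketch of the DMA programming model follows. The register layout (mem_addr, byte_count, ctrl) is an invented assumption for illustration; every real bus-master device defines its own layout. The engine itself is simulated with a helper function, where real hardware would transfer the block autonomously and typically raise an interrupt on completion.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register layout for a simple DMA engine. */
struct dma_engine {
    uint64_t mem_addr;   /* starting system-memory address */
    uint32_t byte_count; /* number of bytes to transfer    */
    uint32_t ctrl;       /* bit 0 = start                  */
};

/* The CPU's only job: program an address and a count, then set the
 * start bit. The engine handles the bus protocol from here. */
static void dma_start(struct dma_engine *dma, void *mem, uint32_t nbytes)
{
    dma->mem_addr   = (uint64_t)(uintptr_t)mem;
    dma->byte_count = nbytes;
    dma->ctrl      |= 1u; /* go */
}

/* Simulated engine: once started, it moves a block from an internal
 * device buffer to the programmed memory address on its own. */
static void dma_engine_run(struct dma_engine *dma, const uint8_t *dev_buf)
{
    if (dma->ctrl & 1u) {
        memcpy((void *)(uintptr_t)dma->mem_addr, dev_buf, dma->byte_count);
        dma->ctrl &= ~1u; /* clear start bit: transfer complete */
    }
}
```

The contrast with PIO is that dma_start is the CPU's entire involvement; the per-byte copying happens without it.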

3. Peer-to-Peer Transfers: A Great Idea in Theory

Because most modern PCI devices are capable of acting as Bus Masters, an interesting third option exists: peer-to-peer transfers.

  • How it works: One PCI Bus Master can initiate a direct data transfer to another PCI device.
  • The theoretical benefit: Because the entire transaction takes place between two “peers” directly on the PCI bus, it does not involve the CPU or system memory. This leaves the rest of the system completely free to perform other work.
  • The practical reality: Despite its obvious efficiencies, true peer-to-peer communication is rarely used in practice. The main hurdle is that the initiator and the target device rarely use the exact same data format unless they are manufactured by the same vendor. Consequently, the data usually must be routed to system memory first so the CPU can reformat it before it is finally sent to the target device, entirely defeating the purpose of a peer-to-peer transfer.
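To make the idea concrete, here is a hedged sketch in which a bus master's write lands directly in a simulated window standing in for the peer device's BAR range; neither the CPU nor system RAM ever holds the payload. All names are illustrative assumptions, and the sketch deliberately ignores the data-format mismatch that usually forces the bounce through memory in practice.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulated address window of the target device (peer B). On real
 * hardware this would be an MMIO range that B claims on the bus. */
static uint8_t peer_b_window[16];

/* Device A acting as bus master: a peer-to-peer transfer is just a
 * bus-master write whose target address falls inside another
 * device's window instead of system RAM. */
static size_t p2p_write(const uint8_t *a_payload, size_t len)
{
    if (len > sizeof(peer_b_window))
        len = sizeof(peer_b_window);
    memcpy(peer_b_window, a_payload, len); /* peer to peer, CPU idle */
    return len;
}
```

In the common failure case described above, this single step is replaced by three: A writes to system memory, the CPU reformats the data, and a second transfer pushes it to B.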
