P4.2 Base Address Registers (BARs): Negotiating Device Memory and IO Space

To function properly, a PCIe device must allow the system’s software to read and write to its internal registers and storage locations. However, in a plug-and-play architecture like PCIe, a device cannot simply demand or assume a specific address on its own; the system software (like the BIOS or OS kernel) acts as the ultimate authority that manages and allocates the system’s memory and IO maps.

To negotiate this setup, devices use Base Address Registers (BARs) located within their configuration space headers. Standard Endpoints (Type 0 headers) have six 32-bit BARs available, while Bridge devices (Type 1 headers) have two.
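As a small illustration of where these registers live, the sketch below lists the configuration-space byte offsets of the BARs for each header layout. The offsets (BAR0 at 0x10, each BAR four bytes wide) follow the standard PCI configuration header; the names `BAR0_OFFSET`, `BAR_COUNT`, and `bar_offsets` are made up for this example.

```python
BAR0_OFFSET = 0x10   # first BAR in both Type 0 and Type 1 headers
BAR_WIDTH = 4        # each BAR is one 32-bit (4-byte) register

# Number of BARs per header layout (Header Type register, bits 6:0).
BAR_COUNT = {0x00: 6, 0x01: 2}


def bar_offsets(header_type):
    """Return the config-space byte offsets of the BARs for a header type."""
    layout = header_type & 0x7F        # bit 7 is the multi-function flag
    count = BAR_COUNT.get(layout, 0)   # other layouts are not modelled here
    return [BAR0_OFFSET + i * BAR_WIDTH for i in range(count)]
```

For a Type 0 header this yields offsets 0x10 through 0x24; for a Type 1 header, only 0x10 and 0x14.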

Here is a look at how device designers use BARs to communicate their hardware’s requirements, and how system software programs them.

The Hardware Designer’s Role: Requesting Space

The process begins with the hardware designer. The designer knows exactly how much address space the device’s internal functions require, as well as how the hardware behaves when those addresses are read. Based on this, the designer must tell the system what type and size of address space to allocate.

The designer accomplishes this by hard-coding the lowest bits of the BAR:

  • Type of Space: The very lowest bits indicate the kind of space requested. Bit 0 selects Memory-Mapped IO (MMIO, value 0) or legacy IO space (value 1). For a memory BAR, bits 2:1 indicate whether the device can be mapped with a 32-bit or a 64-bit address, and bit 3 indicates whether the memory is Prefetchable (reads have no side effects) or Non-Prefetchable (reads do have side effects).
  • Size of Space: Above the type bits, the designer hard-codes a run of the lower address bits to 0. The number of hard-wired zero bits sets the size of the block the device needs, so the size is always a power of two and the block is naturally aligned to its own size.

Because these lower bits are hard-coded into the silicon, they are strictly read-only and cannot be altered by the system software.
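The encoding of those low bits can be sketched as a small decoder. This is a minimal illustration rather than driver code: the function name `decode_bar_type` and the dictionary keys are invented for the example, while the bit positions (bit 0 for memory versus IO, bits 2:1 for address width, bit 3 for prefetchable) follow the standard BAR layout.

```python
def decode_bar_type(bar_value):
    """Decode the hard-wired type bits of a 32-bit BAR value."""
    if bar_value & 0x1:
        # Bit 0 set: the device is requesting legacy IO space.
        return {"space": "io"}

    # Bit 0 clear: memory-mapped IO. Bits 2:1 encode the address width.
    mem_type = (bar_value >> 1) & 0x3
    return {
        "space": "memory",
        "is_64bit": mem_type == 0b10,             # 0b00 means 32-bit
        "prefetchable": bool(bar_value & 0x8),    # bit 3
    }
```

For example, a value ending in 0xC describes a 64-bit prefetchable memory request, while a value ending in 0x1 describes an IO request.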

The Software’s Role: A Three-Step Programming Process

During system boot (enumeration), the configuration software must evaluate every BAR to determine what the device wants, and then assign it a specific location in the system map. This is done through a strict three-step process:

1. Write All 1s: First, the software writes all 1s to the entire 32-bit BAR. Because the designer hard-coded the lower bits to indicate size and type, those specific bits will simply ignore the write and retain their hard-coded values. Only the upper bits will successfully flip to 1s.

2. Read the BAR: Next, the software reads the value of the BAR back.

  • By looking at the lowest bits, software determines if the request is for Memory or IO, and what specific type of memory it is.
  • By looking for the least-significant writable bit, software determines the size of the block being requested. For example, if bit 12 is the first bit that successfully accepted a 1, the software knows the device is requesting a block size of 2^12 bytes, which equals 4 KB of address space.

3. Write the Base Address: Now that the software knows the exact size and type of the requested address space, it finds an available block in the system’s memory or IO map. Finally, it writes the starting address (the Base Address) of that newly allocated block directly into the upper writable bits of the BAR.

Once this base address is written and the matching decode enable bit in the Command register is set (Memory Space Enable for memory BARs, IO Space Enable for IO BARs), the setup is complete. The device will now claim any transactions that fall within its newly programmed address range.
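The three steps can be tried out against a toy model of a single 32-bit BAR. Everything here is a sketch under stated assumptions: `SimulatedBar`, `make_allocator`, and `probe_and_assign` are hypothetical names, the BAR is simulated in memory rather than accessed through real configuration cycles, and 64-bit BARs are not handled.

```python
class SimulatedBar:
    """Toy model of one 32-bit BAR; not a real hardware interface."""

    def __init__(self, size, type_bits=0x0):
        # size must be a power of two; every bit below it is hard-wired.
        self.writable_mask = ~(size - 1) & 0xFFFFFFFF
        self.type_bits = type_bits
        self.value = type_bits

    def write(self, data):
        # Only the upper, writable bits accept the write.
        self.value = (data & self.writable_mask) | self.type_bits

    def read(self):
        return self.value


def make_allocator(start):
    """Hypothetical bump allocator that hands out size-aligned blocks."""
    next_free = start

    def allocate(size):
        nonlocal next_free
        base = (next_free + size - 1) & ~(size - 1)   # align up to size
        next_free = base + size
        return base

    return allocate


def probe_and_assign(bar, allocate):
    """Run the three-step sequence; return (base, size) or None if unused."""
    bar.write(0xFFFFFFFF)                  # step 1: write all 1s
    readback = bar.read()                  # step 2: read the BAR back

    is_io = bool(readback & 0x1)
    attr_mask = 0x3 if is_io else 0xF      # type bits to ignore when sizing
    addr_bits = readback & ~attr_mask & 0xFFFFFFFF
    if addr_bits == 0:
        return None                        # no writable bits: BAR is unused

    size = addr_bits & -addr_bits          # lowest writable bit = block size
    base = allocate(size)
    bar.write(base)                        # step 3: write the base address
    return base, size
```

Probing a simulated 4 KB memory BAR with an allocator that starts at 0xE0000000 returns a base of 0xE0000000 and a size of 0x1000. Real enumeration code typically also keeps decoding disabled in the Command register while the BAR temporarily holds all 1s, which this sketch leaves out.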

Handling Unused BARs and 64-Bit Requests

Most devices do not actually need all six available BARs. If a BAR is not needed, the designer simply hard-codes the entire 32-bit register to all 0s. When software writes all 1s to it and reads it back, it will see all 0s, understand the BAR is unused, and simply move on to evaluate the next one.

Because unused BARs can appear anywhere, system software must evaluate all BARs in order. A designer is not forced to use BAR0 first; they could use BAR4 for their request and leave the others hard-coded to zero. Furthermore, if a device requests a 64-bit memory address space (allowing the software to map the device above the 4GB boundary), the device must use two consecutive BARs tied together to form a single 64-bit register. When the software reads the first BAR and sees a 64-bit request, it knows to treat the very next BAR as the upper 32 bits of that same address request: it writes all 1s to both halves when sizing the block, writes the upper half of the base address into the second BAR, and then skips that BAR when moving on to the next request.
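A walk over all six BARs, including unused entries and 64-bit pairs, might look like the sketch below. It assumes the software has already written all 1s to every BAR and collected the readback values in a list; the function name `scan_bars` and the labels `"io"`, `"mem32"`, and `"mem64"` are invented for the example.

```python
def scan_bars(readbacks):
    """readbacks: values read from BAR0..BAR5 after writing all 1s to each.

    Returns a list of (bar_index, kind, size_in_bytes) for each request.
    """
    found = []
    i = 0
    while i < len(readbacks):
        low = readbacks[i]

        if low == 0:
            i += 1                          # hard-wired zero: BAR is unused
            continue

        if low & 0x1:                       # IO space request
            mask = low & 0xFFFFFFFC
            found.append((i, "io", mask & -mask))
            i += 1
            continue

        is_64bit = ((low >> 1) & 0x3) == 0b10
        mask = low & 0xFFFFFFF0             # drop the memory type bits
        if is_64bit:
            if i + 1 >= len(readbacks):
                raise ValueError("64-bit BAR has no upper half")
            mask |= readbacks[i + 1] << 32  # next BAR is the upper 32 bits

        # The lowest writable bit of the combined mask gives the block size.
        found.append((i, "mem64" if is_64bit else "mem32", mask & -mask))
        i += 2 if is_64bit else 1           # skip the consumed upper half

    return found
```

With readbacks of `[0, 0xFFFFC00C, 0xFFFFFFFF, 0, 0x0000FFE1, 0xFFFF0000]`, the walk reports a 16 KB 64-bit request at BAR1 (consuming BAR2), a 32-byte IO request at BAR4, and a 64 KB 32-bit request at BAR5, while BAR0 and BAR3 are skipped as unused.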
