Chapter 5.4 – Byte Enable Mechanism in PCI Express

(Partial Writes, Rules, and Practical Examples)


1. Introduction

PCI Express Memory, I/O, and Configuration Requests carry Byte Enable (BE) fields in their headers to specify which bytes within a doubleword (DW) are valid. Completions have no BE fields; they describe the returned data with the Byte Count and Lower Address fields instead.

This feature provides fine-grained control over data transfers, allowing partial memory updates without the need for separate byte-level operations.

It’s especially vital in:

  • DMA engines (unaligned transfers)
  • Descriptor updates
  • Configuration space accesses
  • Packet-based protocols with variable-length payloads

2. What Are Byte Enables?

A Byte Enable is a 4-bit field used to indicate which bytes of a 32-bit (4-byte) doubleword are valid during a transfer.

Each bit corresponds to one byte:

Bit 3 → Byte 3 (Most Significant)

Bit 2 → Byte 2

Bit 1 → Byte 1

Bit 0 → Byte 0 (Least Significant)

For example:

  • 1111b → all 4 bytes valid (full-word write)
  • 0001b → only least significant byte valid
  • 1100b → only upper two bytes valid
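The bit-to-byte mapping above can be modeled in a few lines of Python. This is only an illustrative sketch; the helper name `enabled_bytes` is made up for this example and is not part of any PCIe library.

```python
def enabled_bytes(be):
    """Return the byte lanes (0 = least significant) selected by a 4-bit BE value."""
    if not 0 <= be <= 0xF:
        raise ValueError("a byte enable must fit in 4 bits")
    # Bit N of the field enables byte N of the doubleword.
    return [lane for lane in range(4) if be & (1 << lane)]


print(enabled_bytes(0b1111))  # [0, 1, 2, 3]
print(enabled_bytes(0b0001))  # [0]
print(enabled_bytes(0b1100))  # [2, 3]
```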

3. Where Byte Enables Are Used

Two separate Byte Enable fields appear in the header of every Memory, I/O, and Configuration Request, for reads as well as writes:

| Field | Meaning |
|---|---|
| First BE (4 bits) | Indicates which bytes in the first DW of the payload are valid. |
| Last BE (4 bits) | Indicates which bytes in the last DW of the payload are valid. |

This allows PCIe to perform unaligned or variable-length writes efficiently.

💡 If a packet transfers only one doubleword (DW), only First BE is used, and Last BE must be set to 0000b.


4. Why Byte Enables Are Needed

Let’s recall that PCIe transfers occur in doubleword (DW = 4 bytes) granularity.
But in reality, not all software writes are DW-aligned — consider:

  • Writing to a status byte inside a 32-bit register.
  • Updating a 16-bit control field inside a structure.
  • Writing 6 bytes of data starting from an odd address.

Without Byte Enables, each of those would require complex read-modify-write cycles.
BE fields allow hardware to directly signal which bytes in the first and last words are meaningful.


5. Byte Enable Rules

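The main rules governing BE field usage, as defined in the PCI Express Base Specification, are:

| Rule # | Description |
|---|---|
| 1 | For requests longer than 1 DW, neither First BE nor Last BE may be 0000b. A 1-DW request may set First BE to 0000b, which makes it a zero-length read or write. |
| 2 | If the request length is 1 DW, Last BE must be 0000b. |
| 3 | For multi-DW payloads, intermediate DWs (between first and last) are always treated as fully valid (1111b). |
| 4 | First BE and Last BE together define the valid byte range of the entire payload. The enabled bytes must be contiguous with the data between them; non-contiguous patterns are allowed only for 1-DW requests and for QW-aligned 2-DW Memory Requests. |
| 5 | Byte enables apply to Memory, I/O, and Configuration Requests, both reads and writes. Completions carry no BE fields. |
| 6 | In a read request (always non-posted), the byte enables mark which bytes of the first and last DW are being requested; the Length field still gives the DW count. |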
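A rule checker makes these constraints easier to reason about. The sketch below is a simplified Python model, not a spec-complete validator. It assumes the stricter requirements of the PCI Express Base Specification: Last BE is 0000b for a 1-DW request, both fields are nonzero for longer requests, and enabled bytes are contiguous except for 1-DW and QW-aligned 2-DW requests. The function name and the `qw_aligned` parameter are hypothetical.

```python
# Patterns whose enabled bytes touch the neighbouring DW (assumed contiguity rule).
CONTIGUOUS_FIRST = {0b1111, 0b1110, 0b1100, 0b1000}  # enabled bytes end at lane 3
CONTIGUOUS_LAST = {0b1111, 0b0111, 0b0011, 0b0001}   # enabled bytes start at lane 0


def check_byte_enables(length_dw, first_be, last_be, qw_aligned=False):
    """Return a list of rule violations; an empty list means no problem was found."""
    if length_dw < 1:
        raise ValueError("length must be at least 1 DW")
    problems = []
    if length_dw == 1:
        # Single DW: First BE may be anything (even 0000b), Last BE must be clear.
        if last_be != 0:
            problems.append("Last BE must be 0000b for a 1-DW request")
        return problems
    if first_be == 0:
        problems.append("First BE must not be 0000b when Length > 1 DW")
    if last_be == 0:
        problems.append("Last BE must not be 0000b when Length > 1 DW")
    # Only a QW-aligned 2-DW request may use non-contiguous patterns.
    if not (length_dw == 2 and qw_aligned):
        if first_be and first_be not in CONTIGUOUS_FIRST:
            problems.append("First BE is not contiguous with the following DW")
        if last_be and last_be not in CONTIGUOUS_LAST:
            problems.append("Last BE is not contiguous with the preceding DW")
    return problems


print(check_byte_enables(2, 0b1100, 0b1111))  # []
print(check_byte_enables(3, 0b0111, 0b1100))  # two contiguity violations
```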

6. Address Alignment and BE Usage

PCIe memory request headers carry only address bits [63:2] (or [31:2] with 32-bit addressing), so the header address is always doubleword-aligned: bits [1:0] of the byte address are never transmitted.

Therefore, the combination of First BE, Last BE, and Length fields indicates the true byte-level range.

Example:

Address = 0x0000_1000

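First BE = 1110b

Last BE  = 0011b

Length   = 3 DW (12 bytes)

This means:

  • First DW → bytes 1 to 3 valid
  • Middle DW → all 4 bytes valid
  • Last DW → bytes 0 and 1 valid
    → Total bytes transferred = 3 + 4 + 2 = 9 bytes (not the full 12), covering byte addresses 0x1001 through 0x1009.

Note that for a request of 3 DW or more, the enabled bytes must adjoin the middle DWs. A pattern such as First BE = 0111b, which leaves a gap at byte 3, is not legal here.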
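The byte range implied by a header can be recovered programmatically. The following Python sketch assumes contiguous byte enables (for example First BE = 1110b and Last BE = 0011b on a 3-DW request); the name `byte_span` is hypothetical, and non-contiguous patterns are simply reported from the lowest to the highest enabled byte.

```python
def byte_span(dw_address, length_dw, first_be, last_be):
    """Return (start_byte_address, byte_count) described by the header fields."""
    if first_be == 0 or (length_dw > 1 and last_be == 0):
        raise ValueError("this sketch does not handle zero byte-enable fields")
    # Lowest enabled lane of the first DW gives the starting offset.
    lead_gap = (first_be & -first_be).bit_length() - 1
    # Highest enabled lane of the final DW gives the ending offset.
    end_be = first_be if length_dw == 1 else last_be
    top_lane = end_be.bit_length() - 1
    start = dw_address + lead_gap
    end = dw_address + 4 * (length_dw - 1) + top_lane
    return start, end - start + 1


start, count = byte_span(0x1000, 3, 0b1110, 0b0011)
print(hex(start), count)  # 0x1001 9
```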

7. Real Example – Unaligned 6-Byte Write

Suppose software writes 6 bytes starting at address 0x0000_0002.

Steps:

  1. The address is aligned down to 0x0000_0000 (DW-aligned).
  2. The Transaction Layer sets:
    • Length = 2 DW
    • First BE = 1100b (bytes 2–3 valid in first DW)
    • Last BE = 1111b (bytes 0–3 valid in second DW, because the write covers addresses 0x2 through 0x7)
  3. Payload carries 8 bytes of data, but only 6 are valid.
  4. The target uses BE to update only those bytes.

🔍 Result: 6-byte write across two 32-bit words without misalignment errors.
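The steps above can be sketched as a small Python helper that derives the header fields from a byte address and a byte count. It is a simplified model under stated assumptions: the transfer is contiguous, and 4 KB boundary and Max Payload Size limits are ignored. The function name `make_byte_enables` is made up for this illustration.

```python
def make_byte_enables(byte_address, byte_count):
    """Return (dw_address, length_dw, first_be, last_be) for a contiguous transfer."""
    if byte_count < 1:
        raise ValueError("byte_count must be at least 1")
    end_address = byte_address + byte_count - 1
    start_lane = byte_address & 0x3   # offset of the first byte within its DW
    end_lane = end_address & 0x3      # offset of the last byte within its DW
    dw_address = byte_address & ~0x3  # align down to a DW boundary
    length_dw = (end_address >> 2) - (byte_address >> 2) + 1
    first_be = (0xF << start_lane) & 0xF  # enable start_lane and every lane above it
    last_be = 0xF >> (3 - end_lane)       # enable lane 0 up to end_lane
    if length_dw == 1:
        # Single DW: both limits fall in the same DW, and Last BE stays 0000b.
        first_be &= last_be
        last_be = 0
    return dw_address, length_dw, first_be, last_be


addr, length, first_be, last_be = make_byte_enables(0x0000_0002, 6)
print(hex(addr), length, format(first_be, "04b"), format(last_be, "04b"))
# 0x0 2 1100 1111
```

A 6-byte transfer starting at offset 2 fills bytes 2 and 3 of the first DW and all four bytes of the second, which is why the second field comes out as 1111b.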


8. Special Case – Completion with Data (CplD)

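Completion with Data (CplD) packets do not carry First BE or Last BE fields. The byte enables travel in the read request, and the completer describes the returned data with two other header fields:

  • Byte Count gives the number of bytes still to be returned for the request, including those in this completion.
  • Lower Address gives the low 7 bits of the byte address of the first enabled byte returned in this completion.

The payload itself is still DW-granular, so the requester combines these fields with the byte enables of its original request to extract the meaningful bytes. This is essential for reads that don’t start and end on DW boundaries.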


9. Byte Enables in Hardware Implementation

From a design point of view:

  • Tx Path (TLP Generation):
    Byte enables are computed from address offset and transfer length.
    Simple combinational logic or look-up tables are used.

  • Rx Path (TLP Decode):
    Byte enables control write strobes at the memory interface.
    Example:

    assign write_strobe = first_be & dw_mask;

  • Verification (UVM):
    Testbenches randomize address and length to verify BE correctness.
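As a software reference model for the Rx path, the following Python sketch applies a write payload to a byte-addressed memory, using the byte enables as per-lane write strobes. It is a behavioral illustration only, not RTL, and the name `apply_write` is hypothetical. It assumes the payload byte at the lowest offset maps to the lowest address.

```python
def apply_write(memory, dw_address, payload, first_be, last_be):
    """Merge a DW-granular payload into memory, writing only the enabled bytes."""
    if len(payload) % 4 != 0 or not payload:
        raise ValueError("payload must be a whole number of doublewords")
    length_dw = len(payload) // 4
    for dw in range(length_dw):
        if dw == 0:
            strobe = first_be          # first DW uses First BE
        elif dw == length_dw - 1:
            strobe = last_be           # last DW uses Last BE
        else:
            strobe = 0xF               # middle DWs are always fully enabled
        for lane in range(4):
            if strobe & (1 << lane):
                offset = 4 * dw + lane
                memory[dw_address + offset] = payload[offset]


mem = bytearray(b"\xAA" * 8)
apply_write(mem, 0, bytes(range(0x10, 0x18)), first_be=0b1100, last_be=0b1111)
print(mem.hex())  # aaaa121314151617
```

A testbench scoreboard could compare a model like this against the memory contents produced by the design for randomized addresses and lengths.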

10. Debugging Tip

If a memory write isn’t updating expected bytes:

  • Check First/Last BE values in the analyzer trace.
  • Mismatch between address offset and BE bits often indicates incorrect DW alignment or TLP formation.

11. Points to Remember

  • Byte Enables provide byte-level precision within doubleword-aligned PCIe packets.
  • They ensure unaligned or partial writes don’t require software-side read-modify-write cycles.
  • Both First BE and Last BE fields work together to describe the full data span.
  • Proper BE logic is crucial for DMA engines, bridges, and configuration space updates.
