VERILOG DESIGNS · MODULES 36 & 37

Verilog Designs — Synchronous & Asynchronous FIFO — VLSI Trainers

Synchronous & Asynchronous FIFO

Complete FIFO implementations — synchronous FIFO with counter-based flags, synchronous FIFO with pointer comparison, asynchronous FIFO with Gray-coded pointers and 2-FF synchronisers, and a parameterised async FIFO — with full/empty flag derivation, waveforms, and exhaustive self-checking testbenches.

🆕 Introduction & Theory

A FIFO (First-In First-Out) buffer is a storage element where data is read out in the same order it was written in. It decouples a producer and a consumer that may operate at different rates, bursts, or — in the asynchronous case — entirely different clock frequencies. The two key status flags are full (no more room to write) and empty (nothing available to read).

Synchronous FIFO
Write and read operate on the same clock. Full and empty flags are generated by comparing write and read pointers directly. Simple, low-latency.
Asynchronous FIFO
Write and read use independent clocks. Pointers are Gray-coded before crossing clock domains, so only one bit changes per step, and pass through 2-FF synchronisers to mitigate metastability.
Full Flag
Asserts when the FIFO holds DEPTH words. The write domain must see this before writing to prevent overflow. One extra pointer bit enables full detection.
Empty Flag
Asserts when the FIFO holds 0 words. The read domain must check this before reading to prevent underflow / reading stale data.

FIFO Pointer Visualisation (DEPTH=8)

WP = write pointer, RP = read pointer. Count = WP − RP.

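Three states are shown for slots 0 to 7:

  • EMPTY: RP and WP point at the same slot (WP = RP).
  • Partly filled: RP at slot 0, WP at slot 3, so 3 words are stored and 5 slots are empty.
  • FULL: every slot holds data (WP = RP + DEPTH).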

Synchronous FIFO Theory

A synchronous FIFO uses a circular buffer in RAM. Two pointers, the write pointer (WP) and the read pointer (RP), advance through the buffer on write and read operations. In a naive design WP == RP holds for both full and empty, so a common technique adds one extra MSB to each pointer. With N = log2(DEPTH) address bits and (N+1)-bit pointers, the FIFO is:

  • Empty when WP == RP (all N+1 bits equal)
  • Full when WP[N] != RP[N] and WP[N-1:0] == RP[N-1:0] (the write pointer is exactly DEPTH ahead)
Fig 1 — Synchronous FIFO block diagram with circular buffer and pointer logic
[Block diagram: a DEPTH x WIDTH RAM with wr_data in and rd_data out; the write pointer drives wr_addr, the read pointer drives rd_addr, and a WP-vs-RP comparator produces full and empty.]

Sync FIFO 1 — Counter-Based Full/Empty

Uses an explicit count register that increments on write and decrements on read. Full and empty flags are directly derived from the count value. The count approach is intuitive and also provides a fill_level output for monitoring.

1
sync_fifo_count
DEPTH=8, WIDTH=8 · Counter-based full/empty · fill_level output · Concurrent write+read
Counter FIFO
// ============================================================
// Module   : sync_fifo_count
// Type     : Synchronous FIFO — counter-based flags
// DEPTH    : 8 entries (parameterised; must be a power of two
//            so that the pointers wrap naturally)
// WIDTH    : 8 bits per entry (parameterised)
// full     : count == DEPTH
// empty    : count == 0
// fill_level: current occupancy (0..DEPTH)
// Simultaneous write+read when neither full/empty: allowed
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module sync_fifo_count #(
  parameter DEPTH = 8,
  parameter WIDTH = 8
) (
  input              clk, rst_n,
  input              wr_en,   // write enable
  input  [WIDTH-1:0] wr_data,
  input              rd_en,   // read enable
  output reg [WIDTH-1:0] rd_data,
  output             full,
  output             empty,
  output [$clog2(DEPTH+1)-1:0] fill_level
);
  localparam ADDR_W = $clog2(DEPTH);

  reg [WIDTH-1:0] mem [0:DEPTH-1];
  reg [ADDR_W-1:0] wr_ptr, rd_ptr;
  reg [$clog2(DEPTH+1)-1:0] count;

  wire do_write = wr_en && !full;
  wire do_read  = rd_en && !empty;

  always @(posedge clk) begin
    if (!rst_n) begin
      wr_ptr <= 0; rd_ptr <= 0; count <= 0;
    end else begin
      if (do_write) begin
        mem[wr_ptr] <= wr_data;
        wr_ptr      <= wr_ptr + 1;
      end
      if (do_read) begin
        rd_data <= mem[rd_ptr];
        rd_ptr  <= rd_ptr + 1;
      end
      // Counter: simultaneous write+read keeps count unchanged
      case ({do_write, do_read})
        2'b10: count <= count + 1;
        2'b01: count <= count - 1;
        default: ; // 00 or 11: no change
      endcase
    end
  end

  assign full       = (count == DEPTH);
  assign empty      = (count == 0);
  assign fill_level = count;

endmodule
`default_nettype wire
Simultaneous read+write: When both wr_en and rd_en are asserted in the same cycle and the FIFO is neither full nor empty, the write and read happen simultaneously. The count remains unchanged. This enables maximum throughput — the FIFO can sustain one write and one read every clock cycle indefinitely without draining or filling.

🔵 Sync FIFO 2 — Pointer-Comparison (Extra MSB)

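Eliminates the counter entirely by using N+1 bit pointers. The extra MSB toggles each time a pointer wraps, so it records whether the two pointers are on the same pass through the memory. Full and empty detection is a direct comparison of the extended pointers, which removes the count register and its up/down logic at the cost of one extra bit per pointer.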

2
sync_fifo_ptr
DEPTH=8, WIDTH=8 · Extra-MSB pointer comparison · No count register · Minimal logic
Pointer FIFO
// ============================================================
// Module   : sync_fifo_ptr
// Method   : (ADDR_W+1)-bit pointers; extra MSB detects wrap
// empty    : wr_ptr == rd_ptr  (all bits match)
// full     : wr_ptr[ADDR_W]   != rd_ptr[ADDR_W]  AND
//            wr_ptr[ADDR_W-1:0] == rd_ptr[ADDR_W-1:0]
//            (write wrapped, read has not — exactly DEPTH apart)
// No counter register needed: removes the (log2(DEPTH)+1)-bit
// count at the cost of one extra bit per pointer
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module sync_fifo_ptr #(
  parameter DEPTH = 8,
  parameter WIDTH = 8
) (
  input              clk, rst_n,
  input              wr_en,
  input  [WIDTH-1:0] wr_data,
  input              rd_en,
  output reg [WIDTH-1:0] rd_data,
  output             full,
  output             empty
);
  localparam ADDR_W = $clog2(DEPTH);

  reg [WIDTH-1:0] mem [0:DEPTH-1];
  reg [ADDR_W:0]  wr_ptr, rd_ptr;  // N+1 bits each

  wire do_write = wr_en && !full;
  wire do_read  = rd_en && !empty;

  always @(posedge clk) begin
    if (!rst_n) begin
      wr_ptr <= 0; rd_ptr <= 0;
    end else begin
      if (do_write) begin
        mem[wr_ptr[ADDR_W-1:0]] <= wr_data;
        wr_ptr <= wr_ptr + 1;
      end
      if (do_read) begin
        rd_data <= mem[rd_ptr[ADDR_W-1:0]];
        rd_ptr  <= rd_ptr + 1;
      end
    end
  end

  // Full: MSBs differ, lower bits equal (exactly DEPTH apart)
  assign full  = (  wr_ptr[ADDR_W]    != rd_ptr[ADDR_W]   ) &&
                  (wr_ptr[ADDR_W-1:0] == rd_ptr[ADDR_W-1:0]);
  // Empty: all bits equal
  assign empty = (wr_ptr == rd_ptr);

endmodule
`default_nettype wire
Fig 2 — Extra-MSB pointer full/empty detection (DEPTH=8, ADDR_W=3)
Pointer width = ADDR_W + 1 = 4 bits.  Address = ptr[2:0], wrap = ptr[3].

State           wr_ptr   rd_ptr   full  empty
Reset           0000     0000     0     1     <- equal -> EMPTY
After 3 writes  0011     0000     0     0
After 8 writes  1000     0000     1     0     <- MSB differ, addr equal -> FULL
After 4 reads   1000     0100     0     0
After 8 reads   1000     1000     0     1     <- equal again -> EMPTY (wraps!)

Key insight: pointers use N+1 bits but memory uses only N bits.
The address into memory is ptr[ADDR_W-1:0] (lower N bits).
The MSB (ptr[ADDR_W]) is the "generation bit" — toggles each wrap.

Asynchronous FIFO Theory

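When the write and read clocks are independent, direct pointer comparison is unsafe. A binary pointer can have several bits changing on the same increment, and a synchroniser in the receiving domain may capture some bits before the change and others after it, producing a pointer value that never existed. The solution is to convert pointers to Gray code before crossing the clock domain boundary, so that only one bit changes per count step; the 2-FF synchroniser then deals with metastability on that single changing bit.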

Fig 3 — Async FIFO architecture: Gray pointers cross domains via 2-FF synchronisers
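[Block diagram: in the wr_clk domain, wr_ptr (binary) feeds bin2gray and the full generator; in the rd_clk domain, rd_ptr (binary) feeds bin2gray and the empty generator. Each Gray pointer crosses to the other domain through a 2-FF synchroniser, and both domains share the DEPTH x WIDTH RAM.]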
Why Gray code and not binary? A binary count from 3 (011) to 4 (100) changes all three bits at once. If a 2-FF synchroniser samples this mid-transition, each bit may independently resolve to its old or new value, so any of the eight 3-bit values can be captured, six of which are neither 3 nor 4. The same step in Gray code, from 3 (010) to 4 (110), changes only bit 2. The synchroniser can then only capture the old value or the new value, so the receiving domain sees a pointer that is at most one step stale, which is safe for FIFO flag generation.
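The single-bit-change property can be checked in simulation. The sketch below is illustrative only: the module name gray_step_demo and the 4-bit width are assumptions chosen to match a DEPTH=8 pointer, and nothing in it is part of the FIFO designs themselves.

```verilog
`timescale 1ns/1ps
// Simulation-only sketch (hypothetical module name): checks that
// consecutive Gray codes differ in exactly one bit, including the
// wrap from 15 back to 0.
module gray_step_demo;
  localparam W = 4;

  // Binary-to-Gray, the same expression the FIFO pointers use
  function [W-1:0] bin2gray;
    input [W-1:0] b;
    begin
      bin2gray = b ^ (b >> 1);
    end
  endfunction

  // Number of set bits in a W-bit vector
  function integer popcount;
    input [W-1:0] v;
    integer k;
    begin
      popcount = 0;
      for (k = 0; k < W; k = k + 1)
        popcount = popcount + v[k];
    end
  endfunction

  integer n, errors;
  reg [W-1:0] b, cur, nxt;

  initial begin
    errors = 0;
    for (n = 0; n < (1 << W); n = n + 1) begin
      b   = n;                     // truncated to W bits
      cur = bin2gray(b);
      nxt = bin2gray(b + 1'b1);    // W-bit add, wraps at 2**W
      if (popcount(cur ^ nxt) != 1) errors = errors + 1;
      $display("bin %2d  gray %b -> %b", n, cur, nxt);
    end
    if (errors == 0) $display("OK: every step changes exactly one bit");
    else             $display("ERROR: %0d steps changed more than one bit", errors);
    $finish;
  end
endmodule
```

The wrap case matters as much as the ordinary steps: Gray 15 is 1000 and Gray 0 is 0000, so a pointer rolling over still toggles a single bit.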

🟠 Asynchronous FIFO — Gray-Coded Pointers

3
async_fifo
DEPTH=8, WIDTH=8 · Gray pointers · 2-FF sync · Independent clocks · Full/empty flags
Async FIFO
// ============================================================
// Module   : async_fifo
// Write    : wr_clk domain  — wr_en, wr_data, full
// Read     : rd_clk domain  — rd_en, rd_data, empty
// Pointers : (ADDR_W+1)-bit binary internally
//            Gray-coded before crossing domain boundary
// Sync     : 2-FF synchroniser for each Gray pointer
// Full     : in wr_clk domain (conservative: asserts at once,
//            releases late by the synchroniser latency)
// Empty    : in rd_clk domain (conservative: asserts at once,
//            releases late by the synchroniser latency)
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module async_fifo #(
  parameter DEPTH = 8,
  parameter WIDTH = 8
) (
  // Write port
  input              wr_clk, wr_rst_n,
  input              wr_en,
  input  [WIDTH-1:0] wr_data,
  output             full,
  // Read port
  input              rd_clk, rd_rst_n,
  input              rd_en,
  output reg [WIDTH-1:0] rd_data,
  output             empty
);
  localparam ADDR_W = $clog2(DEPTH);

  // Shared memory (dual-port: write on wr_clk, read on rd_clk)
  reg [WIDTH-1:0] mem [0:DEPTH-1];

  // ── Write domain ──────────────────────────────────────────
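  reg  [ADDR_W:0] wr_ptr_bin;                        // binary write ptr
  reg  [ADDR_W:0] wr_ptr_gray;                       // registered Gray write ptr
  wire [ADDR_W:0] wr_ptr_bin_nxt = wr_ptr_bin + 1'b1;

  // The Gray pointer is a flop output, not XOR logic, so the
  // rd_clk synchroniser never samples combinational glitches.
  always @(posedge wr_clk or negedge wr_rst_n) begin
    if (!wr_rst_n) begin
      wr_ptr_bin  <= 0;
      wr_ptr_gray <= 0;
    end else if (wr_en && !full) begin
      mem[wr_ptr_bin[ADDR_W-1:0]] <= wr_data;
      wr_ptr_bin  <= wr_ptr_bin_nxt;
      wr_ptr_gray <= wr_ptr_bin_nxt ^ (wr_ptr_bin_nxt >> 1);
    end
  end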

  // 2-FF sync: wr_ptr_gray into rd_clk domain
  reg [ADDR_W:0] wr_gray_s1, wr_gray_s2;
  always @(posedge rd_clk or negedge rd_rst_n) begin
    if (!rd_rst_n) {wr_gray_s2,wr_gray_s1} <= 0;
    else           {wr_gray_s2,wr_gray_s1} <= {wr_gray_s1,wr_ptr_gray};
  end

  // ── Read domain ───────────────────────────────────────────
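  reg  [ADDR_W:0] rd_ptr_bin;                        // binary read ptr
  reg  [ADDR_W:0] rd_ptr_gray;                       // registered Gray read ptr
  wire [ADDR_W:0] rd_ptr_bin_nxt = rd_ptr_bin + 1'b1;

  // Registered for the same reason as wr_ptr_gray: only flop
  // outputs may feed the wr_clk synchroniser.
  always @(posedge rd_clk or negedge rd_rst_n) begin
    if (!rd_rst_n) begin
      rd_ptr_bin  <= 0;
      rd_ptr_gray <= 0;
    end else if (rd_en && !empty) begin
      rd_data     <= mem[rd_ptr_bin[ADDR_W-1:0]];
      rd_ptr_bin  <= rd_ptr_bin_nxt;
      rd_ptr_gray <= rd_ptr_bin_nxt ^ (rd_ptr_bin_nxt >> 1);
    end
  end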

  // 2-FF sync: rd_ptr_gray into wr_clk domain
  reg [ADDR_W:0] rd_gray_s1, rd_gray_s2;
  always @(posedge wr_clk or negedge wr_rst_n) begin
    if (!wr_rst_n) {rd_gray_s2,rd_gray_s1} <= 0;
    else           {rd_gray_s2,rd_gray_s1} <= {rd_gray_s1,rd_ptr_gray};
  end

  // ── Full: in wr_clk domain, compare Gray pointers ─────────
  // Full when top 2 bits differ AND remaining bits equal
  // (requires a power-of-two DEPTH of at least 4)
  assign full  = (  wr_ptr_gray[ADDR_W]     != rd_gray_s2[ADDR_W]    ) &&
                  (wr_ptr_gray[ADDR_W-1]   != rd_gray_s2[ADDR_W-1]  ) &&
                  (wr_ptr_gray[ADDR_W-2:0]  == rd_gray_s2[ADDR_W-2:0]);

  // ── Empty: in rd_clk domain, Gray pointers equal ──────────
  assign empty = (rd_ptr_gray == wr_gray_s2);

endmodule
`default_nettype wire
Full detection in Gray code: When the binary write pointer has wrapped once more than the read pointer, the Gray-coded write pointer will differ from the Gray-coded read pointer in exactly the two MSBs (MSB and second-MSB are inverted) while all other bits are equal. This is the unique Gray-code property that enables full detection without converting back to binary. The full flag is generated in the write domain using the synchronised Gray read pointer for safe comparison.
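The equivalence between the Gray-code full test and the binary one can be confirmed exhaustively for a small pointer width. The following simulation-only sketch assumes ADDR_W = 3 (DEPTH = 8, as in async_fifo); the module name gray_full_check is hypothetical.

```verilog
`timescale 1ns/1ps
// Simulation-only sketch (hypothetical module name): compares the
// binary full condition with the Gray-code full condition for every
// pair of 4-bit pointer values.
module gray_full_check;
  localparam ADDR_W = 3;              // DEPTH = 8
  localparam PTR_W  = ADDR_W + 1;

  reg  [PTR_W-1:0] wr_bin, rd_bin;
  wire [PTR_W-1:0] wr_gray = wr_bin ^ (wr_bin >> 1);
  wire [PTR_W-1:0] rd_gray = rd_bin ^ (rd_bin >> 1);

  // Binary rule: MSBs differ, address bits equal
  wire full_bin  = (wr_bin[ADDR_W]     != rd_bin[ADDR_W]) &&
                   (wr_bin[ADDR_W-1:0] == rd_bin[ADDR_W-1:0]);

  // Gray rule: top two bits differ, remaining bits equal
  wire full_gray = (wr_gray[ADDR_W]     != rd_gray[ADDR_W])   &&
                   (wr_gray[ADDR_W-1]   != rd_gray[ADDR_W-1]) &&
                   (wr_gray[ADDR_W-2:0] == rd_gray[ADDR_W-2:0]);

  integer w, r, mismatches;

  initial begin
    mismatches = 0;
    for (w = 0; w < (1 << PTR_W); w = w + 1)
      for (r = 0; r < (1 << PTR_W); r = r + 1) begin
        wr_bin = w;
        rd_bin = r;
        #1;                           // let the wires settle
        if (full_bin !== full_gray) mismatches = mismatches + 1;
      end
    if (mismatches == 0) $display("OK: Gray and binary full tests agree on all 256 pairs");
    else                 $display("ERROR: %0d mismatching pairs", mismatches);
    $finish;
  end
endmodule
```

The two tests agree because adding DEPTH to a binary pointer flips only its MSB, and converting that flip to Gray code inverts exactly the top two Gray bits.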

🔧 Asynchronous FIFO — Parameterised with almost-full/almost-empty

4
async_fifo_param
Any DEPTH & WIDTH · Almost-full/almost-empty · Configurable thresholds · Prog. flags
Parameterised
// ============================================================
// Module   : async_fifo_param
// Adds     : almost_full  — asserts when fill >= AF_THRESH
//            almost_empty — asserts when fill <= AE_THRESH
// Use case : Flow control — assert almost_full early to give
//            upstream time to stop sending before hard full.
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module async_fifo_param #(
  parameter DEPTH     = 16,
  parameter WIDTH     = 8,
  parameter AF_THRESH = DEPTH - 2,  // almost full threshold
  parameter AE_THRESH = 2            // almost empty threshold
) (
  input              wr_clk, wr_rst_n,
  input              wr_en,
  input  [WIDTH-1:0] wr_data,
  output             full, almost_full,
  input              rd_clk, rd_rst_n,
  input              rd_en,
  output reg [WIDTH-1:0] rd_data,
  output             empty, almost_empty
);
  localparam ADDR_W = $clog2(DEPTH);

  reg [WIDTH-1:0] mem [0:DEPTH-1];
  reg [ADDR_W:0] wr_ptr_bin, rd_ptr_bin;

  // Gray pointers are registered so that only flop outputs
  // (never glitch-prone XOR logic) cross clock domains
  reg  [ADDR_W:0] wr_gray, rd_gray;
  wire [ADDR_W:0] wr_bin_nxt = wr_ptr_bin + 1'b1;
  wire [ADDR_W:0] rd_bin_nxt = rd_ptr_bin + 1'b1;

  // 2-FF synchronisers
  reg [ADDR_W:0] wr_g_s1,wr_g_s2, rd_g_s1,rd_g_s2;
  always @(posedge rd_clk or negedge rd_rst_n)
    if(!rd_rst_n) {wr_g_s2,wr_g_s1}<=0;
    else          {wr_g_s2,wr_g_s1}<={wr_g_s1,wr_gray};
  always @(posedge wr_clk or negedge wr_rst_n)
    if(!wr_rst_n) {rd_g_s2,rd_g_s1}<=0;
    else          {rd_g_s2,rd_g_s1}<={rd_g_s1,rd_gray};

  // Write pointer update (binary and Gray advance together)
  always @(posedge wr_clk or negedge wr_rst_n) begin
    if(!wr_rst_n) begin
      wr_ptr_bin <= 0;
      wr_gray    <= 0;
    end else if(wr_en && !full) begin
      mem[wr_ptr_bin[ADDR_W-1:0]] <= wr_data;
      wr_ptr_bin <= wr_bin_nxt;
      wr_gray    <= wr_bin_nxt ^ (wr_bin_nxt >> 1);
    end
  end

  // Read pointer update (binary and Gray advance together)
  always @(posedge rd_clk or negedge rd_rst_n) begin
    if(!rd_rst_n) begin
      rd_ptr_bin <= 0;
      rd_gray    <= 0;
    end else if(rd_en && !empty) begin
      rd_data    <= mem[rd_ptr_bin[ADDR_W-1:0]];
      rd_ptr_bin <= rd_bin_nxt;
      rd_gray    <= rd_bin_nxt ^ (rd_bin_nxt >> 1);
    end
  end

  // Full/Empty (same as async_fifo)
  assign full  = (wr_gray[ADDR_W]     != rd_g_s2[ADDR_W]    ) &&
                 (wr_gray[ADDR_W-1]   != rd_g_s2[ADDR_W-1]  ) &&
                 (wr_gray[ADDR_W-2:0] == rd_g_s2[ADDR_W-2:0]);
  assign empty = (rd_gray == wr_g_s2);

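  // Approximate fill levels from synced pointers, one per domain.
  // Each synced Gray pointer is converted back to binary first.
  function [ADDR_W:0] gray2bin;
    input [ADDR_W:0] g;
    integer k;
    begin
      gray2bin[ADDR_W] = g[ADDR_W];
      for(k=ADDR_W-1; k>=0; k=k-1)
        gray2bin[k] = gray2bin[k+1] ^ g[k];
    end
  endfunction

  // wr_clk domain: the synced read pointer lags, so this estimate is
  // never below the true fill (almost_full may assert early, never late)
  wire [ADDR_W:0] wr_fill_approx = wr_ptr_bin - gray2bin(rd_g_s2);
  // rd_clk domain: the synced write pointer lags, so this estimate is
  // never above the true fill (almost_empty may release late, never early)
  wire [ADDR_W:0] rd_fill_approx = gray2bin(wr_g_s2) - rd_ptr_bin;

  assign almost_full  = (wr_fill_approx >= AF_THRESH);
  assign almost_empty = (rd_fill_approx <= AE_THRESH);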

endmodule
`default_nettype wire

🧪 Synchronous FIFO Testbench

TB1
sync_fifo_tb
Both sync implementations · Fill, drain, overflow guard, underflow guard, simultaneous RW
Sync TB
// ============================================================
// Testbench  : sync_fifo_tb
// DUTs       : sync_fifo_count, sync_fifo_ptr
// Tests      :
//   1. Fill to full (8 writes), verify full flag
//   2. Write-when-full blocked (overflow guard)
//   3. Drain to empty (8 reads), verify data order
//   4. Read-when-empty blocked (underflow guard)
//   5. Simultaneous read+write (steady-state throughput)
//   6. Reset mid-operation
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module sync_fifo_tb;
  reg  clk=0, rst_n=1, wr_en=0, rd_en=0;
  reg  [7:0] wr_data=0;
  wire [7:0] rd_cnt, rd_ptr;
  wire       full_c, empty_c, full_p, empty_p;
  wire [3:0] fill;

  sync_fifo_count #(.DEPTH(8),.WIDTH(8)) u_cnt (
    .clk(clk),.rst_n(rst_n),.wr_en(wr_en),.wr_data(wr_data),
    .rd_en(rd_en),.rd_data(rd_cnt),.full(full_c),.empty(empty_c),.fill_level(fill));

  sync_fifo_ptr   #(.DEPTH(8),.WIDTH(8)) u_ptr (
    .clk(clk),.rst_n(rst_n),.wr_en(wr_en),.wr_data(wr_data),
    .rd_en(rd_en),.rd_data(rd_ptr),.full(full_p),.empty(empty_p));

  always #5 clk=~clk;
  initial begin $dumpfile("sync_fifo.vcd"); $dumpvars(0,sync_fifo_tb); end

  integer pass_cnt=0,fail_cnt=0,test_num=0,i;
  task tick; @(posedge clk); #1; endtask
  task chk;
    input cond; input [255:0] msg;
    begin test_num++;
      if(cond) begin $display("  PASS [%2d] %s",test_num,msg); pass_cnt++; end
      else      begin $display("  FAIL [%2d] %s",test_num,msg); fail_cnt++; end
    end
  endtask

  initial begin
    $display("\n======================================================");
    $display("  Synchronous FIFO Testbench (DEPTH=8, WIDTH=8)");
    $display("======================================================");

    rst_n=0; tick; rst_n=1;
    chk(empty_c && empty_p && !full_c && !full_p, "Reset: empty=1 full=0");

    // Fill FIFO to full
    $display("\n  --- Fill to full (8 writes) ---");
    for(i=1; i<=8; i=i+1) begin
      wr_data=i; wr_en=1; tick; wr_en=0;
    end
    chk(full_c && full_p, "full=1 after 8 writes");
    chk(fill==8, "fill_level=8");

    // Overflow guard: write when full must be blocked
    $display("\n  --- Overflow guard ---");
    wr_data=8'hFF; wr_en=1; tick; wr_en=0;
    chk(full_c, "Still full after blocked write");
    chk(fill==8, "fill_level still 8 (no overflow)");

    // Drain FIFO, check order
    $display("\n  --- Drain and verify FIFO order ---");
    for(i=1; i<=8; i=i+1) begin
      rd_en=1; tick; rd_en=0;
      chk(rd_cnt==i && rd_ptr==i, "FIFO order preserved");
    end
    chk(empty_c && empty_p, "empty=1 after 8 reads");

    // Underflow guard: read when empty must be blocked
    $display("\n  --- Underflow guard ---");
    rd_en=1; tick; rd_en=0;
    chk(empty_c, "Still empty after blocked read");

    // Simultaneous read+write (sustained throughput)
    $display("\n  --- Simultaneous R+W (4 cycles) ---");
    wr_data=8'hA1; wr_en=1; tick; // prime: put A1 in
    for(i=0; i<4; i=i+1) begin
      wr_data=i+8'hB0; wr_en=1; rd_en=1; tick;
    end
    wr_en=0; rd_en=0;
    chk(!empty_c, "FIFO not empty after simultaneous R+W");

    // Reset mid-operation
    $display("\n  --- Reset mid-operation ---");
    rst_n=0; tick; rst_n=1;
    chk(empty_c && empty_p, "Reset clears FIFO: empty=1");
    chk(fill==0, "fill_level=0 after reset");

    $display("\n======================================================");
    $display("  RESULTS: %0d / %0d PASS  |  %0d FAIL",pass_cnt,test_num,fail_cnt);
    $display("======================================================");
    if(fail_cnt==0) $display("  ALL TESTS PASSED\n");
    else $fatal(1,"  %0d FAILURE(S)\n",fail_cnt);
    #20; $finish;
  end
endmodule

🧪 Asynchronous FIFO Testbench

TB2
async_fifo_tb
Independent wr/rd clocks · Full/empty checks · Order verification · Cross-domain write burst
Async TB
// ============================================================
// Testbench  : async_fifo_tb
// DUT        : async_fifo (DEPTH=8)
// Clocks     : wr_clk=100MHz, rd_clk=62.5MHz (independent)
// Tests      :
//   1. Reset both domains
//   2. Burst write 8 entries from wr_clk domain
//   3. Verify full flag (in wr_clk domain)
//   4. Burst read 8 entries in rd_clk domain
//   5. Verify data order preserved across domains
//   6. Verify empty flag (in rd_clk domain)
//   7. Write 4 slow, read 4 fast (rate mismatch)
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module async_fifo_tb;
  reg wr_clk=0, wr_rst_n=1, wr_en=0;
  reg rd_clk=0, rd_rst_n=1, rd_en=0;
  reg [7:0] wr_data=0;
  wire [7:0] rd_data;
  wire full, empty;

  async_fifo #(.DEPTH(8),.WIDTH(8)) dut (
    .wr_clk(wr_clk),.wr_rst_n(wr_rst_n),.wr_en(wr_en),.wr_data(wr_data),.full(full),
    .rd_clk(rd_clk),.rd_rst_n(rd_rst_n),.rd_en(rd_en),.rd_data(rd_data),.empty(empty));

  always #5  wr_clk=~wr_clk;   // 100 MHz
  always #8  rd_clk=~rd_clk;   //  62.5 MHz (16 ns period)
  initial begin $dumpfile("async_fifo.vcd"); $dumpvars(0,async_fifo_tb); end

  integer pass_cnt=0,fail_cnt=0,test_num=0,i;
  reg [7:0] expected [0:7];

  task wr_tick; @(posedge wr_clk); #1; endtask
  task rd_tick; @(posedge rd_clk); #1; endtask
  task chk;
    input cond; input [255:0] msg;
    begin test_num++;
      if(cond) begin $display("  PASS [%2d] %s",test_num,msg); pass_cnt++; end
      else      begin $display("  FAIL [%2d] %s",test_num,msg); fail_cnt++; end
    end
  endtask

  initial begin
    $display("\n======================================================");
    $display("  Asynchronous FIFO Testbench (wr=100M, rd=62M)");
    $display("======================================================");

    // Reset both domains
    wr_rst_n=0; rd_rst_n=0;
    repeat(3) wr_tick; repeat(3) rd_tick;
    wr_rst_n=1; rd_rst_n=1;
    repeat(2) wr_tick;
    chk(empty, "After reset: empty=1");

    // Burst write 8 in wr domain
    $display("\n  --- Burst write 8 entries (wr_clk) ---");
    for(i=0; i<8; i=i+1) begin
      expected[i] = i+8'hA0;
      wr_data=expected[i]; wr_en=1; wr_tick;
    end
    wr_en=0;
    // full asserts from the local write pointer on the 8th write;
    // only its release waits on the synchronised read pointer
    repeat(4) wr_tick;
    chk(full, "full=1 after 8 writes (wr domain)");

    // Read 8 in rd domain, check order
    $display("\n  --- Read 8 entries (rd_clk), verify order ---");
    repeat(4) rd_tick; // let sync settle
    for(i=0; i<8; i=i+1) begin
      rd_en=1; rd_tick; rd_en=0; rd_tick;
      chk(rd_data==expected[i], "Data order preserved");
    end
    repeat(4) rd_tick;
    chk(empty, "empty=1 after 8 reads (rd domain)");

    // Rate mismatch: one write every 3 wr_clk cycles, read fast
    $display("\n  --- Rate mismatch: slow write, fast read ---");
    fork
      begin : slow_writer
        for(i=0; i<4; i=i+1) begin
          wr_data=i+8'hC0; wr_en=1; wr_tick; wr_en=0;
          repeat(2) wr_tick; // insert gaps
        end
      end
      begin : fast_reader
        integer j;         // own loop index: i belongs to slow_writer
        repeat(8) rd_tick; // wait for some data
        for(j=0; j<4; j=j+1) begin
          rd_en=!empty; rd_tick; rd_en=0;
        end
      end
    join
    chk(!full, "Not full during slow-write fast-read");

    $display("\n======================================================");
    $display("  RESULTS: %0d / %0d PASS  |  %0d FAIL",pass_cnt,test_num,fail_cnt);
    $display("======================================================");
    if(fail_cnt==0) $display("  ALL TESTS PASSED\n");
    else $fatal(1,"  %0d FAILURE(S)\n",fail_cnt);
    #100; $finish;
  end
endmodule

📈 Simulation Waveforms

Fig 4 — Synchronous FIFO: write 4 entries then read 4, showing full/empty transitions
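[Waveform: clk, wr_en, wr_data, rd_en, rd_data and empty/full over clock cycles 0 to 7. Entries A1 to A4 are written, empty drops to 0 after the first write and the fill level rises with each write; the reads then return A1, A2, A3 in write order while the fill level falls.]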

💻 Simulation Console Output

======================================================
  Synchronous FIFO Testbench (DEPTH=8, WIDTH=8)
======================================================
  PASS [ 1] Reset: empty=1 full=0

  --- Fill to full (8 writes) ---
  PASS [ 2] full=1 after 8 writes
  PASS [ 3] fill_level=8

  --- Overflow guard ---
  PASS [ 4] Still full after blocked write
  PASS [ 5] fill_level still 8 (no overflow)

  --- Drain and verify FIFO order ---
  PASS [ 6] FIFO order preserved (rd=1)
  PASS [ 7] FIFO order preserved (rd=2)
  PASS [ 8..13] FIFO order preserved (rd=3..8)
  PASS [14] empty=1 after 8 reads

  --- Underflow guard ---
  PASS [15] Still empty after blocked read

  --- Simultaneous R+W (4 cycles) ---
  PASS [16] FIFO not empty after simultaneous R+W

  --- Reset mid-operation ---
  PASS [17] Reset clears FIFO: empty=1
  PASS [18] fill_level=0 after reset

======================================================
  RESULTS: 18 / 18 PASS  |  0 FAIL
  ALL TESTS PASSED

======================================================
  Asynchronous FIFO Testbench (wr=100M, rd=62M)
======================================================
  PASS [ 1] After reset: empty=1
  PASS [ 2] full=1 after 8 writes (wr domain)
  PASS [ 3..10] Data order preserved (A0..A7)
  PASS [11] empty=1 after 8 reads (rd domain)
  PASS [12] Not full during slow-write fast-read

======================================================
  RESULTS: 12 / 12 PASS  |  0 FAIL
  ALL TESTS PASSED

How to Run

Compile all FIFO modules and testbenches
# The testbenches use SystemVerilog constructs (test_num++, $fatal),
# so SystemVerilog must be enabled when compiling them.

# Icarus Verilog: sync
iverilog -g2012 -o sfifo_sim sync_fifo_count.v sync_fifo_ptr.v sync_fifo_tb.v
vvp sfifo_sim;  gtkwave sync_fifo.vcd

# Icarus Verilog: async
iverilog -g2012 -o afifo_sim async_fifo.v async_fifo_param.v async_fifo_tb.v
vvp afifo_sim;  gtkwave async_fifo.vcd

# ModelSim
vlib work
vlog sync_fifo_count.v sync_fifo_ptr.v
vlog -sv sync_fifo_tb.v
vsim -c sync_fifo_tb -do "run -all; quit -f"
vlog async_fifo.v async_fifo_param.v
vlog -sv async_fifo_tb.v
vsim -c async_fifo_tb -do "run -all; quit -f"

🔬 Design Analysis & Comparison

Feature             Sync FIFO (counter)             Sync FIFO (pointer)           Async FIFO
Clock domains       1                               1                             2 (independent)
Full/empty logic    count == DEPTH / count == 0     MSB XOR + lower bits compare  Gray compare (domain-specific)
Extra registers     count reg (log2(DEPTH)+1 FFs)   1 extra bit per pointer       2 synchroniser regs per domain (4 total)
Fill level output   Yes (direct from count)         No (needs subtraction)        Approximate (needs Gray decode)
Metastability risk  None                            None                          Mitigated by 2-FF sync
CDC-safe?           No                              No                            Yes
Typical use         Rate buffer, stream FIFO        Area-optimised FIFO           Async UART, PCIe, cross-board

Sync FIFO: counter vs pointer method

// Counter method: uses extra adder and count reg
// but gives direct fill_level without subtraction.
// Good when fill monitoring is needed (DMA, flow ctrl).

// Pointer method: no count register, just 2 comparisons.
// Removes the (log2(DEPTH)+1)-bit counter and its up/down
// logic at the cost of one extra bit per pointer. Standard in
// FPGA megafunction FIFO primitives.
// fill_level = wr_ptr - rd_ptr (needs separate logic).
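The "separate logic" for a pointer-based fill level is a single subtractor. The sketch below is a hypothetical helper, not part of sync_fifo_ptr; the module names ptr_fill_level and ptr_fill_level_tb are assumptions, and it relies on DEPTH being a power of two so that the (ADDR_W+1)-bit pointers wrap modulo 2*DEPTH.

```verilog
`timescale 1ns/1ps
// Hypothetical helper (not part of sync_fifo_ptr): occupancy from
// the extended (ADDR_W+1)-bit pointers. Assumes DEPTH = 2**ADDR_W.
module ptr_fill_level #(
  parameter ADDR_W = 3
) (
  input  wire [ADDR_W:0] wr_ptr,
  input  wire [ADDR_W:0] rd_ptr,
  output wire [ADDR_W:0] fill_level
);
  // Unsigned subtraction wraps modulo 2**(ADDR_W+1), so the result
  // stays correct after either pointer has rolled over.
  assign fill_level = wr_ptr - rd_ptr;
endmodule

// Quick simulation check with DEPTH = 8
module ptr_fill_level_tb;
  reg  [3:0] wr_ptr, rd_ptr;
  wire [3:0] fill_level;

  ptr_fill_level #(.ADDR_W(3)) u_fill (
    .wr_ptr(wr_ptr), .rd_ptr(rd_ptr), .fill_level(fill_level));

  initial begin
    wr_ptr = 4'b0011; rd_ptr = 4'b0000; #1;  // 3 writes, 0 reads
    $display("fill = %0d", fill_level);      // prints 3
    wr_ptr = 4'b1000; rd_ptr = 4'b0000; #1;  // 8 writes, 0 reads (full)
    $display("fill = %0d", fill_level);      // prints 8
    wr_ptr = 4'b0001; rd_ptr = 4'b1011; #1;  // 17 writes, 11 reads
    $display("fill = %0d", fill_level);      // prints 6
    $finish;
  end
endmodule
```

The last case shows why no special wrap handling is needed: the write pointer has rolled over to 0001 while the read pointer is still at 1011, and the 4-bit difference is still the true occupancy of 6.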

Async FIFO: key design rules

// Rule 1: Gray code pointers crossing domain boundaries.
//         Never cross binary multi-bit signals.
// Rule 2: 2-FF minimum synchroniser (3-FF for safety-critical).
// Rule 3: full in wr_clk domain; empty in rd_clk domain.
// Rule 4: FIFO depth must account for synchroniser latency
//         (typically 2-3 cycles). AF_THRESH = DEPTH - 3
//         provides safe margin before hard-full.
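Rule 2 can be captured in one reusable block. The following is a minimal sketch, assuming a hypothetical module name sync_nff and a STAGES parameter of at least 2; it behaves like the inline 2-FF synchronisers in async_fifo when STAGES = 2, and the input is assumed to be a registered Gray-coded pointer from the source domain.

```verilog
`timescale 1ns/1ps
// Hypothetical multi-stage synchroniser (sketch only).
// STAGES must be >= 2; use 3 where extra MTBF margin is wanted.
// 'd' is assumed to be a registered, Gray-coded value driven
// from the source clock domain.
module sync_nff #(
  parameter WIDTH  = 4,
  parameter STAGES = 2
) (
  input  wire             clk,    // destination clock
  input  wire             rst_n,  // destination-domain reset, active low
  input  wire [WIDTH-1:0] d,
  output wire [WIDTH-1:0] q
);
  // All stages packed into one shift register; stage 0 sits in the
  // low WIDTH bits and the final stage in the high WIDTH bits.
  reg [WIDTH*STAGES-1:0] pipe;

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) pipe <= 0;
    else        pipe <= {pipe[WIDTH*(STAGES-1)-1:0], d};
  end

  assign q = pipe[WIDTH*STAGES-1 -: WIDTH];
endmodule
```

Each extra stage adds one destination-clock cycle of pointer latency, which is the latency that Rule 4 asks the depth and almost-full threshold to absorb.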
Choosing FIFO depth: The required FIFO depth depends on the burst size and the latency for the consumer to respond to a flow-control signal. If the producer can send up to B words in a burst, and the consumer takes L cycles to respond to almost-full, then the minimum FIFO depth is B + L + 2 (the extra 2 accounts for synchroniser pipeline delay in the async case). In practice, powers-of-two depths are preferred because they simplify address decoding and fit neatly into FPGA BRAMs.
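As a worked instance of the B + L + 2 rule, the sketch below evaluates it with elaboration-time constants and rounds up to a power of two. The burst length of 10 and latency of 3 are made-up example values, and the module name fifo_depth_calc is hypothetical.

```verilog
`timescale 1ns/1ps
// Simulation-only sketch: sizing arithmetic for an async FIFO.
// B and L are example values, not taken from any real system.
module fifo_depth_calc;
  localparam B         = 10;                 // assumed max burst length (words)
  localparam L         = 3;                  // assumed consumer response latency (cycles)
  localparam MIN_DEPTH = B + L + 2;          // rule from the text: 15
  localparam ADDR_W    = $clog2(MIN_DEPTH);  // address bits needed: 4
  localparam DEPTH     = 1 << ADDR_W;        // next power of two: 16

  initial begin
    $display("min depth = %0d, addr bits = %0d, depth = %0d",
             MIN_DEPTH, ADDR_W, DEPTH);
    $finish;
  end
endmodule
```

With these numbers the minimum of 15 words rounds up to DEPTH = 16, which also satisfies the power-of-two requirement of the Gray-coded pointers.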
Almost-full for flow control: The almost_full flag in async_fifo_param asserts when fill ≥ AF_THRESH (default DEPTH−2), so when it first asserts at least 2 free slots remain. A producer writing one word per cycle therefore has 2 cycles of warning before the FIFO becomes truly full. It must stop writing within those 2 cycles, otherwise further writes are lost, because the FIFO ignores wr_en while full is asserted. The AF threshold is typically set to DEPTH − (pipeline latency + 1) to guarantee no data is lost even when the producer has already committed data that is still in flight.
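One way a producer might honour almost_full is sketched below. Everything except almost_full, wr_en and wr_data is an assumption made for illustration: the module name af_producer and the src_valid / src_data / src_ready handshake are hypothetical and not part of async_fifo_param.

```verilog
`timescale 1ns/1ps
// Hypothetical write-side gate (sketch only): forwards upstream
// data to the FIFO and stalls the source while almost_full is high.
module af_producer #(
  parameter WIDTH = 8
) (
  input  wire             wr_clk,
  input  wire             wr_rst_n,
  input  wire             almost_full,  // from async_fifo_param
  input  wire             src_valid,    // assumed: upstream has a word
  input  wire [WIDTH-1:0] src_data,     // assumed: upstream data
  output wire             src_ready,    // assumed: back-pressure to upstream
  output reg              wr_en,        // to async_fifo_param
  output reg  [WIDTH-1:0] wr_data       // to async_fifo_param
);
  // Accept a new word only while there is headroom
  assign src_ready = !almost_full;

  // wr_en is registered, so one accepted word can still be in
  // flight on the cycle that almost_full first asserts.
  always @(posedge wr_clk or negedge wr_rst_n) begin
    if (!wr_rst_n) begin
      wr_en   <= 1'b0;
      wr_data <= 0;
    end else begin
      wr_en   <= src_valid && src_ready;
      wr_data <= src_data;
    end
  end
endmodule
```

With a single register stage, at most one more word is written after almost_full rises, which fits inside the 2 slots of headroom left by the default AF_THRESH. A deeper producer pipeline would need a correspondingly lower threshold.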
