Verilog Designs · Module 31

Verilog Designs — PIPO Shift Register — VLSI Trainers

PIPO Shift Register

A complete Parallel-In Parallel-Out (PIPO) shift register in four implementations: basic 4-bit, with shift enable and right shift, bidirectional, and universal (hold, shift right, shift left, and parallel load in one module). Includes function tables, circuit diagrams, waveforms, and a self-checking testbench.

Introduction & Shift Register Types

A shift register is a cascade of flip-flops where data propagates from one stage to the next on each clock edge. Shift registers are classified by how data enters and exits — serially (one bit at a time) or in parallel (all bits simultaneously). PIPO is the simplest mode: all bits load at once and all bits read out at once.

PIPO: In: Parallel (N bits) · Out: Parallel (N bits)
SISO: In: Serial (1 bit) · Out: Serial (1 bit)
SIPO: In: Serial (1 bit) · Out: Parallel (N bits)
PISO: In: Parallel (N bits) · Out: Serial (1 bit)

📥 Parallel Load: All N input bits are captured simultaneously at a single clock edge. The register holds the value until the next load or shift.
📤 Parallel Output: All N stored bits are available at the output continuously. No shifting is required to read the full word.
🔄 Shift Capability: PIPO can also shift left or right when no load is asserted, enabling use as a delay line or serial-to-parallel buffer.
🛠 Applications: Pipeline registers, bus latches, data alignment buffers, control word registers, parallel data retiming.

📋 PIPO Theory & Operation

In PIPO mode, a load signal gates the parallel input data into all flip-flops simultaneously. Between loads, the register holds its current value. Optionally, a shift enable can move data left or right through the flip-flop chain.

Function Table (4-bit PIPO with shift)

rst_n  load  shift_en  dir  Q[3:0] (next)  Operation
  0     x       x       x   0000           Synchronous reset
  1     1       x       x   d_in[3:0]      Parallel load (PIPO)
  1     0       1       1   {Q[2:0],0}     Shift left (data moves toward the MSB)
  1     0       1       0   {0,Q[3:1]}     Shift right (data moves toward the LSB)
  1     0       0       x   Q[3:0]         Hold, no change

PIPO as a Parallel Register (Hold Mode)

[Diagram: d_in feeds DFF[3], DFF[2], DFF[1], and DFF[0] in parallel; their outputs Q[3], Q[2], Q[1], and Q[0] together form q_out.]

In PIPO mode each DFF receives its corresponding d_in bit directly (not chained). In shift mode, each DFF receives its neighbour’s output. The load signal selects between these two paths using a MUX at each DFF’s D input.

🔌 Circuit Diagram

Fig 1 — 4-bit PIPO: each DFF has a MUX selecting parallel-in or shift chain
[Diagram: DFF[3] to DFF[0] share clk, with inputs d[3] to d[0] and outputs q[3] to q[0]. load=1: each DFF gets its own d_in bit (parallel). load=0, shift=1: data propagates along the dashed chain.]
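
The per-bit MUX described above can be written out structurally. The following is a minimal sketch, not one of the article's four modules: the names pipo_mux_sketch, chain, and d_next are illustrative, and only the right-shift chain is modelled (as in pipo_shift), with hold as the third MUX input.

```verilog
// Sketch only: structural view of the MUX at each DFF's D input.
// Names (pipo_mux_sketch, chain, d_next) are illustrative assumptions.
module pipo_mux_sketch #(parameter N = 4) (
  input  wire         clk,
  input  wire         rst_n,     // synchronous, active-low
  input  wire         load,      // 1 = take d_in[i] (parallel path)
  input  wire         shift_en,  // 1 = take neighbour (when load=0)
  input  wire         ser_in,    // fills the MSB on a right shift
  input  wire [N-1:0] d_in,
  output reg  [N-1:0] q
);
  // chain[i+1] is the right-shift source for bit i:
  // q[i+1] for the lower bits, ser_in for the MSB.
  wire [N:0]   chain = {ser_in, q};
  wire [N-1:0] d_next;

  genvar i;
  generate
    for (i = 0; i < N; i = i + 1) begin : g_bit
      assign d_next[i] = load     ? d_in[i]    :   // parallel load
                         shift_en ? chain[i+1] :   // shift chain
                                    q[i];          // hold
    end
  endgenerate

  always @(posedge clk)
    if (!rst_n) q <= {N{1'b0}};
    else        q <= d_next;
endmodule
```

A synthesis tool would normally infer the same MUX structure from the behavioural if/else form used in the implementations below; the explicit version is only meant to make the two data paths visible.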

Implementation 1 — Basic PIPO

The simplest PIPO: synchronous reset, parallel load. On each rising clock edge, when load=1 the entire input word d_in is captured into the register. Between loads the register holds its value.

pipo_basic: 4-bit · Sync reset · Parallel load · Hold
// ============================================================
// Module   : pipo_basic
// Function : 4-bit Parallel-In Parallel-Out register
// Operation: load=1 -> q <= d_in  (parallel capture)
//            load=0 -> q holds    (no change)
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module pipo_basic (
  input           clk,
  input           rst_n,    // synchronous active-low reset
  input           load,    // 1 = capture d_in, 0 = hold
  input  [3:0]   d_in,    // parallel data input
  output reg [3:0] q        // parallel data output
);

  always @(posedge clk) begin
    if      (!rst_n) q <= 4'b0000;  // synchronous reset
    else if (load)   q <= d_in;     // parallel load
    // else: hold (implicit)
  end

endmodule
`default_nettype wire
PIPO is simply a parallel register: Without any shift operation, a PIPO is identical to a standard D flip-flop register with a load enable. The “shift register” aspect becomes relevant only when data also needs to move between the stages. The basic PIPO is the building block for pipeline stages: a pipeline register in a processor is a PIPO register.

🔵 Implementation 2 — With Clock Enable and Shift

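Extends the basic PIPO with a shift-right capability. The shift_en input serves as the enable: the module has no separate clock-enable pin, and the register holds whenever load=0 and shift_en=0. When load=0 and shift_en=1, data shifts one position right each cycle, and the ser_in pin feeds the vacated MSB position.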

pipo_shift: 4-bit · Parallel load · Shift-right · Serial input · Shift enable
// ============================================================
// Module   : pipo_shift
// Priority : rst_n > load > shift_en > hold
// Shift    : right-shift: q <= {ser_in, q[3:1]}
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module pipo_shift (
  input           clk,
  input           rst_n,
  input           load,     // parallel load (priority over shift)
  input           shift_en, // enable shift when load=0
  input           ser_in,   // serial input (fills MSB on right-shift)
  input  [3:0]   d_in,
  output reg [3:0] q,
  output          ser_out   // serial output = LSB shifted out
);

  always @(posedge clk) begin
    if      (!rst_n)   q <= 4'b0;
    else if (load)     q <= d_in;               // parallel load
    else if (shift_en) q <= {ser_in, q[3:1]};   // shift right
    // else: hold
  end

  assign ser_out = q[0];  // LSB is the serial output

endmodule
`default_nettype wire
Fig 2 — Right-shift operation: ser_in enters MSB, ser_out exits LSB each cycle
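Cycle  Load  d_in   ser_in  q[3:0]   ser_out
  0     1    1011     x     1011       1
  1     0     -       0     0101       1     <- shift right: 0|101, bit dropped: 1
  2     0     -       0     0010       0     <- shift right: 0|010, bit dropped: 1
  3     0     -       1     1001       1     <- shift right: 1|001, bit dropped: 0
  4     0     -       0     0100       0     <- shift right: 0|100, bit dropped: 1
(q[3:0] and ser_out are the values after that cycle's clock edge; ser_out = q[0].)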
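
The load-then-shift sequence of Fig 2 can be replayed in a few lines. This is a hedged sketch: fig2_walkthrough and the step task are hypothetical names, and the local register q is a stand-in that copies pipo_shift's load and right-shift behaviour with shift_en assumed tied high, so the file runs on its own.

```verilog
`timescale 1ns/1ps
// Sketch only: replays the Fig 2 stimulus on a local stand-in register.
module fig2_walkthrough;
  reg        clk    = 0;
  reg        load   = 0;
  reg        ser_in = 0;
  reg  [3:0] d_in   = 4'b0000;
  reg  [3:0] q      = 4'b0000;   // stand-in for pipo_shift's q
  wire       ser_out = q[0];     // same assignment as in pipo_shift

  always #5 clk = ~clk;

  // Load has priority; otherwise shift right (shift_en assumed high)
  always @(posedge clk)
    if (load) q <= d_in;
    else      q <= {ser_in, q[3:1]};

  // Apply one cycle of stimulus, then print the state after the edge
  task step;
    input       ld;
    input [3:0] d;
    input       si;
    begin
      load = ld; d_in = d; ser_in = si;
      @(posedge clk); #1;
      $display("load=%b ser_in=%b q=%04b ser_out=%b", ld, si, q, ser_out);
    end
  endtask

  initial begin
    step(1'b1, 4'b1011, 1'b0);   // q becomes 1011
    step(1'b0, 4'b0000, 1'b0);   // q becomes 0101
    step(1'b0, 4'b0000, 1'b0);   // q becomes 0010
    step(1'b0, 4'b0000, 1'b1);   // q becomes 1001
    step(1'b0, 4'b0000, 1'b0);   // q becomes 0100
    $finish;
  end
endmodule
```

Because ser_out is a continuous assignment from q[0], each printed ser_out is simply the LSB of the q value on the same line; the bit that left the register on that edge is the previous line's LSB.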

🟠 Implementation 3 — Bidirectional PIPO

Adds left-shift capability alongside right-shift, controlled by the dir signal. This creates a complete bidirectional shift register that can shift in either direction, load in parallel, or hold.

pipo_bidir: 4-bit · Parallel load · Left-shift · Right-shift · Hold
// ============================================================
// Module   : pipo_bidir
// dir=1: shift left  q <= {q[2:0], ser_in_l}  MSB exits
// dir=0: shift right q <= {ser_in_r, q[3:1]}  LSB exits
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module pipo_bidir (
  input           clk, rst_n,
  input           load,      // parallel load
  input           shift_en, // enable shift
  input           dir,       // 1=left, 0=right
  input           ser_in_l, // serial input for left-shift (fills LSB)
  input           ser_in_r, // serial input for right-shift (fills MSB)
  input  [3:0]   d_in,
  output reg [3:0] q,
  output          ser_out_l, // MSB exits on left-shift
  output          ser_out_r  // LSB exits on right-shift
);

  always @(posedge clk) begin
    if (!rst_n)
      q <= 4'b0;
    else if (load)
      q <= d_in;
    else if (shift_en) begin
      if (dir)
        q <= {q[2:0], ser_in_l};  // shift left: LSB<-ser_in_l
      else
        q <= {ser_in_r, q[3:1]};  // shift right: MSB<-ser_in_r
    end
  end

  assign ser_out_l = q[3]; // MSB exits on left-shift
  assign ser_out_r = q[0]; // LSB exits on right-shift

endmodule
`default_nettype wire
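
As a usage illustration, the two serial outputs can be wired back to the serial inputs so that shifting becomes rotation. This is a sketch under assumptions: rotate4_sketch and rot_en are hypothetical names, and the file must be compiled together with the pipo_bidir module above.

```verilog
// Sketch only: wraps pipo_bidir so that a shift becomes a rotate.
// Requires pipo_bidir (listed above) in the same compilation.
module rotate4_sketch (
  input  wire       clk,
  input  wire       rst_n,
  input  wire       load,
  input  wire       rot_en,   // hypothetical name: enables rotation
  input  wire       dir,      // 1 = rotate left, 0 = rotate right
  input  wire [3:0] d_in,
  output wire [3:0] q
);
  wire msb_out;  // q[3], leaves on a left shift
  wire lsb_out;  // q[0], leaves on a right shift

  pipo_bidir u_bidir (
    .clk(clk), .rst_n(rst_n),
    .load(load), .shift_en(rot_en), .dir(dir),
    .ser_in_l(msb_out),   // left rotate: old MSB re-enters at the LSB
    .ser_in_r(lsb_out),   // right rotate: old LSB re-enters at the MSB
    .d_in(d_in), .q(q),
    .ser_out_l(msb_out), .ser_out_r(lsb_out)
  );
endmodule
```

The feedback does not create a combinational loop, because ser_out_l and ser_out_r come straight from the register outputs and only reach q again through the clocked always block.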

🟣 Implementation 4 — Universal Shift Register

The universal shift register implements all four modes controlled by a 2-bit mode select. It covers PIPO (load), SISO shift-right, SISO shift-left, and hold — making it usable as any of the four shift register types.

pipo_universal: All 4 modes · 2-bit mode select · Parameterised N-bit · Complete universal register
// ============================================================
// Module   : pipo_universal
// mode[1:0] encoding:
//   00 = HOLD       (no change)
//   01 = SHIFT RIGHT (ser_in -> MSB, LSB -> ser_out)
//   10 = SHIFT LEFT  (ser_in -> LSB, MSB -> ser_out)
//   11 = PARALLEL LOAD (d_in -> q)
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module pipo_universal #(parameter N = 4) (
  input              clk, rst_n,
  input  [1:0]      mode,    // 00=hold 01=shr 10=shl 11=load
  input              ser_in, // serial input (used by both shift modes)
  input  [N-1:0]  d_in,
  output reg [N-1:0] q,
  output             ser_out // MSB on left-shift, LSB on right-shift
);

  localparam [1:0]
    HOLD  = 2'b00,
    SHR   = 2'b01,  // shift right
    SHL   = 2'b10,  // shift left
    LOAD  = 2'b11;  // parallel load

  always @(posedge clk) begin
    if (!rst_n)
      q <= {N{1'b0}};
    else
      case (mode)
        HOLD: ; // no change
        SHR : q <= {ser_in, q[N-1:1]};   // right: ser_in fills MSB
        SHL : q <= {q[N-2:0], ser_in};   // left:  ser_in fills LSB
        LOAD: q <= d_in;                  // parallel load
      endcase
  end

  // ser_out follows MSB for left-shift, LSB for right-shift
  assign ser_out = (mode == SHL) ? q[N-1] : q[0];

endmodule
`default_nettype wire
Universal register mode table: Mode 00 = hold (q unchanged); Mode 01 = shift right (ser_in enters at the MSB, data moves toward the LSB, ser_out = q[0]); Mode 10 = shift left (ser_in enters at the LSB, data moves toward the MSB, ser_out = q[N-1]); Mode 11 = parallel load. The four classical types follow from how the ports are used: shift and watch ser_out for SISO, shift and read q for SIPO, load then shift while watching ser_out for PISO, and load then read q for PIPO.
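
To make the SIPO use concrete, here is a minimal sketch of a serial-to-parallel receiver built on the same right-shift expression as mode 01. The module name sipo_rx_sketch, the bit_valid qualifier, the word_done pulse, and the 8-bit counter (which assumes N is at most 255) are all assumptions added for illustration; none of them appear in the modules above.

```verilog
// Sketch only: SIPO use of the right-shift mode.
// bit_valid and word_done are hypothetical signals for illustration.
module sipo_rx_sketch #(parameter N = 4) (
  input  wire         clk,
  input  wire         rst_n,      // synchronous, active-low
  input  wire         bit_valid,  // high when ser_in carries a new bit
  input  wire         ser_in,
  output reg  [N-1:0] q,          // parallel word, valid when word_done=1
  output reg          word_done   // one-cycle pulse after the Nth bit
);
  reg [7:0] count;                // bits received so far (assumes N <= 255)

  always @(posedge clk) begin
    if (!rst_n) begin
      q         <= {N{1'b0}};
      count     <= 8'd0;
      word_done <= 1'b0;
    end else begin
      word_done <= 1'b0;                  // default: no pulse
      if (bit_valid) begin
        q <= {ser_in, q[N-1:1]};          // same expression as mode 01 (SHR)
        if (count == N-1) begin
          count     <= 8'd0;
          word_done <= 1'b1;              // Nth bit is being shifted in
        end else begin
          count <= count + 8'd1;
        end
      end
    end
  end
endmodule
```

With a right shift, the first bit received ends up in q[0] after N shifts, so this arrangement suits an LSB-first serial stream.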

🧪 Comprehensive Testbench

The testbench verifies all four implementations simultaneously. Key tests: parallel load captures the correct word, shift-right moves bits correctly with ser_in, shift-left reverses direction, hold freezes the register, and the universal mode register correctly switches between all four modes.

pipo_tb: All 4 implementations · Load, shift, hold, mode-switch · Serial I/O verification
// ============================================================
// Testbench  : pipo_tb
// DUTs       : pipo_basic, pipo_shift, pipo_bidir, pipo_universal
// Tests      : Parallel load, shift-right, shift-left,
//              hold, mode switch, reset, serial I/O
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module pipo_tb;

  reg clk=0, rst_n=1, load=0, shift_en=0, dir=0, ser_in=0;
  reg [3:0] d_in=0;
  reg [1:0] mode=2'b11;

  wire [3:0] q_basic, q_shift, q_bidir, q_univ;
  wire sout_shift, sout_univ, sout_l, sout_r;

  pipo_basic             u_b  (.clk(clk),.rst_n(rst_n),.load(load),.d_in(d_in),.q(q_basic));
  pipo_shift             u_s  (.clk(clk),.rst_n(rst_n),.load(load),.shift_en(shift_en),.ser_in(ser_in),.d_in(d_in),.q(q_shift),.ser_out(sout_shift));
  pipo_bidir             u_bd (.clk(clk),.rst_n(rst_n),.load(load),.shift_en(shift_en),.dir(dir),.ser_in_l(ser_in),.ser_in_r(ser_in),.d_in(d_in),.q(q_bidir),.ser_out_l(sout_l),.ser_out_r(sout_r));
  pipo_universal #(.N(4)) u_u  (.clk(clk),.rst_n(rst_n),.mode(mode),.ser_in(ser_in),.d_in(d_in),.q(q_univ),.ser_out(sout_univ));

  always #5 clk = ~clk;
  initial begin $dumpfile("pipo.vcd"); $dumpvars(0,pipo_tb); end

  integer pass_cnt=0, fail_cnt=0, test_num=0;

  task tick; @(posedge clk); #1; endtask

  task check_all;
    input [3:0] exp;
    input [255:0] msg;
    begin
      test_num++;
      if(q_basic===exp && q_shift===exp && q_bidir===exp && q_univ===exp) begin
        $display("  PASS [%2d] %s | q=%04b",test_num,msg,q_basic);
        pass_cnt++;
      end else begin
        $display("  FAIL [%2d] %s | basic=%04b shift=%04b bidir=%04b univ=%04b exp=%04b",
          test_num,msg,q_basic,q_shift,q_bidir,q_univ,exp);
        fail_cnt++;
      end
    end
  endtask

  // pipo_shift has no left-shift path, so when dir=1 (left-shift phase)
  // the first expected value is compared against pipo_bidir instead.
  task check_shift;
    input [3:0] exp_shift, exp_univ;
    input [255:0] msg;
    reg [3:0] q_obs;
    begin
      q_obs = dir ? q_bidir : q_shift;
      test_num++;
      if(q_obs===exp_shift && q_univ===exp_univ) begin
        $display("  PASS [%2d] %s | q_shift=%04b q_univ=%04b",test_num,msg,q_obs,q_univ);
        pass_cnt++;
      end else begin
        $display("  FAIL [%2d] %s | shift=%04b univ=%04b exp_s=%04b exp_u=%04b",
          test_num,msg,q_obs,q_univ,exp_shift,exp_univ);
        fail_cnt++;
      end
    end
  endtask

  initial begin
    $display("\n======================================================");
    $display("  PIPO Shift Register Testbench");
    $display("======================================================");

    // Reset
    $display("\n  --- Phase 1: Reset ---");
    rst_n=0; mode=2'b11; tick;
    check_all(4'b0000, "Reset -> q=0000");

    // Parallel load
    $display("\n  --- Phase 2: Parallel Load ---");
    rst_n=1; load=1; mode=2'b11; d_in=4'b1011; tick;
    check_all(4'b1011, "Load 1011");
    d_in=4'b1100; tick;
    check_all(4'b1100, "Load 1100");
    d_in=4'b0000; tick;
    check_all(4'b0000, "Load 0000");
    d_in=4'b1111; tick;
    check_all(4'b1111, "Load 1111");

    // Hold
    $display("\n  --- Phase 3: Hold (load=0, shift=0) ---");
    load=0; shift_en=0; mode=2'b00; tick;
    check_all(4'b1111, "Hold: q unchanged");
    tick; check_all(4'b1111, "Hold: q unchanged");

    // Shift right: load 1011 then shift right 4 times with ser_in=0
    $display("\n  --- Phase 4: Right Shift (ser_in=0) ---");
    load=1; mode=2'b11; d_in=4'b1011; tick;
    load=0; shift_en=1; mode=2'b01; ser_in=0;
    tick; check_shift(4'b0101,4'b0101,"SHR 1011->0101");
    tick; check_shift(4'b0010,4'b0010,"SHR 0101->0010");
    tick; check_shift(4'b0001,4'b0001,"SHR 0010->0001");
    tick; check_shift(4'b0000,4'b0000,"SHR 0001->0000");

    // Shift right with ser_in=1
    $display("\n  --- Phase 5: Right Shift (ser_in=1) ---");
    load=1; mode=2'b11; d_in=4'b0000; tick;
    load=0; shift_en=1; mode=2'b01; ser_in=1;
    tick; check_shift(4'b1000,4'b1000,"SHR(ser=1) 0000->1000");
    tick; check_shift(4'b1100,4'b1100,"SHR(ser=1) 1000->1100");
    tick; check_shift(4'b1110,4'b1110,"SHR(ser=1) 1100->1110");
    tick; check_shift(4'b1111,4'b1111,"SHR(ser=1) 1110->1111");

    // Shift left
    $display("\n  --- Phase 6: Left Shift (ser_in=0) ---");
    load=1; mode=2'b11; d_in=4'b1011; tick;
    load=0; shift_en=1; dir=1; mode=2'b10; ser_in=0;
    tick; check_shift(4'b0110,4'b0110,"SHL 1011->0110");
    tick; check_shift(4'b1100,4'b1100,"SHL 0110->1100");
    tick; check_shift(4'b1000,4'b1000,"SHL 1100->1000");
    tick; check_shift(4'b0000,4'b0000,"SHL 1000->0000");

    // Universal mode switch
    $display("\n  --- Phase 7: Universal mode switching ---");
    mode=2'b11; d_in=4'b1010; tick;
    test_num++; if(q_univ===4'b1010) begin
      $display("  PASS [%2d] UNIV LOAD 1010",test_num); pass_cnt++; end
    else begin $display("  FAIL UNIV LOAD"); fail_cnt++; end
    mode=2'b00; tick;
    test_num++; if(q_univ===4'b1010) begin
      $display("  PASS [%2d] UNIV HOLD",test_num); pass_cnt++; end
    else begin $display("  FAIL UNIV HOLD"); fail_cnt++; end
    mode=2'b01; ser_in=1; tick;
    test_num++; if(q_univ===4'b1101) begin
      $display("  PASS [%2d] UNIV SHR 1010->1101",test_num); pass_cnt++; end
    else begin $display("  FAIL UNIV SHR"); fail_cnt++; end
    mode=2'b10; ser_in=0; tick;
    test_num++; if(q_univ===4'b1010) begin
      $display("  PASS [%2d] UNIV SHL 1101->1010",test_num); pass_cnt++; end
    else begin $display("  FAIL UNIV SHL"); fail_cnt++; end

    $display("\n======================================================");
    $display("  RESULTS: %0d / %0d PASS  |  %0d FAIL",pass_cnt,test_num,fail_cnt);
    $display("======================================================");
    if(fail_cnt==0) $display("  ALL TESTS PASSED\n");
    else $fatal(1,"  %0d FAILURE(S)\n",fail_cnt);
    #20; $finish;
  end
endmodule
`default_nettype wire
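
The testbench above checks four hand-picked load values. For a 4-bit register it is cheap to sweep all 16, checking both capture and hold for each one. The following is a sketch under assumptions: pipo_load_sweep_tb is a hypothetical name, and the file must be compiled together with pipo_basic.v.

```verilog
`timescale 1ns/1ps
// Sketch only: sweeps every 4-bit value through pipo_basic.
// Requires pipo_basic (listed earlier) in the same compilation.
module pipo_load_sweep_tb;
  reg        clk   = 0;
  reg        rst_n = 0;
  reg        load  = 0;
  reg  [3:0] d_in  = 4'b0000;
  wire [3:0] q;
  integer    i;
  integer    errors;

  pipo_basic dut (.clk(clk), .rst_n(rst_n), .load(load), .d_in(d_in), .q(q));

  always #5 clk = ~clk;

  initial begin
    errors = 0;
    @(posedge clk); #1;              // one clock edge with reset asserted
    rst_n = 1;
    for (i = 0; i < 16; i = i + 1) begin
      // Load value i and confirm it was captured
      load = 1; d_in = i[3:0];
      @(posedge clk); #1;
      if (q !== i[3:0]) begin
        errors = errors + 1;
        $display("LOAD mismatch: d_in=%04b q=%04b", d_in, q);
      end
      // Drop load, change d_in, and confirm the register holds
      load = 0; d_in = ~i[3:0];
      @(posedge clk); #1;
      if (q !== i[3:0]) begin
        errors = errors + 1;
        $display("HOLD mismatch: expected=%04b q=%04b", i[3:0], q);
      end
    end
    if (errors == 0) $display("load/hold sweep passed");
    else             $display("load/hold sweep: %0d error(s)", errors);
    $finish;
  end
endmodule
```

Driving the inverted value onto d_in during the hold cycle makes a stuck-open load path visible, since a register that ignored load would capture the complement.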

📈 Simulation Waveform

Fig 3 — PIPO waveform: parallel load 1011, then shift-right 4 cycles with ser_in=0
[Waveform: signals clk, rst_n, load, d_in, q[3:0], ser_in, ser_out over cycles 0 to 6. q[3:0] is 0000 after reset, 1011 after LOAD, then 0101, 0010, 0001, 0000 over the four SHR cycles.]

💻 Simulation Console Output

======================================================
  PIPO Shift Register Testbench
======================================================

  --- Phase 1: Reset ---
  PASS [ 1] Reset -> q=0000 | q=0000

  --- Phase 2: Parallel Load ---
  PASS [ 2] Load 1011 | q=1011
  PASS [ 3] Load 1100 | q=1100
  PASS [ 4] Load 0000 | q=0000
  PASS [ 5] Load 1111 | q=1111

  --- Phase 3: Hold (load=0, shift=0) ---
  PASS [ 6] Hold: q unchanged | q=1111
  PASS [ 7] Hold: q unchanged | q=1111

  --- Phase 4: Right Shift (ser_in=0) ---
  PASS [ 8] SHR 1011->0101 | q_shift=0101 q_univ=0101
  PASS [ 9] SHR 0101->0010 | q_shift=0010 q_univ=0010
  PASS [10] SHR 0010->0001 | q_shift=0001 q_univ=0001
  PASS [11] SHR 0001->0000 | q_shift=0000 q_univ=0000

  --- Phase 5: Right Shift (ser_in=1) ---
  PASS [12] SHR(ser=1) 0000->1000 | q_shift=1000 q_univ=1000
  PASS [13] SHR(ser=1) 1000->1100 | q_shift=1100 q_univ=1100
  PASS [14] SHR(ser=1) 1100->1110 | q_shift=1110 q_univ=1110
  PASS [15] SHR(ser=1) 1110->1111 | q_shift=1111 q_univ=1111

  --- Phase 6: Left Shift (ser_in=0) ---
  PASS [16] SHL 1011->0110 | q_shift=0110 q_univ=0110
  PASS [17] SHL 0110->1100 | q_shift=1100 q_univ=1100
  PASS [18] SHL 1100->1000 | q_shift=1000 q_univ=1000
  PASS [19] SHL 1000->0000 | q_shift=0000 q_univ=0000

  --- Phase 7: Universal mode switching ---
  PASS [20] UNIV LOAD 1010
  PASS [21] UNIV HOLD
  PASS [22] UNIV SHR 1010->1101
  PASS [23] UNIV SHL 1101->1010

======================================================
  RESULTS: 23 / 23 PASS  |  0 FAIL
======================================================
  ALL TESTS PASSED

How to Run

Compile all modules and testbench
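# Icarus Verilog (-g2012: the testbench uses the SystemVerilog ++ operator and $fatal)
iverilog -g2012 -o pipo_sim \
    pipo_basic.v pipo_shift.v pipo_bidir.v pipo_universal.v pipo_tb.v
vvp pipo_sim
gtkwave pipo.vcd

# ModelSim (-sv: compile the .v files as SystemVerilog)
vlib work
vlog -sv pipo_basic.v pipo_shift.v pipo_bidir.v pipo_universal.v pipo_tb.v
vsim -c pipo_tb -do "run -all; quit -f"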

🔬 Design Analysis & Comparison

Shift Register Type Comparison

Type  Input mode               Output mode                    Latency   Typical use
PIPO  Parallel (N bits/cycle)  Parallel (N bits, same cycle)  1 cycle   Pipeline register, bus latch, data alignment
SISO  Serial (1 bit/cycle)     Serial (1 bit/cycle)           N cycles  Delay line, CDC retiming, serial link buffering
SIPO  Serial (1 bit/cycle)     Parallel (N bits)              N cycles  Serial-to-parallel converter, UART RX, SPI RX
PISO  Parallel (N bits)        Serial (1 bit/cycle)           N cycles  Parallel-to-serial converter, UART TX, SPI TX

Implementation Feature Matrix

Module          Width  Load  Shift-R  Shift-L  Hold  Ser I/O
pipo_basic      4      Yes   No       No       Yes   No
pipo_shift      4      Yes   Yes      No       Yes   ser_in/out
pipo_bidir      4      Yes   Yes      Yes      Yes   L+R pins
pipo_universal  N      Yes   Yes      Yes      Yes   ser_in/out

PIPO as Pipeline Stage

// Two-stage pipeline using PIPO:
// Stage 1 computes, Stage 2 registers
wire [7:0] stage1_out;
reg  [7:0] stage2_reg;

// stage1_out = combinational logic
assign stage1_out = a + b;

// PIPO register captures result each cycle
always @(posedge clk)
  stage2_reg <= stage1_out; // pipo_basic

Universal Register as PISO

// Load word, then shift out serially:
// Acts as PISO (parallel-in serial-out)

// Cycle 0: mode=11 (load), d_in=1011
// Cycles 1-4: mode=01 (shift right)
// ser_out sequence: 1, 1, 0, 1 (LSB first)

// This is the data path of a UART TX (start/stop framing not shown):
// Load byte, shift out LSB-first at baud rate
// mode must be declared as reg [1:0] to be assigned here
always @(posedge baud_clk)
  mode <= (load_byte) ? 2'b11 : 2'b01;
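
The load-then-shift pattern above can be wrapped into a small serializer that also tracks how many bits remain. This is a minimal sketch, not a UART: piso_tx_sketch, busy, and bits_left are hypothetical names, clk is assumed to tick once per bit, the counter assumes N is at most 255, and start/stop framing is left out.

```verilog
// Sketch only: PISO serializer (load, then shift out LSB first).
// Names and the busy flag are illustrative; no start/stop framing.
module piso_tx_sketch #(parameter N = 8) (
  input  wire         clk,        // assumed to run at the bit rate
  input  wire         rst_n,      // synchronous, active-low
  input  wire         load_byte,  // 1 = capture d_in (like mode 11)
  input  wire [N-1:0] d_in,
  output wire         ser_out,    // current bit, LSB first
  output wire         busy        // high while ser_out carries a data bit
);
  reg [N-1:0] q;
  reg [7:0]   bits_left;          // assumes N <= 255

  always @(posedge clk) begin
    if (!rst_n) begin
      q         <= {N{1'b0}};
      bits_left <= 8'd0;
    end else if (load_byte) begin
      q         <= d_in;                 // parallel load
      bits_left <= N;
    end else if (bits_left != 8'd0) begin
      q         <= {1'b0, q[N-1:1]};     // shift right (like mode 01)
      bits_left <= bits_left - 8'd1;
    end
  end

  assign ser_out = q[0];
  assign busy    = (bits_left != 8'd0);
endmodule
```

In the cycle after the load, ser_out already shows d_in[0]; each further cycle presents the next higher bit, and busy drops once all N bits have been presented.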
PIPO vs a plain register: A plain D register (always @(posedge clk) q <= d;) is equivalent to a PIPO with load permanently tied high. Adding the load control allows the register to hold its current value when load=0, enabling conditional update — the fundamental mechanism behind clock enables, write enables in register files, and pipeline stalls (hazard freezing). Every control register, pipeline stage, and CSR (Control and Status Register) in a processor is a PIPO register with a qualified write enable.
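
The conditional-update idea can be sketched as a pipeline stage that freezes on a stall. The names pipe_stage_stall_sketch, stall, stage_in, and stage_out are hypothetical; the body is the same reset/load/hold priority as pipo_basic with load driven by the inverted stall signal.

```verilog
// Sketch only: pipeline register that holds its value while stalled.
// stall is a hypothetical hazard signal: 1 = freeze this stage.
module pipe_stage_stall_sketch #(parameter W = 8) (
  input  wire         clk,
  input  wire         rst_n,      // synchronous, active-low
  input  wire         stall,
  input  wire [W-1:0] stage_in,   // result from the previous stage
  output reg  [W-1:0] stage_out   // registered value seen by the next stage
);
  wire load = ~stall;             // update only when not stalled

  always @(posedge clk) begin
    if      (!rst_n) stage_out <= {W{1'b0}};
    else if (load)   stage_out <= stage_in;
    // else: hold, the stage keeps its previous contents
  end
endmodule
```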
PIPO for data bus retiming: When two logic blocks run at slightly different phases of the same clock (due to clock skew), a PIPO register inserted between them acts as a retiming element. The register captures data at the source clock edge and presents it cleanly to the destination logic. The same idea underlies pipeline register insertion, where PIPO stages are added on long paths, by hand or with the register retiming that many synthesis tools support, to help meet timing constraints.
