Verilog Series · Module 08

Gate Level — Arrays, Flip-Flops, Delays, Strengths & Nets

Advanced gate-level modelling: arrays of primitive instances, flip-flop design from gates, propagation delays, strength resolution, net types, and complete basic circuit designs.

🔢 Array of Instances of Primitives

When the same gate primitive must be replicated many times — one per bit of a bus — Verilog allows you to declare an array of instances in a single line using a range specifier. This avoids repetitive code and keeps the design concise.

📦
Single Declaration
One line creates N independent gate instances — one for each bit position in the range.
🔌
Bit-Sliced Ports
Each instance connects to one bit of the vector ports — a[0] → instance 0, a[1] → instance 1, etc.
All Primitives Supported
Works with and, or, nand, nor, xor, buf, not, bufif1, and all other primitives.
🔁
Saves Lines
A 32-bit bus that would need 32 individual gate lines is expressed in one. Makes wide datapaths readable.
Fig 1 — Array of instances syntax and expansion
// Without array — repetitive and error-prone for wide buses
and g0 (y[0], a[0], b[0]);
and g1 (y[1], a[1], b[1]);
and g2 (y[2], a[2], b[2]);
and g3 (y[3], a[3], b[3]);  // ... and so on

// ✅ With array of instances — equivalent, one line
and g[3:0] (y[3:0], a[3:0], b[3:0]);

// General syntax:
//   gate_type  instance_name[MSB:LSB] (out_vector, in_vector1, in_vector2);

💡 Array Examples

Fig 2 — 8-bit Bitwise AND, OR, XOR in one line each

Wide bus operations using gate arrays
module bitwise_ops (
  input  [7:0] a, b,
  output [7:0] and_out, or_out, xor_out, not_a
);
  // 8 AND gates — one per bit pair
  and g_and[7:0] (and_out, a, b);

  // 8 OR gates
  or  g_or [7:0] (or_out,  a, b);

  // 8 XOR gates
  xor g_xor[7:0] (xor_out, a, b);

  // 8 inverters (not has single input — last port)
  not g_not[7:0] (not_a,   a);
endmodule

Fig 3 — 8-bit Tri-State Bus Driver

Array of bufif1 — one per bit, all sharing the same enable
module bus_driver_8b (
  input  [7:0] data_in,
  input        oe,        // output enable — active high
  output [7:0] bus
);
  // {8{oe}} gives each of the 8 instances its own copy of the enable.
  // A plain scalar oe would also work: a 1-bit connection is shared by every instance.
  bufif1 g[7:0] (bus, data_in, {8{oe}});
endmodule

Fig 4 — 32-bit Inverter Bank

Replacing 32 not instances with one array declaration
module inv_bank_32 (
  input  [31:0] data_in,
  output [31:0] data_out
);
  // 32 inverters — declared as one array
  not inv[31:0] (data_out, data_in);
endmodule

Fig 5 — Parity Generator using XOR Array

Mix array and scalar XOR instances: compute the byte parity bit
module parity_gen (
  input  [7:0] data,
  output       parity   // 1 = odd number of 1s in data
);
  // XOR tree: fold 8 bits into 4, then 2, then 1 parity bit
  wire [3:0] p4;
  wire [1:0] p2;
  xor x_l1[3:0] (p4, data[7:4], data[3:0]);  // 4 gates: data[i+4] ^ data[i]
  xor x_l2[1:0] (p2, p4[3:2],   p4[1:0]);    // 2 gates: p4[i+2] ^ p4[i]
  xor x_l3      (parity, p2[1], p2[0]);      // final scalar gate
endmodule
Range direction matters: g[3:0] creates instances g[3], g[2], g[1], g[0] connected to bits [3], [2], [1], [0] of the port vectors respectively. Always match the range direction of your instance array with your port vectors.
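
A small self-checking testbench is one way to confirm that an array of instances behaves like its per-bit expansion. The sketch below is illustrative only: the module name tb_and_array and the signals en, y_arr and y_gated are assumptions, not part of the examples above. It also exercises a scalar connection (en), which is shared by every instance in the array.

```verilog
module tb_and_array;
  reg  [3:0] a, b;
  reg        en;
  wire [3:0] y_arr, y_gated;
  integer    i, errors;

  // Array form: instance g[i] connects to y_arr[i], a[i], b[i]
  and g[3:0] (y_arr, a, b);

  // The scalar 'en' is connected to all four instances
  and h[3:0] (y_gated, a, en);

  initial begin
    errors = 0;
    en = 1'b1;
    // Exhaustively try every combination of a and b
    for (i = 0; i < 256; i = i + 1) begin
      {a, b} = i;   // low 8 bits of i
      #1;
      if (y_arr   !== (a & b))       errors = errors + 1;
      if (y_gated !== (a & {4{en}})) errors = errors + 1;
    end
    // With en low, every gated output must be 0
    en = 1'b0; a = 4'hF; #1;
    if (y_gated !== 4'h0) errors = errors + 1;

    if (errors == 0) $display("PASS: array of instances matches bitwise AND");
    else             $display("FAIL: %0d mismatches", errors);
    $finish;
  end
endmodule
```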

🔄 Design of Flip-Flops with Gate Primitives

Flip-flops and latches are the fundamental storage elements in digital design. At the gate level they are built from cross-coupled NAND or NOR gates. Understanding their gate-level construction clarifies why they behave the way they do.

🔁
Cross-Coupled Feedback
The output of each gate feeds back as an input to the other, creating a bistable circuit that can hold one of two stable states.
Setup & Hold
Gate-level models expose propagation delays through the feedback path — the physical origin of setup and hold time requirements.
🧬
Building Blocks
SR latch → D latch → D flip-flop → JK flip-flop. Each is built from the one before it by adding control gates.

🟡 SR Latch (NOR-based)

The SR (Set-Reset) latch is the foundation of all flip-flop circuits. Two cross-coupled NOR gates form a bistable element.

Fig 6 — NOR-based SR Latch: circuit and truth table
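[Circuit diagram: two cross-coupled NOR gates (NOR1, NOR2) with inputs S and R and outputs Q and Qn]
SR Latch (NOR) Truth Table
S  R  Q     Qn    State
0  0  hold  hold  No change
1  0  1     0     Set Q=1
0  1  0     1     Reset Q=0
1  1  0     0     Forbidden! (both outputs low)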
module sr_latch_nor (
  input  s, r,
  output q, qn
);
  // Cross-coupled NOR gates: the feedback wires make this a memory element.
  // R shares a gate with Q and S shares a gate with Qn, so S=1 sets Q=1.
  nor g1 (q,  r,  qn);  // Q  = ~(R | Qn)
  nor g2 (qn, s,  q );  // Qn = ~(S | Q )
endmodule
Forbidden state S=1, R=1: Both outputs are forced to 0 simultaneously, violating Q = ~Qn. When S and R both return to 0, the final state is unpredictable — the latch may settle to either Q=0 or Q=1. This must be prevented in design.
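
The truth table can be walked through in simulation. The following is a minimal sketch, assuming a testbench name (tb_sr_latch) that does not appear in the original; the two NOR gates are instantiated inline so the block runs on its own.

```verilog
module tb_sr_latch;
  reg  s, r;
  wire q, qn;

  // NOR SR latch: R shares a gate with Q, S shares a gate with Qn
  nor g1 (q,  r, qn);
  nor g2 (qn, s, q);

  initial begin
    $monitor("t=%0t  S=%b R=%b  Q=%b Qn=%b", $time, s, r, q, qn);
    s = 0; r = 1; #10;   // reset:     Q=0, Qn=1
    s = 0; r = 0; #10;   // hold:      Q stays 0
    s = 1; r = 0; #10;   // set:       Q=1, Qn=0
    s = 0; r = 0; #10;   // hold:      Q stays 1
    s = 1; r = 1; #10;   // forbidden: Q=0 and Qn=0
    $finish;
  end
endmodule
```

The sequence stops in the forbidden state on purpose. Releasing S and R to 0 in the same time step can make a zero-delay model oscillate, which is the simulation counterpart of the unpredictable final state.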

🔵 D Flip-Flop (Gate Level)

A D flip-flop captures its input D on the rising edge of the clock and holds it until the next rising edge. At the gate level it is built from two D latches in a master-slave configuration.

Fig 7 — D Flip-Flop: NAND implementation with async reset
module dff_async_rst (
  input      clk, d, rst_n,   // rst_n active-low async reset
  output     q, qn
);
  wire clkn;
  wire ms, mr, sq, sqn;   // master latch internals
  wire s, r;              // slave latch internals

  // Clock inverter
  not  g0  (clkn, clk);

  // Master latch (transparent when clk=0)
  nand g1  (ms,  d,   clkn, rst_n);
  nand g2  (mr,  ms,  clkn      );
  nand g3  (sq,  ms,  sqn       );
  nand g4  (sqn, mr,  sq,   rst_n);  // rst_n=0 forces sqn=1, so sq=0

  // Slave latch (transparent when clk=1: captures master)
  nand g5  (s,   sq,  clk,  rst_n);
  nand g6  (r,   sqn, clk       );
  nand g7  (q,   s,   qn        );
  nand g8  (qn,  r,   q,    rst_n);  // rst_n=0 forces qn=1, so q=0 at any clock level
endmodule
Fig 8 — Master-Slave D Flip-Flop: two-latch structure
[Block diagram: D → master D latch (en = ~CLK) → master_Q → slave D latch (en = CLK) → Q]

When CLK=0: Master is transparent (tracks D), Slave is latched (holds Q). When CLK=1: Master latches (holds), Slave is transparent (propagates master’s value to Q). The edge-triggered behaviour emerges from this complementary enable scheme.
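
Each of the two stages is a gated D latch. As a minimal sketch of that intermediate building block, the module below builds one from NAND gates and checks its transparent and latched phases. The names d_latch_nand, tb_d_latch, s_n and r_n are assumptions introduced for illustration.

```verilog
// Gated D latch from NAND gates: transparent while en=1, holds while en=0
module d_latch_nand (
  input  d, en,
  output q, qn
);
  wire dn, s_n, r_n;
  not  g0 (dn,  d);
  nand g1 (s_n, d,   en);   // active-low set when d=1 and en=1
  nand g2 (r_n, dn,  en);   // active-low reset when d=0 and en=1
  nand g3 (q,   s_n, qn);   // cross-coupled NAND pair holds the state
  nand g4 (qn,  r_n, q);
endmodule

module tb_d_latch;
  reg  d, en;
  wire q, qn;

  d_latch_nand dut (.d(d), .en(en), .q(q), .qn(qn));

  initial begin
    d = 1; en = 1; #5;          // transparent: q follows d
    if (q !== 1'b1) $display("FAIL: q should follow d=1");
    en = 0; #1; d = 0; #5;      // latched: q ignores the change on d
    if (q !== 1'b1) $display("FAIL: q should hold 1");
    en = 1; #5;                 // transparent again
    if (q !== 1'b0) $display("FAIL: q should follow d=0");
    $display("done: q=%b qn=%b", q, qn);
    $finish;
  end
endmodule
```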

🟢 JK Flip-Flop (Gate Level)

The JK flip-flop eliminates the forbidden state of the SR latch. When J=1 and K=1, it toggles the output. It is built by feeding Q and Qn back as guards on the J and K inputs of an SR flip-flop.

Fig 9 — JK Flip-Flop: gate level using NAND SR core
JK Flip-Flop Truth Table
J  K  Q(next)  Action
0  0  Q        Hold
1  0  1        Set
0  1  0        Reset
1  1  ~Q       Toggle ✓
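module jk_ff (
  input  j, k, clk,
  output q, qn
);
  wire clkn, ms, mr, mq, mqn, s, r;

  not  g0 (clkn, clk);

  // Master latch (open while clk=1): J is gated with Qn and K with Q,
  // so set and reset can never be asserted together
  nand g1 (ms,  j,  clk, qn);  // ms = ~(J & CLK & Qn)
  nand g2 (mr,  k,  clk, q );  // mr = ~(K & CLK & Q)
  nand g3 (mq,  ms, mqn    );
  nand g4 (mqn, mr, mq     );

  // Slave latch (open while clk=0): Q updates on the falling clock edge.
  // Without the slave stage, J=K=1 would make Q oscillate while clk is high.
  nand g5 (s,   mq,  clkn  );
  nand g6 (r,   mqn, clkn  );
  nand g7 (q,   s,   qn    );
  nand g8 (qn,  r,   q     );
  // Note: there is no reset, so q and qn start as x in simulation
endmodule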

Delays

Real gates do not switch instantaneously — signals take time to propagate through transistors. Verilog gate primitives allow you to model these propagation delays directly using the # delay specifier.

📈
Rise Delay
Time for output to transition from 0→1 (or x/z→1). Denoted t_rise.
📉
Fall Delay
Time for output to transition from 1→0 (or x/z→0). Denoted t_fall.
Turn-Off Delay
Time for output to transition to high-impedance z (tri-state gates only). Denoted t_off.
⚙️
Min:Typ:Max
Each delay can be given as min:typ:max to model the fast, typical, and slow process corners.

📐 Delay Types & Syntax

Format     Name          Meaning
#5         Single        Same delay for rise, fall, turn-off
#(3,4)     Two Values    rise=3, fall=4
#(2,3,1)   Three Values  rise=2, fall=3, turn-off=1
#(1:2:3)   Min:Typ:Max   process corners
Fig 10 — All delay formats on gate primitives
// Single delay — same for all transitions
and  #5          g1 (y, a, b);   // 5 time units for any transition

// Two delays — (rise, fall)
and  #(3, 4)      g2 (y, a, b);   // rise=3, fall=4 time units

// Three delays — (rise, fall, turn-off) for tri-state gates
bufif1 #(2, 3, 1) g3 (out, in, en); // rise=2, fall=3, z=1

// Min:Typ:Max — three delay values per transition
nand #(1:2:3, 2:3:4) g4 (y, a, b); // rise: 1-2-3, fall: 2-3-4

// With a timescale directive (placed before the module declaration)
`timescale 1ns/100ps       // time_unit / time_precision
or #(2.5, 3.0) g5 (y, a, b); // 2.5ns rise, 3.0ns fall

Fig 11 — Delay Waveform: How Output Lags Input

and #(3,5) g1 (y, a, b) — output delayed from input change
[Waveform: a, b and y plotted over time 0 to 40; y rises 3 ns after the inputs make the AND true]
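
The separate rise and fall delays can be observed with a short testbench. This is a sketch under stated assumptions: the testbench name tb_rise_fall, the 1ns/100ps timescale, and the stimulus times are chosen for illustration only.

```verilog
`timescale 1ns/100ps
module tb_rise_fall;
  reg     a, b;
  wire    y;
  integer errors;

  and #(3, 5) g1 (y, a, b);   // rise = 3 ns, fall = 5 ns

  initial begin
    errors = 0;
    a = 0; b = 0;              // y settles to 0 during the first few ns
    #10 a = 1; b = 1;          // t=10: inputs go high, y should rise at t=13
    #2  if (y !== 1'b0) errors = errors + 1;   // t=12: still low
    #2  if (y !== 1'b1) errors = errors + 1;   // t=14: has risen
    #6  a = 0;                 // t=20: input falls, y should fall at t=25
    #4  if (y !== 1'b1) errors = errors + 1;   // t=24: still high
    #2  if (y !== 1'b0) errors = errors + 1;   // t=26: has fallen
    if (errors == 0) $display("PASS: rise=3, fall=5 observed");
    else             $display("FAIL: %0d timing checks failed", errors);
    $finish;
  end
endmodule
```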

Inertial vs Transport Delay

⚙️ Inertial Delay (Gate Default)

A pulse shorter than the delay is swallowed — it never appears at the output. Models real gate behaviour where narrow glitches are filtered.

// Pulse < 5ns is absorbed
and #5 g1(y, a, b);
// Default — inertial model

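📡 Transport Delay (non-blocking intra-assignment delay)

Every transition is delayed by exactly the specified time and no pulses are swallowed. Models wire/interconnect delay. A continuous assign with a delay (assign #5 y = a & b;) is inertial, just like a gate, so it does not give transport behaviour.

// All pulses pass through, just delayed (y must be declared as reg)
always @(a or b) y <= #5 a & b;
// non-blocking assignment with intra-assignment delay: transport model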
Delays are simulation-only in RTL flows. Gate-level delay annotations (#5) are ignored by synthesis tools — the actual timing is determined by the standard cell library delays after synthesis. Use delays in testbenches and gate-level simulation (with SDF back-annotation), not in RTL.
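
The difference between the two delay models shows up when a pulse is shorter than the delay. The sketch below is illustrative: the testbench name and signal names are assumptions, and it uses a non-blocking assignment with an intra-assignment delay as the transport-delay construct.

```verilog
`timescale 1ns/1ns
module tb_inertial_vs_transport;
  reg  a;
  wire y_inertial;
  reg  y_transport;

  // Gate delay is inertial: a pulse shorter than 5 ns is filtered out
  buf #5 g1 (y_inertial, a);

  // Non-blocking assignment with intra-assignment delay: every edge is kept
  always @(a) y_transport <= #5 a;

  initial begin
    a = 0;
    #10 a = 1;      // a 2 ns pulse starts at t=10
    #2  a = 0;      // and ends at t=12
    #4;             // t=16
    if (y_inertial  !== 1'b0) $display("FAIL: short pulse should be swallowed");
    if (y_transport !== 1'b1) $display("FAIL: delayed pulse should be high at t=16");
    #2;             // t=18
    if (y_transport !== 1'b0) $display("FAIL: delayed pulse should have ended at t=17");
    $display("t=%0t inertial=%b transport=%b", $time, y_inertial, y_transport);
    $finish;
  end
endmodule
```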

💪 Strengths & Contention Resolution

When multiple gate outputs drive the same wire simultaneously, Verilog uses a strength resolution algorithm to determine the final value. Every gate output has a drive strength — the stronger source wins the contention.

Level  Strength  Category    Default for
7      supply    Driving     supply0 / supply1 nets
6      strong    Driving     All gate primitive outputs & assign
5      pull      Driving     Pull-up / pull-down resistors
4      large     Capacitive  trireg (large)
3      weak      Driving     Weak drivers (user-specified)
2      medium    Capacitive  trireg (medium, default)
1      small     Capacitive  trireg (small)
0      highz     High-Z      Undriven / tri-state off

Contention Resolution Rules

🏆
Stronger Wins
If two drivers have different strengths, the higher-strength driver determines the net value. The weaker driver’s value is overridden.
Equal Strength Conflict
If two drivers of the same strength drive opposite values (0 vs 1), the result is x — unknown. A design error.
🔋
Capacitive Retention
A trireg net retains its last driven value when all drivers go high-Z. The capacitive strength decays over time in real silicon.
🔀
Supply Always Wins
supply0/supply1 at strength 7 overrides any weaker driver. Models a direct connection to a power rail.
Fig 12 — Strength specification and resolution examples
// Specify drive strength explicitly: (strength1, strength0)
assign (strong1, strong0) net = 1'b1;   // strength 6
assign (pull1,   pull0)   net = 1'b0;   // strength 5
// Result: net = 1 (strong beats pull)

// Two equal-strength drivers — contention → x
assign (strong1, strong0) net = 1'b0;
assign (strong1, strong0) net = 1'b1;
// Result: net = x (conflict!)

// Pull-up resistor (weak) — overridden by any gate output
assign (weak1, highz0) pullup_net = 1'b1;  // default high
assign                   pullup_net = driven; // strong driver overrides

// trireg — holds value when all drivers go z
trireg cap_node;
bufif1 g1 (cap_node, data, oe);
// When oe=0: cap_node retains last driven value (capacitive hold)
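
The resolution rules can be checked in one small testbench. This is a sketch, not a definitive reference: the names tb_strengths, n_win and n_clash are assumptions, and the %v format specifier is used to print each net's strength alongside its value.

```verilog
module tb_strengths;
  reg    data, oe;
  wire   n_win, n_clash;
  trireg cap_node;

  // Strong driver against pull driver: the strong value wins
  assign (strong1, strong0) n_win = 1'b1;
  assign (pull1,   pull0)   n_win = 1'b0;

  // Two strong drivers with opposite values: the result is x
  assign (strong1, strong0) n_clash = 1'b0;
  assign (strong1, strong0) n_clash = 1'b1;

  // trireg keeps its charge when its only driver turns off
  bufif1 g1 (cap_node, data, oe);

  initial begin
    data = 1; oe = 1; #1;
    $display("n_win=%v n_clash=%v cap_node=%v", n_win, n_clash, cap_node);
    oe = 0; #1;   // driver goes high-Z
    $display("after oe=0: cap_node=%v", cap_node);
    if (n_win === 1'b1 && n_clash === 1'bx && cap_node === 1'b1)
      $display("PASS");
    else
      $display("FAIL");
    $finish;
  end
endmodule
```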

🔌 Net Types

Nets model physical wires that continuously carry values driven by sources. Different net types specify how multiple simultaneous drivers are resolved. Choosing the right net type is essential for correct simulation of wired logic and bus structures.

wire / tri: Standard connection

Models a simple wire. With a single driver, straightforward. With multiple drivers, strength resolution is applied. tri is identical to wire but signals tri-state intent to readers.

wire y;
tri  [7:0] bus;

wand / triand: Wired-AND resolution

Multiple drivers are AND-resolved: any driver pulling to 0 forces the net to 0, regardless of other drivers. Models open-collector/open-drain bus topologies (e.g., I²C).

wand sda;           // I2C data line
// driver A: 1, driver B: 0 → sda = 0

wor / trior: Wired-OR resolution

Multiple drivers are OR-resolved: any driver pulling to 1 forces the net to 1. Models open-emitter bus topologies.

wor int_line;        // interrupt bus
// driver A: 0, driver B: 1 → int_line = 1

trireg: Capacitive storage net

Retains its last driven value when all drivers go to z. Models charge storage on a capacitive node. Can be sized (small/medium/large).

trireg         cap;      // medium (default)
trireg (large)  cap_l;
trireg (small)  cap_s;

supply0 / supply1: Power / Ground rails

Permanently tied to logic 0 (GND) or logic 1 (VDD) at the highest drive strength — supply (level 7). Cannot be overridden by any other driver.

supply0 GND;
supply1 VDD;

tri0 / tri1: Default pull to 0 or 1

When undriven, tri0 resolves to 0 and tri1 resolves to 1 (at pull strength). Useful for resistor pull-up/down modelling.

tri1 pullup_line;    // high when floating
tri0 pulldown_line;  // low when floating

Net Type Resolution Summary

Net Type       0 vs 1 (same strength)  Undriven value    Typical use
wire / tri     x (conflict)            z                 General connections
wand / triand  0 (AND wins)            z                 Open-drain / I²C bus
wor / trior    1 (OR wins)             z                 Open-emitter / interrupt
trireg         x (conflict)            last value (cap)  Dynamic logic, charge storage
supply0        always 0                0                 GND net
supply1        always 1                1                 VDD net
tri0           x (conflict)            0 (pull)          Pull-down resistor
tri1           x (conflict)            1 (pull)          Pull-up resistor
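
Several rows of the summary can be reproduced by driving each net type with the same pair of conflicting values. The testbench below is a minimal sketch with assumed names (tb_net_types and its signals); the pulled nets get a single disabled tri-state driver so that their undriven value is visible.

```verilog
module tb_net_types;
  reg  a, b, en;
  wire w;
  wand wa;
  wor  wo;
  tri1 t1;
  tri0 t0;

  // Two drivers on each resolved net
  assign w  = a;  assign w  = b;
  assign wa = a;  assign wa = b;
  assign wo = a;  assign wo = b;

  // One tri-state driver on each pulled net
  bufif1 d1 (t1, a, en);
  bufif1 d0 (t0, a, en);

  initial begin
    a = 0; b = 1; en = 0; #1;   // conflicting drivers, tri-state drivers off
    $display("wire=%b wand=%b wor=%b tri1=%b tri0=%b", w, wa, wo, t1, t0);
    // prints: wire=x wand=0 wor=1 tri1=1 tri0=0
    if (w === 1'bx && wa === 1'b0 && wo === 1'b1 && t1 === 1'b1 && t0 === 1'b0)
      $display("PASS");
    else
      $display("FAIL");
    $finish;
  end
endmodule
```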

🏗 Design of Basic Circuits

Putting it all together: complete gate-level designs of fundamental digital building blocks, using instance arrays and special net types where they fit.

Fig 13 — 4-to-1 Multiplexer

4-to-1 MUX using AND-OR-NOT — y = selected input based on sel[1:0]
module mux_4to1 (
  input      in0, in1, in2, in3,
  input [1:0] sel,
  output     y
);
  wire s0n, s1n;        // inverted select lines
  wire a0, a1, a2, a3; // AND gate outputs

  // Invert select lines
  not  g_s0n  (s0n, sel[0]);
  not  g_s1n  (s1n, sel[1]);

  // AND each input with its select decode
  and  g_a0   (a0, in0, s0n,    s1n    ); // sel=00
  and  g_a1   (a1, in1, sel[0], s1n    ); // sel=01
  and  g_a2   (a2, in2, s0n,    sel[1]); // sel=10
  and  g_a3   (a3, in3, sel[0], sel[1]); // sel=11

  // OR all AND outputs together
  or   g_or   (y,  a0, a1, a2, a3);
endmodule

Fig 14 — 2-to-4 Decoder

Decoder: when en=1, exactly one output goes high based on the 2-bit input; when en=0, all outputs are low
module decoder_2to4 (
  input  [1:0] in,
  input        en,         // enable — active high
  output [3:0] out
);
  wire in0n, in1n;

  not  g0  (in0n, in[0]);
  not  g1  (in1n, in[1]);

  and  g_0 (out[0], en, in0n,    in1n    ); // in=00
  and  g_1 (out[1], en, in[0], in1n    ); // in=01
  and  g_2 (out[2], en, in0n,    in[1]); // in=10
  and  g_3 (out[3], en, in[0], in[1]); // in=11
endmodule

Fig 15 — 8-bit Comparator (A == B)

Equality comparator: XNOR all bit pairs, AND the results
module comparator_8b (
  input  [7:0] a, b,
  output       eq   // eq=1 when a==b
);
  wire [7:0] bit_eq;

  // XNOR each bit pair — bit_eq[i]=1 when a[i]==b[i]
  xnor g_xnor[7:0] (bit_eq, a, b);

  // AND all bit_eq together — eq=1 only if all bits match
  and  g_and        (eq, bit_eq[7], bit_eq[6], bit_eq[5],
                         bit_eq[4], bit_eq[3], bit_eq[2],
                         bit_eq[1], bit_eq[0]);
endmodule

Fig 16 — Open-Drain I²C Bus using wand

Multiple masters on I²C SDA line — any device can pull low
module i2c_bus_model (
  input       sda_a, sda_b, sda_c,  // device outputs
  input       oe_a,  oe_b,  oe_c,   // output enables
  output wand sda_bus               // wand models open-drain: any 0 driver pulls the bus low
);

  // tri1: bus is high when no device drives (pulled up by resistor)
  tri1 pullup_sda;
  buf  g_pu (sda_bus, pullup_sda);

  // Each device drives through a tri-state buffer
  bufif1 g_a (sda_bus, sda_a, oe_a);
  bufif1 g_b (sda_bus, sda_b, oe_b);
  bufif1 g_c (sda_bus, sda_c, oe_c);
endmodule
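
An open-drain line can also be modelled on an ordinary wire by letting each device drive only a strong 0 and leaving the high level to a pullup primitive. The following is an alternative sketch, not the method used above; the module and port names (open_drain_line, pull_low_a, pull_low_b) are assumptions for illustration.

```verilog
module open_drain_line (
  input  pull_low_a, pull_low_b,   // 1 = that device pulls the line low
  output sda
);
  pullup pu    (sda);                      // pull-strength 1 when nobody drives
  bufif1 drv_a (sda, 1'b0, pull_low_a);    // strong 0 when enabled, z otherwise
  bufif1 drv_b (sda, 1'b0, pull_low_b);
endmodule

module tb_open_drain;
  reg     a, b;
  wire    sda;
  integer i, errors;

  open_drain_line dut (.pull_low_a(a), .pull_low_b(b), .sda(sda));

  initial begin
    errors = 0;
    for (i = 0; i < 4; i = i + 1) begin
      {a, b} = i; #1;
      // The line is high only when no device pulls it low
      if (sda !== !(a | b)) errors = errors + 1;
    end
    if (errors == 0) $display("PASS: line is low whenever any device pulls low");
    else             $display("FAIL: %0d mismatches", errors);
    $finish;
  end
endmodule
```

Because the devices never drive a 1, the only contention is strong 0 against pull 1, which strength resolution settles in favour of 0 without producing x.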

Fig 17 — 4-bit Shift Register (Gate Level D Flip-Flops)

Chain four DFFs: data shifts one stage from q[0] toward q[3] on every rising clock edge
module shift_reg_4bit (
  input        clk, rst_n, serial_in,
  output [3:0] q
);
  // Each DFF output feeds the next DFF's input
  wire [3:0] qn;   // complementary outputs (unused externally)

  // Instantiate 4 D flip-flops in series
  dff_async_rst ff0 (.clk(clk), .rst_n(rst_n), .d(serial_in), .q(q[0]), .qn(qn[0]));
  dff_async_rst ff1 (.clk(clk), .rst_n(rst_n), .d(q[0]),      .q(q[1]), .qn(qn[1]));
  dff_async_rst ff2 (.clk(clk), .rst_n(rst_n), .d(q[1]),      .q(q[2]), .qn(qn[2]));
  dff_async_rst ff3 (.clk(clk), .rst_n(rst_n), .d(q[2]),      .q(q[3]), .qn(qn[3]));
endmodule
Bottom-up design flow: All the circuits above follow the same pattern. Define primitive building blocks (gates, flip-flops), then compose them into larger systems. This is the essence of hierarchical structural modelling. Each module can be verified independently before integration, so a failure seen at the top level can be traced back to a small, already-tested block.
