8-bit ALU — Design & Testbench
Complete 8-bit Arithmetic Logic Unit with 8 operations (addition, subtraction, AND, OR, XOR, NOT, shift-left, shift-right), zero, carry, negative and overflow flags, plus a self-checking testbench that covers all operation codes with corner-case and random vectors.
⚙️ Introduction & Theory
An Arithmetic Logic Unit (ALU) is the computational core of every processor. It performs arithmetic operations (addition, subtraction) and logical operations (AND, OR, XOR, NOT, shifts) on one or two operands, selecting the operation via an operation code (opcode). The ALU also produces status flags that record properties of the result; the control unit uses them for branch decisions.
📋 ALU Operations & Flags
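Operation Code Table
| op[2:0] | Mnemonic | Result |
|---|---|---|
| 000 | ADD | A + B |
| 001 | SUB | A − B |
| 010 | AND | A AND B (bitwise) |
| 011 | OR | A OR B (bitwise) |
| 100 | XOR | A XOR B (bitwise) |
| 101 | NOT | NOT A (B is unused) |
| 110 | SHL | A shifted left by 1, zero fill |
| 111 | SHR | A shifted right by 1, zero fill (logical shift) |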
Status Flags
Z: result == 0 · C: unsigned overflow · N: result[7] == 1 · V: signed overflow
| Flag | Name | Asserts when | Operations that set it |
|---|---|---|---|
| Z | Zero | result == 8'b0 (result is all zeros) | All operations |
| C | Carry / Borrow | ADD: 9th bit of A+B; SUB: borrow; SHL: old MSB; SHR: old LSB | ADD, SUB, SHL, SHR (0 otherwise) |
| N | Negative | result[7] == 1 (MSB set = negative in two’s complement) | All operations |
| V | Signed Overflow | ADD: pos+pos=neg or neg+neg=pos; SUB: pos-neg=neg or neg-pos=pos | ADD, SUB only (0 otherwise) |
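Signed overflow rule. For ADD: V = (A[7] == B[7]) && (result[7] != A[7]). For SUB (A−B), compute A + ~B + 1 and apply the same rule with ~B[7] in place of B[7], which gives V = (A[7] != B[7]) && (result[7] != A[7]). Negating B first (~B + 1) and then reusing the ADD rule would miss the overflow when B = 0x80, because −(−128) is not representable. The rule flags every signed result that falls outside the representable range (−128 to +127 for 8-bit two’s complement).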
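The sign-bit rule can be checked against a plain range test. The following is a minimal simulation-only sketch, not part of the original design: the module name overflow_rule_check and its variables are hypothetical. It sweeps all 65,536 operand pairs, computes V from the sign bits, and compares it with whether the true signed result falls outside −128 to +127.

```verilog
`timescale 1ns/1ps
// Hypothetical helper, not from the original article: checks that the
// sign-bit overflow rules agree with a direct signed range test.
module overflow_rule_check;
    reg  [7:0] a, b, sum, diff;
    reg        v_add, v_sub;
    integer    i, wide, errors;

    initial begin
        errors = 0;
        for (i = 0; i < 65536; i = i + 1) begin
            {a, b} = i[15:0];
            sum   = a + b;                  // truncated to 8 bits
            diff  = a - b;
            // Sign-bit rules from the text
            v_add = (a[7] == b[7]) && (sum[7]  != a[7]);
            v_sub = (a[7] != b[7]) && (diff[7] != a[7]);
            // True signed results, evaluated at 32-bit integer width
            wide = $signed(a) + $signed(b);
            if (v_add !== ((wide > 127) || (wide < -128))) errors = errors + 1;
            wide = $signed(a) - $signed(b);
            if (v_sub !== ((wide > 127) || (wide < -128))) errors = errors + 1;
        end
        $display("overflow rule mismatches: %0d", errors);
        $finish;
    end
endmodule
```

Run on its own in any Verilog-2001 simulator, the sweep is expected to report zero mismatches, since both expressions describe the same condition.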
🔌 Block Diagram
🧠 Behavioral Implementation
The behavioral model uses a case statement on the opcode inside an always @(*) block. Each case branch computes the result and, where relevant, the carry and overflow flags for its operation; both flags default to 0. The zero and negative flags are derived from the final result after the case statement.
// ============================================================
// Module   : alu_8bit
// Function : 8-bit ALU, 8 operations, 4 status flags
// Inputs   : a[7:0], b[7:0]  - operands
//            op[2:0]         - operation select
// Outputs  : result[7:0]     - computed output
//            zero, carry, negative, overflow - status flags
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module alu_8bit (
    input  wire [7:0] a,         // operand A
    input  wire [7:0] b,         // operand B
    input  wire [2:0] op,        // operation select
    output reg  [7:0] result,    // 8-bit result
    output reg        zero,      // Z: result is all zeros
    output reg        carry,     // C: unsigned carry/borrow/shift-out
    output reg        negative,  // N: result MSB (sign bit)
    output reg        overflow   // V: signed arithmetic overflow
);

    // ── Operation select constants ────────────────────────────
    localparam [2:0]
        ADD = 3'b000, SUB = 3'b001,
        AND = 3'b010, OR  = 3'b011,
        XOR = 3'b100, NOT = 3'b101,
        SHL = 3'b110, SHR = 3'b111;

    // ── 9-bit internal result to capture carry ────────────────
    reg [8:0] full_result;  // bit[8] = carry out

    always @(*) begin
        // Default: clear all flags, no carry
        full_result = 9'b0;
        carry       = 1'b0;
        overflow    = 1'b0;

        case (op)
            // ── Arithmetic ─────────────────────────────────────────
            ADD: begin
                full_result = {1'b0, a} + {1'b0, b};
                carry       = full_result[8];
                // Signed overflow: same-sign operands produce opposite-sign result
                overflow = (~a[7] & ~b[7] &  full_result[7]) |
                           ( a[7] &  b[7] & ~full_result[7]);
            end
            SUB: begin
                full_result = {1'b0, a} - {1'b0, b};
                carry       = full_result[8];  // borrow flag for SUB
                // Overflow: pos-neg=neg or neg-pos=pos
                overflow = (~a[7] &  b[7] &  full_result[7]) |
                           ( a[7] & ~b[7] & ~full_result[7]);
            end

            // ── Logical ────────────────────────────────────────────
            AND: full_result[7:0] = a & b;
            OR:  full_result[7:0] = a | b;
            XOR: full_result[7:0] = a ^ b;
            NOT: full_result[7:0] = ~a;  // B is unused

            // ── Shift ──────────────────────────────────────────────
            SHL: begin
                carry            = a[7];            // MSB shifts into carry
                full_result[7:0] = {a[6:0], 1'b0};  // shift left, fill 0
            end
            SHR: begin
                carry            = a[0];            // LSB shifts into carry
                full_result[7:0] = {1'b0, a[7:1]};  // shift right, fill 0
            end

            default: full_result = 9'bx;
        endcase

        // ── Status flags derived from result ─────────────────────
        result   = full_result[7:0];
        zero     = (result == 8'b0);  // Z: all bits zero?
        negative = result[7];         // N: sign bit
    end

endmodule

`default_nettype wire
⚡ Data Flow Implementation
The data flow model computes all eight operations in parallel using continuous assignments, then selects the active result and flags using a MUX structure driven by the opcode. This style makes the parallel nature of combinational hardware more explicit.
// ============================================================
// Module   : alu_8bit_dataflow
// Approach : All 8 results computed simultaneously in parallel;
//            opcode selects which result propagates to output.
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module alu_8bit_dataflow (
    input  wire [7:0] a, b,
    input  wire [2:0] op,
    output wire [7:0] result,
    output wire       zero, carry, negative, overflow
);

    // ── Compute all operations in parallel ───────────────────
    wire [8:0] add_r = {1'b0, a} + {1'b0, b};
    wire [8:0] sub_r = {1'b0, a} - {1'b0, b};
    wire [7:0] and_r = a & b;
    wire [7:0] or_r  = a | b;
    wire [7:0] xor_r = a ^ b;
    wire [7:0] not_r = ~a;
    wire [7:0] shl_r = {a[6:0], 1'b0};
    wire [7:0] shr_r = {1'b0, a[7:1]};

    // ── MUX: select result by opcode ─────────────────────────
    assign result =
        (op == 3'b000) ? add_r[7:0] :
        (op == 3'b001) ? sub_r[7:0] :
        (op == 3'b010) ? and_r :
        (op == 3'b011) ? or_r  :
        (op == 3'b100) ? xor_r :
        (op == 3'b101) ? not_r :
        (op == 3'b110) ? shl_r :
                         shr_r;

    // ── Carry flag by operation ───────────────────────────────
    assign carry =
        (op == 3'b000) ? add_r[8] :  // ADD: carry out
        (op == 3'b001) ? sub_r[8] :  // SUB: borrow
        (op == 3'b110) ? a[7]     :  // SHL: old MSB
        (op == 3'b111) ? a[0]     :  // SHR: old LSB
                         1'b0;       // AND/OR/XOR/NOT: no carry

    // ── Overflow: only meaningful for ADD and SUB ────────────
    assign overflow =
        (op == 3'b000) ? ((~a[7] & ~b[7] &  add_r[7]) |
                          ( a[7] &  b[7] & ~add_r[7])) :
        (op == 3'b001) ? ((~a[7] &  b[7] &  sub_r[7]) |
                          ( a[7] & ~b[7] & ~sub_r[7])) :
                         1'b0;

    // ── Zero and negative flags ───────────────────────────────
    assign zero     = (result == 8'b0);
    assign negative = result[7];

endmodule

`default_nettype wire
🧪 Comprehensive Testbench
The testbench verifies all 8 operations with directed corner cases (0x00, 0xFF, 0x7F, 0x80), overflow-triggering pairs, and random vectors: 291 vectors in total, which is a sample of the 2^19 possible input combinations rather than an exhaustive sweep. The behavioral and data flow implementations are driven by the same stimulus, and each is compared against a reference model.
// ============================================================
// Testbench : alu_8bit_tb
// DUTs      : alu_8bit (behavioral) + alu_8bit_dataflow
// Strategy  : Per-operation corner cases + random sweep
//             Checks: result, zero, carry, negative, overflow
// ============================================================
`timescale 1ns/1ps
`default_nettype none

module alu_8bit_tb;

    // ── Shared stimulus ────────────────────────────────────────
    reg [7:0] a, b;
    reg [2:0] op;

    // ── Behavioral DUT outputs ─────────────────────────────────
    wire [7:0] beh_result;
    wire       beh_z, beh_c, beh_n, beh_v;

    // ── Data flow DUT outputs ──────────────────────────────────
    wire [7:0] df_result;
    wire       df_z, df_c, df_n, df_v;

    // ── Instantiate both DUTs ──────────────────────────────────
    alu_8bit dut_beh (
        .a(a), .b(b), .op(op),
        .result(beh_result), .zero(beh_z), .carry(beh_c),
        .negative(beh_n), .overflow(beh_v)
    );
    alu_8bit_dataflow dut_df (
        .a(a), .b(b), .op(op),
        .result(df_result), .zero(df_z), .carry(df_c),
        .negative(df_n), .overflow(df_v)
    );

    // ── VCD dump ───────────────────────────────────────────────
    initial begin
        $dumpfile("alu_8bit.vcd");
        $dumpvars(0, alu_8bit_tb);
    end

    // ── Test tracking ──────────────────────────────────────────
    integer pass_cnt = 0, fail_cnt = 0, test_num = 0;
    integer seed = 1337;

    // ── Expected values (computed by reference model) ──────────
    reg [7:0] exp_result;
    reg       exp_z, exp_c, exp_n, exp_v;
    reg [8:0] tmp9;

    // ── Reference model task: golden values ───────────────────
    task compute_expected;
        begin
            exp_c = 0;
            exp_v = 0;
            case (op)
                3'b000: begin
                    tmp9       = {1'b0, a} + {1'b0, b};
                    exp_result = tmp9[7:0];
                    exp_c      = tmp9[8];
                    exp_v      = (~a[7] & ~b[7] &  exp_result[7]) |
                                 ( a[7] &  b[7] & ~exp_result[7]);
                end
                3'b001: begin
                    tmp9       = {1'b0, a} - {1'b0, b};
                    exp_result = tmp9[7:0];
                    exp_c      = tmp9[8];
                    exp_v      = (~a[7] &  b[7] &  exp_result[7]) |
                                 ( a[7] & ~b[7] & ~exp_result[7]);
                end
                3'b010: exp_result = a & b;
                3'b011: exp_result = a | b;
                3'b100: exp_result = a ^ b;
                3'b101: exp_result = ~a;
                3'b110: begin exp_result = {a[6:0], 1'b0}; exp_c = a[7]; end
                3'b111: begin exp_result = {1'b0, a[7:1]}; exp_c = a[0]; end
                default: exp_result = 8'hxx;
            endcase
            exp_z = (exp_result == 8'b0);
            exp_n = exp_result[7];
        end
    endtask

    // ── Self-checking task ─────────────────────────────────────
    task check;
        input [7:0] ta, tb;
        input [2:0] top;
        begin
            a = ta; b = tb; op = top;
            #5;
            compute_expected;
            test_num = test_num + 1;  // Verilog-2001 form; ++ needs SystemVerilog
            if (beh_result === exp_result && beh_z === exp_z && beh_c === exp_c &&
                beh_n === exp_n && beh_v === exp_v &&
                df_result === exp_result && df_z === exp_z && df_c === exp_c &&
                df_n === exp_n && df_v === exp_v) begin
                $display(" PASS [%3d] op=%3b a=0x%02h b=0x%02h | res=0x%02h Z=%b C=%b N=%b V=%b",
                         test_num, op, a, b, exp_result, exp_z, exp_c, exp_n, exp_v);
                pass_cnt = pass_cnt + 1;
            end else begin
                $display(" FAIL [%3d] op=%3b a=0x%02h b=0x%02h", test_num, op, a, b);
                $display("   BEH: res=0x%02h Z=%b C=%b N=%b V=%b",
                         beh_result, beh_z, beh_c, beh_n, beh_v);
                $display("   DF:  res=0x%02h Z=%b C=%b N=%b V=%b",
                         df_result, df_z, df_c, df_n, df_v);
                $display("   EXP: res=0x%02h Z=%b C=%b N=%b V=%b",
                         exp_result, exp_z, exp_c, exp_n, exp_v);
                fail_cnt = fail_cnt + 1;
            end
            #5;
        end
    endtask

    // ── Stimulus ───────────────────────────────────────────────
    initial begin
        $display("\n========================================================");
        $display(" 8-bit ALU Testbench: Behavioral vs Data Flow");
        $display("========================================================");

        // ── ADD: corner cases ─────────────────────────────────
        $display("\n --- ADD (000) ---");
        check(8'h00, 8'h00, 3'b000);  // 0+0=0, Z=1
        check(8'h01, 8'h01, 3'b000);  // 1+1=2
        check(8'h7F, 8'h01, 3'b000);  // 127+1=128, overflow! (pos+pos=neg)
        check(8'hFF, 8'h01, 3'b000);  // 255+1=0, carry=1, Z=1
        check(8'h80, 8'h80, 3'b000);  // -128+(-128), carry+overflow
        check(8'hFF, 8'hFF, 3'b000);  // 255+255=510, carry
        check(8'h55, 8'hAA, 3'b000);  // 85+170=255, no carry

        // ── SUB: corner cases ─────────────────────────────────
        $display("\n --- SUB (001) ---");
        check(8'h05, 8'h03, 3'b001);  // 5-3=2
        check(8'h00, 8'h00, 3'b001);  // 0-0=0, Z=1
        check(8'h00, 8'h01, 3'b001);  // 0-1=255, borrow (carry=1), N=1
        check(8'h80, 8'h01, 3'b001);  // -128-1 = overflow! (neg-pos=pos)
        check(8'h7F, 8'hFF, 3'b001);  // 127-(-1)=overflow (pos-neg=neg)
        check(8'hAA, 8'hAA, 3'b001);  // equal operands → 0, Z=1

        // ── AND ───────────────────────────────────────────────
        $display("\n --- AND (010) ---");
        check(8'hFF, 8'h00, 3'b010);  // FF AND 00 = 00, Z=1
        check(8'hFF, 8'hFF, 3'b010);  // FF AND FF = FF
        check(8'hAA, 8'h55, 3'b010);  // 10101010 AND 01010101 = 0, Z=1
        check(8'hF0, 8'h0F, 3'b010);  // complementary halves = 0, Z=1

        // ── OR ────────────────────────────────────────────────
        $display("\n --- OR (011) ---");
        check(8'h00, 8'h00, 3'b011);  // 00 OR 00 = 00, Z=1
        check(8'hAA, 8'h55, 3'b011);  // 10101010 OR 01010101 = FF
        check(8'hF0, 8'h0F, 3'b011);  // complementary → FF, N=1

        // ── XOR ───────────────────────────────────────────────
        $display("\n --- XOR (100) ---");
        check(8'hAA, 8'hAA, 3'b100);  // same operands → 0, Z=1
        check(8'hAA, 8'h55, 3'b100);  // complementary → FF
        check(8'hFF, 8'h00, 3'b100);  // FF XOR 00 = FF, N=1

        // ── NOT ───────────────────────────────────────────────
        $display("\n --- NOT (101) ---");
        check(8'h00, 8'h00, 3'b101);  // NOT 00 = FF, N=1
        check(8'hFF, 8'h00, 3'b101);  // NOT FF = 00, Z=1
        check(8'hAA, 8'h00, 3'b101);  // NOT AA = 55
        check(8'h55, 8'h00, 3'b101);  // NOT 55 = AA, N=1

        // ── SHL ───────────────────────────────────────────────
        $display("\n --- SHL (110) ---");
        check(8'h01, 8'h00, 3'b110);  // 00000001 → 00000010
        check(8'h80, 8'h00, 3'b110);  // 10000000 → 00000000, carry=1, Z=1
        check(8'hAA, 8'h00, 3'b110);  // 10101010 → 01010100, carry=1
        check(8'h40, 8'h00, 3'b110);  // 01000000 → 10000000, N=1

        // ── SHR ───────────────────────────────────────────────
        $display("\n --- SHR (111) ---");
        check(8'h80, 8'h00, 3'b111);  // 10000000 → 01000000
        check(8'h01, 8'h00, 3'b111);  // 00000001 → 00000000, carry=1, Z=1
        check(8'h55, 8'h00, 3'b111);  // 01010101 → 00101010, carry=1
        check(8'hFF, 8'h00, 3'b111);  // 11111111 → 01111111, carry=1

        // ── Random sweep: 32 vectors per operation ────────────
        $display("\n --- RANDOM VECTORS (32 per op = 256 total) ---");
        begin : rand_loop
            integer i, opi;
            for (opi = 0; opi < 8; opi = opi + 1) begin
                for (i = 0; i < 32; i = i + 1) begin
                    check({$random(seed)} % 256, {$random(seed)} % 256, opi[2:0]);
                end
            end
        end

        // ── Summary ────────────────────────────────────────────
        $display("\n========================================================");
        $display(" RESULTS: %0d / %0d PASS | %0d FAIL", pass_cnt, test_num, fail_cnt);
        $display("========================================================");
        if (fail_cnt == 0)
            $display(" ✅ ALL TESTS PASSED: ALU verified correct\n");
        else
            $fatal(1, " ❌ %0d FAILURE(S)\n", fail_cnt);  // SystemVerilog severity task
        #20;
        $finish;
    end

    // ── Continuous monitor ─────────────────────────────────────
    initial
        $monitor(" @%0t op=%3b a=%02h b=%02h → res=%02h Z=%b C=%b N=%b V=%b",
                 $time, op, a, b, beh_result, beh_z, beh_c, beh_n, beh_v);

endmodule

`default_nettype wire
Test Vector Coverage
| Section | Vectors | What it covers |
|---|---|---|
| ADD corner cases | 7 | Zero result, carry-out, signed overflow (pos+pos=neg, neg+neg=pos), max values |
| SUB corner cases | 6 | Zero result, borrow, signed overflow (neg-pos, pos-neg), equal operands |
| AND corner cases | 4 | Mask to 0 (complementary), mask to all-ones, zero detection |
| OR corner cases | 3 | Zero result, complement-OR to FF, negative flag |
| XOR corner cases | 3 | Same operands → 0, complement → FF, identity with 0 |
| NOT corner cases | 4 | NOT of 0 → FF, NOT of FF → 0, bitwise complement patterns |
| SHL corner cases | 4 | MSB into carry, shift to zero, alternating bit patterns, MSB result |
| SHR corner cases | 4 | MSB fill-0, LSB into carry, shift to zero, all-ones input |
| Random sweep | 256 (32×8) | Broad operand coverage across all 8 operations |
| Total | 291 | Result, zero, carry, negative, overflow flags all checked every test |
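The 291 vectors above sample the input space. Because the ALU has only 19 input bits (3 opcode bits plus two 8-bit operands), a full sweep of 2^19 = 524,288 combinations is also practical. The sketch below is one possible way to do that and is not part of the original testbench: the module name alu_8bit_exhaustive_tb is hypothetical, and it only cross-checks the two implementations against each other, so it would not catch a bug shared by both.

```verilog
`timescale 1ns/1ps
// Hypothetical exhaustive cross-check, compiled together with
// alu_8bit.v and alu_8bit_dataflow.v from this article.
module alu_8bit_exhaustive_tb;
    reg  [7:0] a, b;
    reg  [2:0] op;
    wire [7:0] beh_result, df_result;
    wire       beh_z, beh_c, beh_n, beh_v;
    wire       df_z,  df_c,  df_n,  df_v;

    alu_8bit dut_beh (
        .a(a), .b(b), .op(op),
        .result(beh_result), .zero(beh_z), .carry(beh_c),
        .negative(beh_n), .overflow(beh_v)
    );
    alu_8bit_dataflow dut_df (
        .a(a), .b(b), .op(op),
        .result(df_result), .zero(df_z), .carry(df_c),
        .negative(df_n), .overflow(df_v)
    );

    integer i;           // sweep index, low 19 bits map to {op, a, b}
    integer mismatches;

    initial begin
        mismatches = 0;
        for (i = 0; i < (1 << 19); i = i + 1) begin
            {op, a, b} = i[18:0];
            #1;  // let the combinational outputs settle
            if ({beh_result, beh_z, beh_c, beh_n, beh_v} !==
                {df_result,  df_z,  df_c,  df_n,  df_v}) begin
                mismatches = mismatches + 1;
                if (mismatches <= 10)  // cap the log output
                    $display("MISMATCH op=%b a=%h b=%h", op, a, b);
            end
        end
        $display("Exhaustive sweep: %0d mismatches in %0d vectors", mismatches, i);
        $finish;
    end
endmodule
```

Printing only mismatches keeps the log short; the per-vector PASS lines used by the directed tests would produce over half a million lines here.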
📈 Simulation Waveform
The waveform shows one representative test vector per operation, cycling through all 8 opcodes. Overflow can assert only for ADD and SUB, and carry only for ADD, SUB, SHL, and SHR; zero asserts whenever the result is all zeros.
💻 Simulation Console Output
How to Run
# ── Icarus Verilog ────────────────────────────────────────────
# Sources: behavioral ALU, data flow ALU, testbench.
# -g2012 enables SystemVerilog constructs such as $fatal.
iverilog -g2012 -o alu_sim \
    alu_8bit.v \
    alu_8bit_dataflow.v \
    alu_8bit_tb.v
vvp alu_sim
gtkwave alu_8bit.vcd

# ── ModelSim ──────────────────────────────────────────────────
# -sv compiles the .v files in SystemVerilog mode.
vlog -sv alu_8bit.v alu_8bit_dataflow.v alu_8bit_tb.v
vsim -c alu_8bit_tb -do "run -all; quit -f"
🔬 Design Analysis
Implementation Comparison
| Aspect | Behavioral (case) | Data Flow (parallel) |
|---|---|---|
| Style | Procedural — always @(*) | Continuous — all-parallel assign |
| Readability | Excellent — one case per op | Good — explicit parallelism |
| Synthesis output | Functionally equivalent; netlists usually converge after optimization | Functionally equivalent; netlists usually converge after optimization |
| Power (pre-clock-gating) | Same — all paths active | Same — all paths active |
| Simulation style | Sequential evaluation | Event-driven parallel |
| Best for | RTL design, FSMs, control | Teaching hardware parallelism |
Signed Overflow Detection — Visual Explanation
⚠️ ADD Overflow: same-sign inputs, opposite-sign result
// pos + pos = neg (impossible in signed 8-bit)
0x7F + 0x01 = 0x80   // +127 + 1 = -128?  V=1!
// neg + neg = pos (impossible in signed 8-bit)
0x80 + 0x80 = 0x00   // -128 + -128 = 0?  V=1! (carry too)
// Detection:
//   V = (~a[7] & ~b[7] &  result[7])   ← pos+pos=neg
//     | ( a[7] &  b[7] & ~result[7])   ← neg+neg=pos
🟣 SUB Overflow: opposite-sign inputs, result sign differs from A
// pos - neg = neg (should be positive)
0x7F - 0xFF = 0x80   // +127-(-1) → wraps to -128.  V=1!
// neg - pos = pos (should be negative)
0x80 - 0x01 = 0x7F   // -128-1 → wraps to +127.  V=1!
// Detection:
//   V = (~a[7] &  b[7] &  result[7])   ← pos-neg=neg
//     | ( a[7] & ~b[7] & ~result[7])   ← neg-pos=pos
Extending the ALU
Replace the fixed [7:0] widths with a parameter WIDTH, and the 8-bit ALU becomes a 16-bit or 32-bit ALU by changing a single value: declare parameter WIDTH = 8; then use [WIDTH-1:0] throughout. The overflow detection logic using a[7], b[7], result[7] generalises to a[WIDTH-1], b[WIDTH-1], result[WIDTH-1], because the sign bit is always the MSB regardless of width. The 9-bit full_result and the shift concatenations need the same treatment: [WIDTH:0], a[WIDTH-2:0], and a[WIDTH-1:1].
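A parameterized version of the behavioral model might look like the sketch below. The module name alu_param, the MSB localparam, and the shortened internal name full are assumptions made for illustration; the opcode encoding and flag logic follow alu_8bit, with the overflow terms written in the equivalent sign-comparison form. It assumes WIDTH is at least 2, since the shift concatenations need a[MSB-1:0].

```verilog
`timescale 1ns/1ps
// Hypothetical width-parameterized variant of alu_8bit (requires WIDTH >= 2).
module alu_param #(
    parameter WIDTH = 8
) (
    input  wire [WIDTH-1:0] a,
    input  wire [WIDTH-1:0] b,
    input  wire [2:0]       op,
    output reg  [WIDTH-1:0] result,
    output reg              zero, carry, negative, overflow
);
    localparam MSB = WIDTH - 1;
    reg [WIDTH:0] full;  // one extra bit captures carry/borrow

    always @(*) begin
        full     = {(WIDTH+1){1'b0}};
        carry    = 1'b0;
        overflow = 1'b0;
        case (op)
            3'b000: begin  // ADD
                full     = {1'b0, a} + {1'b0, b};
                carry    = full[WIDTH];
                overflow = (a[MSB] == b[MSB]) && (full[MSB] != a[MSB]);
            end
            3'b001: begin  // SUB
                full     = {1'b0, a} - {1'b0, b};
                carry    = full[WIDTH];
                overflow = (a[MSB] != b[MSB]) && (full[MSB] != a[MSB]);
            end
            3'b010: full[MSB:0] = a & b;
            3'b011: full[MSB:0] = a | b;
            3'b100: full[MSB:0] = a ^ b;
            3'b101: full[MSB:0] = ~a;
            3'b110: begin carry = a[MSB]; full[MSB:0] = {a[MSB-1:0], 1'b0}; end
            3'b111: begin carry = a[0];   full[MSB:0] = {1'b0, a[MSB:1]};   end
            default: full = {(WIDTH+1){1'bx}};
        endcase
        result   = full[MSB:0];
        zero     = (result == {WIDTH{1'b0}});
        negative = result[MSB];
    end
endmodule
```

A 16-bit instance would then be declared as alu_param #(.WIDTH(16)) with the same port connections as alu_8bit, only with wider operand and result buses.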
