Verilog Series · Module 05

Language Constructs and Conventions - VLSI Trainers

Language Constructs & Conventions

The building blocks of every Verilog source file: tokens, keywords, identifiers, white space, and comments, explained with examples.

Introduction: What Are Tokens?

A Verilog source file is read by the compiler character by character. The compiler groups those characters into meaningful units called lexical tokens, the smallest building blocks of the language, just as words are the smallest meaningful units in a sentence.

Understanding tokens is important because the Verilog compiler’s first job is lexical analysis: breaking your source file into a stream of tokens before it attempts to understand the grammar of the design.

Fig 1: How a Verilog source file is processed
Source file (.v / .sv) → Lexer (tokenise) → Parser (check grammar) → Elaboration & Simulation. Tokens are produced at the lexer stage.
White space and comments are recognized by the lexer but then discarded, so they never reach the parser. They serve only to separate tokens and to document code.

The 7 Token Types

Every character in a Verilog source file belongs to one of seven token categories:

  • Keywords: module, wire
  • Identifiers: clk, data_out
  • Numbers: 4'b1010, 8'hFF
  • Strings: "hello"
  • Operators: &, |, ~, +
  • Comments: // ... and /* ... */
  • White space: space, tab, \n

Fig 2: Tokens identified in a single line of Verilog
assign   y   =   a   &   b;
──────   ─   ─   ─   ─   ─
keyword  id  op  id  op  id
White space between tokens is discarded by the lexer; it exists only to separate tokens.
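
To see all seven categories in one place, here is a minimal sketch of a simulation-only module with each token type called out in a comment. The module name token_tour and the signal names are made up for illustration.

```verilog
// Comment token: read by the lexer, then thrown away
module token_tour;                  // keyword "module", identifier "token_tour"
  reg  [3:0] nibble;                // keyword "reg", numbers 3 and 0, identifier
  wire [3:0] inverted = ~nibble;    // operators "=" and "~"

  initial begin
    nibble = 4'b1010;               // number token (sized binary literal)
    // String token in double quotes; $display is a system task name
    #1 $display("nibble=%b inverted=%b", nibble, inverted);
  end
endmodule
```

The spaces, tabs, and newlines that lay this code out are the seventh category: they separate the other tokens and are then discarded.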

Keywords

Keywords are reserved words that have a fixed, predefined meaning in the Verilog language. They form the vocabulary of the language itself, and the compiler assigns special significance to each one.

  • Reserved: Keywords cannot be used as identifiers (signal names, module names, instance names). Doing so causes a compile error.
  • Always lowercase: All Verilog keywords are defined in lowercase only. MODULE and Module are not keywords; they would be treated as identifiers (though using them is confusing and discouraged).
  • Highlighted by editors: Most IDEs and text editors (VS Code, Vim, Emacs) color-code keywords automatically to help readability.

Keyword Categories

Verilog keywords can be grouped by purpose. Here are the most commonly used ones, organized into four categories (the full reserved-word list in the standard is longer):

Module & Port Declarations

module endmodule input output inout wire reg integer real time parameter localparam signed supply0 supply1 tri wand wor trireg assign

Control Flow

if else case casex casez endcase for while repeat forever disable begin end fork join generate endgenerate genvar

Procedural Blocks

always initial posedge negedge task endtask function endfunction automatic force release deassign

Gate Primitives

and nand or nor xor xnor not buf bufif0 bufif1 notif0 notif1 nmos pmos cmos tran tranif0 tranif1
Fig 3: Keywords in context
module counter (               // declaration keyword
  input      clk, rst,         // port keyword
  output reg [3:0] count       // declaration keywords
);
  always @(posedge clk) begin  // procedural keywords
    if (rst)                   // control flow keyword
      count <= 4'd0;
    else                       // control flow keyword
      count <= count + 1;
  end                          // control flow keyword
endmodule                      // declaration keyword
Never use a keyword as an identifier. Writing wire wire; or reg module; will cause a compile error. If you accidentally name a signal after a keyword, rename it (e.g., wire_data, mod_ctrl).

Identifiers

Identifiers are user-defined names given to modules, ports, signals, instances, tasks, functions, and parameters. They let you refer to design elements by meaningful names rather than anonymous positions.

A good identifier is descriptive, consistent, and unambiguous. Poor naming is one of the most common causes of bugs that are hard to find.

Fig 4: Where identifiers appear in a design
module uart_tx (              // module name     (identifier)
  input      clk,             // port name       (identifier)
  input      tx_data,         // port name       (identifier)
  output reg tx_out           // port name       (identifier)
);
  parameter BAUD_DIV = 868;   // parameter name  (identifier)
  reg [9:0] shift_reg;        // signal name     (identifier)
  integer   bit_count;        // variable name   (identifier)

  baud_gen u_baud (...);      // module & instance name (identifiers)
endmodule

Identifier Rules

Verilog identifiers must follow strict syntactic rules:

  • Must start with a letter (a-z, A-Z) or underscore (_). Digits (0-9) and the dollar sign ($) are not allowed as the first character.
  • Subsequent characters can be letters (a-z, A-Z), digits (0-9), underscore (_), or dollar sign ($). No spaces, hyphens, or other special characters are allowed.
  • No practical length limit: the standard requires tools to accept identifiers of at least 1024 characters, but keep them under 64 characters for readability and tool compatibility.
  • Case sensitive: clk, CLK, and Clk are three completely different identifiers. See the case sensitivity section below.
  • Cannot be a keyword: you cannot use wire, module, always, etc. as an identifier.
  • Escaped identifiers: any sequence of printable characters can be used as an identifier if preceded by a backslash (\) and terminated by white space. Neither the backslash nor the terminating white space is part of the name. Example: \add+1 . Rarely used in practice.
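
Escaped identifiers are easier to understand in use. The following is a small sketch; the module name escaped_id_demo and the wire names are invented for illustration. Note the space before every semicolon and operator: without it, the punctuation would be absorbed into the name.

```verilog
module escaped_id_demo;
  wire \add+1 ;            // the name is add+1; the trailing space ends it
  wire \bus[3] ;           // brackets are part of the name, not a bit-select
  wire \clk ;              // per the standard, the same identifier as plain clk

  assign \add+1  = 1'b1;
  assign \bus[3] = \add+1 ;
  assign clk     = \bus[3] ;   // refers to the wire declared above as \clk

  initial #1 $display("clk=%b", clk);
endmodule
```

In practice these names mostly show up in tool-generated netlists, where flattened hierarchy and bus bits are encoded into a single net name.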

Naming Conventions (Best Practices)

Fig 5: Recommended naming conventions
// Modules: lowercase with underscores (snake_case)
module uart_tx   fifo_ctrl   axi_master

// Parameters: UPPER_CASE (screaming snake case)
parameter DATA_WIDTH = 32;
parameter FIFO_DEPTH = 16;

// Clocks: clk prefix/suffix
clk   clk_50m   sys_clk   axi_clk

// Active-low resets: _n suffix
rst_n   reset_n   arst_n

// Active signals: descriptive, lowercase
tx_valid   rx_ready   data_in   addr_out

// Instance names: u_ prefix is a common industry convention
u_fifo   u_alu   u_uart   u0   u1

Valid vs Invalid Identifiers

| Identifier | Status | Reason |
|---|---|---|
| clock | VALID | Starts with a letter, contains only letters |
| _reset | VALID | Starts with an underscore, which is allowed |
| data_32b | VALID | Letters, digits, and underscores are all fine |
| tx$valid | VALID | $ is allowed after the first character |
| A1_out | VALID | Starts with a letter, alphanumeric rest |
| 1_name | INVALID | Starts with a digit, which is not allowed |
| $signal | INVALID | $ cannot be the first character |
| data out | INVALID | Space splits this into two separate tokens |
| data-bus | INVALID | Hyphen is not a valid identifier character |
| wire | INVALID | Reserved keyword, cannot be used as an identifier |
| MODULE | VALID | Technically valid (keywords are lowercase only), but very confusing. Avoid. |
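
One quick way to check these cases is to hand them to a compiler. The sketch below declares each valid name as a wire and leaves the invalid ones commented out; the module name identifier_demo is made up for illustration. Uncommenting any of the disabled lines should produce a syntax error.

```verilog
module identifier_demo;
  wire clock;
  wire _reset;
  wire data_32b;
  wire tx$valid;
  wire A1_out;
  wire MODULE;        // legal, because keywords are lowercase only (still avoid it)

  // wire 1_name;     // illegal: starts with a digit
  // wire $signal;    // illegal: $ as the first character
  // wire data out;   // illegal: the space makes two tokens
  // wire data-bus;   // illegal: the hyphen is parsed as a minus operator
  // wire wire;       // illegal: wire is a reserved keyword
endmodule
```
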
Fig 6: Identifier character rules visualized
First character: a-z, A-Z, _ (not 0-9, $, or space). Subsequent characters: a-z, A-Z, 0-9, _, $ (not space, -, +, @, #).

Case Sensitivity

Verilog is fully case-sensitive, exactly like C. The same sequence of letters written in different cases is treated by the compiler as completely independent identifiers.

Fig 7: Three different identifiers, same letters
clk ≠ CLK ≠ Clk

These are three entirely different signals as far as Verilog is concerned. If you declare wire clk and then reference CLK, the compiler will in most contexts report an undeclared identifier error. The exception is worse: in a module instance port connection, or on the left-hand side of a continuous assignment, Verilog silently creates an implicit 1-bit wire named CLK.

wire clk;         // declares a signal named "clk"
wire CLK;         // declares a DIFFERENT signal named "CLK"
wire Clk;         // yet another different signal named "Clk"

assign CLK = clk;  // legal: connecting two distinct signals
Common mistake: Declaring a port as input clk and later writing always @(posedge CLK). CLK is undeclared there, so compilation fails. The more dangerous variant is connecting .clk(CLK) on an instance: an implicit, undriven wire CLK is created without any error, and the instance's clock never toggles. Always use consistent casing. Most teams adopt an all-lowercase-with-underscores convention to avoid this entirely.
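
One common safeguard against casing slips is the `default_nettype none directive, which disables implicit net creation so that a misspelled or mis-cased net name becomes a compile error instead of a silent extra wire. Below is a minimal sketch; the module names dff_case_demo and dff_case_top are hypothetical.

```verilog
`default_nettype none          // no implicit nets from here on

module dff_case_demo (
  input  wire clk,             // with "none", port nets need an explicit type
  input  wire d,
  output reg  q
);
  always @(posedge clk)        // must match the declared spelling "clk"
    q <= d;
endmodule

module dff_case_top (
  input  wire clk,
  input  wire d,
  output wire q
);
  // With the default setting, a typo such as .clk(CLK) here would quietly
  // create an undriven 1-bit wire named CLK. With "none" it is an error.
  dff_case_demo u_dff (.clk(clk), .d(d), .q(q));
endmodule

`default_nettype wire          // restore the default for files compiled later
```

Restoring the default at the end of the file is a courtesy to other source files in the same compilation, since the directive stays in effect until it is changed again.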

Keywords are lowercase only

Since all Verilog keywords are defined in lowercase, MODULE, WIRE, and ALWAYS are not treated as keywords; they are parsed as identifiers. Using one as a signal name is technically valid but extremely confusing, and using one where a keyword is required is a syntax error.

module  my_block (...);  // ✅ correct keyword
MODULE  my_block (...);  // ❌ MODULE is an identifier, not a keyword, so this is a compile error

White Space Characters

White space in Verilog refers to any character that produces blank space in a file but carries no semantic meaning of its own. The Verilog compiler uses white space to separate tokens; once the tokens are identified, all white space is discarded. The exception is inside string literals, where spaces and tabs are significant.

  • Space: The regular space character (ASCII 32). Most common separator between tokens on a line.
  • Tab (\t): Horizontal tab (ASCII 9). Used for indentation. Treated identically to a space as a token separator.
  • Newline (\n): Line feed (ASCII 10). Ends a line. Verilog statements can span multiple lines; the newline is just white space.
  • Form feed (\f): Rarely used. Originally caused a page break on printers. Treated as white space by Verilog.
  • Carriage return (\r): Part of Windows line endings (CRLF). Treated as white space. Can cause issues with some Linux tools if not handled.
Fig 8: White space is flexible: all three are equivalent
// Compact: tokens separated by single spaces
assign y = a & b;

// Spread across lines: still the same statement
assign
  y
    =
      a & b;

// Extra spaces: still valid
assign   y   =   a   &   b   ;
White space inside tokens is not allowed. You cannot split a keyword, an identifier, or a run of digits with a space. 4'b10 10 is a syntax error because the space breaks the digit sequence, and mo dule is two identifiers, not the keyword module. The one relaxation is in sized literals, where the language permits white space between the size, the base (such as 'b), and the digits, so 4 'b 1010 is legal, though rarely written that way.
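
When the goal is simply to make a long literal easier to read, the underscore is the tool for the job: it is legal between digits and ignored in the value. A small sketch follows; the module and parameter names are chosen for illustration only.

```verilog
module literal_spacing_demo;
  localparam [7:0] PLAIN   = 8'b10101010;
  localparam [7:0] GROUPED = 8'b1010_1010;   // underscores do not change the value

  initial begin
    if (PLAIN === GROUPED)
      $display("literals match");
    else
      $display("literals differ");
  end
endmodule
```

An underscore cannot be the first character of the digit sequence, so 8'b_1010 is not a valid way to write this.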

White Space and Readability

While white space is semantically meaningless to the compiler, it is critically important to human readers. Well-spaced, consistently indented code is easier to read, review, and debug. Most teams enforce a style guide (e.g., 2-space or 4-space indentation, spaces around operators).

Fig 9: Same logic: unreadable vs readable
// ❌ Hard to read: no spacing or indentation
always@(posedge clk)begin if(rst) q<=0; else q<=d; end

// ✅ Readable: proper spacing and indentation
always @(posedge clk) begin
  if (rst)
    q <= 1'b0;
  else
    q <= d;
end

Comments

Comments are text in the source file that is completely ignored by the compiler. They exist purely for human readers: to explain intent, document assumptions, mark TODOs, or disable code temporarily.

Verilog supports two comment styles:

// Single-Line Comment

  • Begins with //
  • Extends to the end of that line only
  • No closing delimiter needed
  • Most common style for inline documentation

/* Multi-Line Comment */

  • Begins with /* and ends with */
  • Can span any number of lines
  • Cannot be nested inside another /* */
  • Used for file headers and block documentation
Fig 10: Comment styles with practical usage
// ─── File Header (multi-line comment) ───
/*
 * Module  : uart_tx
 * Author  : VLSI Trainers
 * Date    : 2026-05-14
 * Purpose : Serial UART transmitter, 8N1 format
 * Notes   : Baud rate configured via BAUD_DIV parameter
 */

module uart_tx #(
  parameter BAUD_DIV = 868    // 50 MHz / 57600 baud ≈ 868
) (
  input      clk,              // system clock, 50 MHz
  input      rst_n,            // active-low async reset
  input [7:0] tx_data,          // byte to transmit
  input      tx_start,         // pulse high to begin TX
  output reg tx_out,           // serial output line
  output reg tx_busy           // high while transmitting
);

  // TODO: add parity support (currently 8N1 only)
  // FIXME: baud counter rolls over incorrectly at reset

  always @(posedge clk) begin
    /* Main TX state machine
       States: IDLE → START → DATA[0..7] → STOP */
  end

endmodule

Cannot Nest Multi-Line Comments

Fig 11: Nested /* */ causes a compile error
/* outer comment start
     /* inner comment: this does NOT work! */   ← the comment ends here, at the first */
   outer comment end? */   ← parsed as code, so this line is a syntax error

// ✅ Correct way to comment out a block containing comments:
// Use // on each line, or use the `ifdef NEVER ... `endif trick
`ifdef NEVER
  /* old code with comments inside */
  assign x = y;
`endif
Cannot nest /* */ comments. The first */ encountered always ends the comment, even if it was meant to close an inner comment. If you need to comment out a block that already contains /* */ comments, use // on each line instead, or use the `ifdef NEVER ... `endif trick shown above, where NEVER is a macro name that is deliberately never defined.

Comments vs Compiler Directives

Comments are not the same as compiler directives (sometimes called preprocessor directives). Directives like `timescale, `include, `define, and `ifdef are active instructions to the compiler and are not ignored. They begin with a backtick (`), not // or /*.

// This is a comment: ignored by the compiler

`timescale 1ns/1ps   // This is a compiler directive: NOT ignored
`define WIDTH 8       // Defines a text macro: NOT ignored
Good commenting practice: Comment the why, not the what. The code itself already shows what is happening. A comment like // increment counter adds nothing. A comment like // saturate at max value to prevent wrap-around explains intent that isn’t obvious from the code.
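
As an illustration of a "why" comment, here is a sketch of a saturating counter. The module name sat_counter, its ports, and the stated downstream behaviour are all assumptions made for this example.

```verilog
module sat_counter #(
  parameter WIDTH = 4
) (
  input  wire             clk,
  input  wire             rst,    // synchronous, active-high
  input  wire             inc,    // count one event per clock when high
  output reg  [WIDTH-1:0] count
);
  always @(posedge clk) begin
    if (rst) begin
      count <= {WIDTH{1'b0}};
    end
    // Saturate at the maximum value instead of wrapping: downstream logic
    // (assumed here) reads zero as "no events seen", so a wrap to zero
    // would hide an overflow.
    else if (inc && count != {WIDTH{1'b1}}) begin
      count <= count + 1'b1;
    end
  end
endmodule
```

The comment says nothing about what the comparison does, since the code already shows that. It records the reason the comparison exists, which is the part a later reader cannot reconstruct from the code.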
