How the CPU executes instructions one at a time, what an interrupt is and why it dramatically improves efficiency, instruction cycle state diagrams with and without interrupts, and how multiple interrupts are handled through sequential and nested schemes.
A CPU executes programs by repeating one elementary operation over and over, billions of times per second. That operation is called the instruction cycle: the complete sequence of steps required to fetch one instruction from memory and execute it.
In its simplest form, the instruction cycle has exactly two steps:
Fetch. The CPU reads the next instruction from the memory location pointed to by the Program Counter (PC). The instruction is loaded into the Instruction Register (IR), and PC is incremented to point to the following instruction.
Execute. The CPU interprets the instruction in IR and performs the required operation: an arithmetic calculation, a memory read/write, a branch, or an I/O command. This may itself involve multiple sub-steps.
This two-step loop repeats continuously until the machine is turned off, a halt instruction is encountered, or an unrecoverable error occurs.
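The two-step loop can be sketched as a tiny interpreter. This is a minimal illustration, not a model of any real CPU: the opcodes `LOADI`, `ADDI`, and `HLT`, the single accumulator, and the tuple encoding are all invented for the example.

```python
# Toy machine: each instruction is a tuple (opcode, operand).
# The opcodes and the encoding are hypothetical, chosen only for illustration.
def run(program):
    pc = 0       # Program Counter: index of the next instruction
    acc = 0      # a single accumulator register
    cycles = 0   # number of completed instruction cycles
    while True:
        # Fetch: read the instruction at PC into IR, then advance PC.
        ir = program[pc]
        pc += 1
        # Execute: interpret the instruction held in IR.
        opcode, operand = ir
        cycles += 1
        if opcode == "LOADI":
            acc = operand
        elif opcode == "ADDI":
            acc += operand
        elif opcode == "HLT":
            break   # halt instruction ends the loop
        else:
            raise ValueError(f"illegal instruction: {opcode}")
    return acc, cycles

result, cycles = run([("LOADI", 5), ("ADDI", 7), ("HLT", None)])
print(result, cycles)
```

The loop exits only on `HLT` or on an error, mirroring the termination conditions listed above.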
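The fetch cycle involves a precise sequence of register transfers. At the micro-architecture level, the address in PC is copied to MAR, the addressed word arrives in MBR, PC is incremented, and MBR is copied to IR. The registers involved are: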
| Register | Name | Role in fetch cycle |
|---|---|---|
| PC | Program Counter | Holds the address of the next instruction. Updated after every fetch. |
| MAR | Memory Address Register | Holds the address sent out on the address bus. Loaded from PC at start of fetch. |
| MBR | Memory Buffer Register | Receives the instruction word returned from memory on the data bus. |
| IR | Instruction Register | Holds the current instruction being decoded and executed. Loaded from MBR. |
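The roles in the table can be written out as a sequence of register transfers. The sketch below assumes a word-addressed memory (so PC advances by 1) and represents memory as a Python dictionary; both are simplifications made for the example.

```python
def fetch(regs, memory):
    """Perform one fetch cycle as a sequence of register transfers."""
    regs["MAR"] = regs["PC"]            # MAR <- PC (address goes out on the address bus)
    regs["MBR"] = memory[regs["MAR"]]   # MBR <- memory[MAR] (word returns on the data bus)
    regs["PC"] += 1                     # PC <- PC + 1 (word-addressed memory assumed)
    regs["IR"] = regs["MBR"]            # IR <- MBR (instruction ready for decode)
    return regs

# Hypothetical memory contents: instruction text stands in for encoded words.
memory = {0x100: "ADD R1, R2, R3", 0x101: "JMP 0x200"}
regs = {"PC": 0x100, "MAR": None, "MBR": None, "IR": None}

fetch(regs, memory)
print(regs)
```

After one call, IR holds the instruction that was at 0x100 and PC already points at 0x101, which is why a branch has to overwrite PC during the execute cycle.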
The execute cycle’s behaviour depends entirely on which instruction is in IR. The four broad categories of operations:
| Execute cycle category | What happens | Example instruction |
|---|---|---|
| Processor-Memory | Data transferred between CPU and main memory. MAR carries the address; MBR carries the data. | LOAD R1, [0x100]: read memory address 0x100 into register R1 |
| Processor-I/O | Data transferred between CPU and an I/O module. Address identifies the I/O port; data moves via MBR. | OUT 0x3F8, R2: write R2 to UART port 0x3F8 |
| Data Processing | Arithmetic or logical operation on register values. ALU performs the operation; result and flags updated. | ADD R1, R2, R3: R1 ← R2 + R3; update N, Z, C, V flags |
| Control | Alters the sequential flow of execution by writing a new value to PC. | JMP 0x200: PC ← 0x200, next fetch comes from 0x200 |
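The four categories can be illustrated with a small execute-stage dispatcher. This is a sketch under stated assumptions: instructions are pre-decoded tuples, registers are 32 bits wide, and only the Z flag is modelled (N, C, and V are omitted for brevity).

```python
def execute(instr, cpu, memory, ports):
    """Execute one pre-decoded instruction tuple (toy encoding, assumed)."""
    op, *args = instr
    if op == "LOAD":        # Processor-Memory: read a word into a register
        reg, addr = args
        cpu["regs"][reg] = memory[addr]
    elif op == "OUT":       # Processor-I/O: write a register to a port
        port, reg = args
        ports.setdefault(port, []).append(cpu["regs"][reg])
    elif op == "ADD":       # Data Processing: ALU operation plus flag update
        dst, a, b = args
        cpu["regs"][dst] = (cpu["regs"][a] + cpu["regs"][b]) & 0xFFFFFFFF
        cpu["Z"] = cpu["regs"][dst] == 0
    elif op == "JMP":       # Control: overwrite PC
        (target,) = args
        cpu["pc"] = target
    else:
        raise ValueError(f"illegal instruction: {op}")

cpu = {"pc": 0, "Z": False, "regs": {"R1": 0, "R2": 17, "R3": 25}}
memory = {0x100: 42}
ports = {}

execute(("LOAD", "R1", 0x100), cpu, memory, ports)
loaded = cpu["regs"]["R1"]
execute(("OUT", 0x3F8, "R2"), cpu, memory, ports)
execute(("ADD", "R1", "R2", "R3"), cpu, memory, ports)
execute(("JMP", 0x200), cpu, memory, ports)
print(loaded, ports, cpu)
```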
Instruction cycle without interrupts. The CPU loops continuously: Fetch → Decode → Execute → Fetch → … until a HLT instruction or unrecoverable error occurs. At 3 GHz with average CPI=1, this loop runs 3 billion times per second.
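The throughput figure follows from dividing the clock rate by the average cycles per instruction (CPI). A short check, with the second CPI value chosen arbitrarily for comparison:

```python
def instructions_per_second(clock_hz, cpi):
    """Average instruction throughput for a given clock rate and CPI."""
    return clock_hz / cpi

print(instructions_per_second(3e9, 1.0))   # 3 GHz at CPI = 1
print(instructions_per_second(3e9, 1.5))   # same clock, hypothetical CPI = 1.5
```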
Consider what happens without interrupts when the CPU needs to print a string to a printer:
This approach, called polling (or programmed I/O), wastes the CPU. For every useful instruction executed to set up the print, on the order of a million instruction cycles can be consumed in an idle loop checking whether the printer is ready.
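The cost of polling can be made concrete with a small cycle-counting model. The numbers are illustrative assumptions: one cycle to hand a character to the device, and a device that stays busy for a fixed number of cycles afterwards (scaled down here to 10,000 so the example runs quickly).

```python
def print_with_polling(text, device_busy_cycles):
    useful = 0   # cycles spent handing characters to the printer
    wasted = 0   # cycles spent spinning on the status check
    for _ch in text:
        useful += 1                 # write one character to the device
        busy = device_busy_cycles   # device is now busy for this long
        while busy > 0:             # poll: "is the printer ready yet?"
            wasted += 1
            busy -= 1
    return useful, wasted

useful, wasted = print_with_polling("HELLO", device_busy_cycles=10_000)
print(useful, wasted, wasted // useful)
```

The ratio of wasted to useful cycles equals the device's busy time, so a slower device makes polling proportionally worse.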
Without interrupts (top): the CPU polls and wastes millions of cycles between each I/O operation. With interrupts (bottom): the CPU initiates I/O, then continues executing useful work. When the I/O device asserts IRQ, the CPU briefly runs the ISR (purple), then resumes. I/O and CPU activity now overlap.
Interrupts arise from different sources, classified into four main categories:
| Class | Source | Examples | Typical handling |
|---|---|---|---|
| Program | CPU executing an instruction | Arithmetic overflow, division by zero, illegal instruction, memory protection violation | OS exception handler: may terminate the process or deliver a signal (e.g. SIGFPE) |
| Timer | Hardware timer inside the CPU or on the board | OS scheduler tick (every 1–10 ms), watchdog timer expiry, real-time clock | OS scheduler: preempts current thread, runs next in queue (enables multitasking) |
| I/O | I/O controller (printer, disk, network, keyboard) | Printer ready for next character, disk sector read complete, keyboard key pressed, Ethernet packet arrived | Device ISR: reads/writes data, may wake a blocked process, acknowledges the device |
| Hardware Failure | System hardware detecting a fault | Power failure (NMI), memory parity error, bus error, temperature warning | Emergency handler: save state to stable storage, attempt graceful shutdown or recovery |
When an interrupt request arrives, the CPU handles it by executing an additional interrupt cycle at the end of the current execute cycle. The following scenario traces the steps:
Scenario: User program is executing at PC=0x1050, R1=42, R2=17, FLAGS=0x04. An IRQ from the UART controller fires.
CPU interrupt cycle actions:
PUSH PC → stack: saves 0x1050
PUSH PSW → stack: saves FLAGS=0x04 (and mode bits)
PC ← IRQ_vector[UART] = 0x2000 (loaded from interrupt vector table)
[interrupts may be auto-disabled during ISR execution]
ISR at 0x2000 runs: reads UART data register, stores byte, clears interrupt flag in UART controller.
At ISR end, RTI instruction:
POP PSW → FLAGS restored to 0x04
POP PC β PC restored to 0x1050
[interrupts re-enabled]
Result: User program resumes at 0x1050 with R1=42, R2=17, FLAGS=0x04, exactly as if no interrupt had occurred.
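The scenario can be replayed in a few lines to confirm that the push and pop order restores the original state. This is a sketch, not a model of a specific processor: the dictionary-based CPU, the `IE` (interrupt enable) field, and the function names are assumptions made for the example.

```python
def interrupt_entry(cpu, stack, vector_table, irq):
    """Interrupt cycle: save context, then jump to the handler."""
    stack.append(cpu["PC"])          # PUSH PC
    stack.append(cpu["FLAGS"])       # PUSH PSW
    cpu["PC"] = vector_table[irq]    # PC <- vector table entry
    cpu["IE"] = False                # interrupts auto-disabled (assumed behaviour)

def rti(cpu, stack):
    """Return from interrupt: restore context in reverse (LIFO) order."""
    cpu["FLAGS"] = stack.pop()       # POP PSW
    cpu["PC"] = stack.pop()          # POP PC
    cpu["IE"] = True                 # interrupts re-enabled

cpu = {"PC": 0x1050, "R1": 42, "R2": 17, "FLAGS": 0x04, "IE": True}
stack = []
vector_table = {"UART": 0x2000}

interrupt_entry(cpu, stack, vector_table, "UART")
pc_in_isr = cpu["PC"]    # the handler address the CPU jumped to
cpu["FLAGS"] = 0x00      # stand-in for an ISR body that changes the flags
rti(cpu, stack)

print(hex(pc_in_isr), cpu, stack)
```

R1 and R2 survive here only because the stand-in ISR never touches them; a real ISR that uses general-purpose registers has to save and restore them itself.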
Instruction cycle with interrupt cycle. After every Execute state, the CPU checks for pending interrupt requests (IRQ? diamond). If none, it loops back to Fetch as before. If an IRQ is pending and interrupts are enabled, the CPU enters the Interrupt state (saves context, runs the ISR, restores context), then returns to Fetch.
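The state diagram can be approximated by adding one check to the fetch-execute loop. In this sketch the program is a list of instruction labels, `irq_after` is a hypothetical way of saying "an IRQ becomes pending while this instruction executes", and the ISR is reduced to a single trace entry.

```python
def run_with_interrupts(program, irq_after, interrupts_enabled=True):
    """Return the sequence of states visited.

    program   : list of instruction labels
    irq_after : set of instruction indices during which an IRQ arrives
    """
    trace = []
    pc = 0
    pending = False
    while pc < len(program):
        instr = program[pc]             # Fetch
        pc += 1
        trace.append(f"exec {instr}")   # Execute
        if pc - 1 in irq_after:
            pending = True              # device raised its request line
        # Interrupt check happens only at the end of the execute cycle.
        if pending and interrupts_enabled:
            trace.append("ISR")         # save context, run handler, restore
            pending = False
    return trace

trace = run_with_interrupts(["A1", "A2", "A3"], irq_after={1})
masked = run_with_interrupts(["A1", "A2", "A3"], irq_after={1},
                             interrupts_enabled=False)
print(trace)
print(masked)
```

With interrupts disabled the request stays pending and the program runs straight through, which is the behaviour the sequential scheme relies on while an ISR is active.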
Transfer of control timeline. While the user program runs instructions A3–A5, the I/O device operates concurrently. When the device asserts IRQ, the CPU completes its current instruction then enters the interrupt cycle. The ISR runs, services the device, and RTI restores context. The user program resumes at A6 with no awareness of the interruption.
What happens when a second interrupt arrives while the CPU is already handling a first interrupt? There are two approaches:
Sequential interrupts (top): IRQ Y arrives during ISR X but must wait until ISR X completes and re-enables interrupts. Nested interrupts (bottom): IRQ Y (high priority) immediately preempts ISR X. ISR Y runs, returns, then ISR X resumes. The stack holds nested context frames, which is why the processor stack is critical hardware infrastructure, not just a software abstraction.
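The difference between the two schemes shows up clearly in a tick-by-tick simulation. The model below is a simplification built on assumptions: time advances in whole ticks, each IRQ is a tuple `(arrival_tick, name, priority, duration)`, a larger priority number means more urgent, and context-switch overhead is ignored.

```python
def simulate(irqs, nested):
    """Return an event log for the given IRQs under one handling scheme."""
    arrivals = sorted(irqs)
    pending = []   # arrived but not yet started: (priority, name, duration)
    stack = []     # active ISRs, top of stack is running: [name, priority, remaining]
    log = []
    tick = 0
    while arrivals or pending or stack:
        # Latch any IRQ that arrives at this tick.
        while arrivals and arrivals[0][0] == tick:
            _, name, prio, dur = arrivals.pop(0)
            pending.append((prio, name, dur))
        # Decide whether the most urgent pending IRQ may start now.
        if pending:
            best = max(pending)
            idle = not stack
            preempt = nested and stack and best[0] > stack[-1][1]
            if idle or preempt:
                pending.remove(best)
                log.append(f"t={tick} start {best[1]}")
                stack.append([best[1], best[0], best[2]])
        # Run the ISR on top of the stack for one tick.
        if stack:
            stack[-1][2] -= 1
            if stack[-1][2] == 0:
                log.append(f"t={tick + 1} end {stack[-1][0]}")
                stack.pop()
        tick += 1
    return log

irqs = [(0, "X", 1, 4), (1, "Y", 2, 2)]   # Y is more urgent and arrives during X
sequential_log = simulate(irqs, nested=False)
nested_log = simulate(irqs, nested=True)
print(sequential_log)
print(nested_log)
```

Both schemes finish at the same tick because the total work is identical; what changes is how long the urgent request Y waits before it is serviced.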
Most Arm Cortex-A based SoCs (smartphone application processors, server CPUs) implement the Generic Interrupt Controller (GIC) specification; Cortex-M microcontrollers use the Nested Vectored Interrupt Controller (NVIC) instead. GICv3/v4 numbers its standard interrupts with IDs 0 to 1019: 16 software-generated interrupts (SGIs) and 16 private peripheral interrupts (PPIs) per CPU core, plus up to 988 shared peripheral interrupts (SPIs). Each interrupt has a configurable priority (an 8-bit field, 0–255, where lower values are more urgent), an edge or level trigger type, and a routing target that selects which core receives it. When you work on SoC integration or verification, you configure the GIC register map, verify that each peripheral’s IRQ line reaches the correct GIC SPI input, and write testcases that fire interrupts and confirm the correct ISR vector is invoked.
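For verification planning it can help to have a small reference model of interrupt ID classes and priority arbitration. The sketch below is a toy model, not the GIC register interface: it uses the architectural ID ranges (SGIs 0 to 15, PPIs 16 to 31, SPIs 32 to 1019) and the rule that a lower priority value is more urgent, while the tie-break by lowest ID and the function names are assumptions made for the example.

```python
def classify(intid):
    """Map an interrupt ID to its class (special and extended ranges not modelled)."""
    if 0 <= intid <= 15:
        return "SGI"
    if 16 <= intid <= 31:
        return "PPI"
    if 32 <= intid <= 1019:
        return "SPI"
    return "other"

def highest_priority_pending(pending, priority_mask=0xFF):
    """Pick the interrupt to signal next.

    pending       : dict of interrupt ID -> priority value (0 is most urgent)
    priority_mask : only interrupts with a value below the mask are eligible
    """
    eligible = {i: p for i, p in pending.items() if p < priority_mask}
    if not eligible:
        return None
    # Lowest priority value wins; ties go to the lowest ID (modelling choice).
    return min(eligible, key=lambda i: (eligible[i], i))

pending = {33: 0xA0, 27: 0x80, 40: 0x80}
winner = highest_priority_pending(pending)
blocked = highest_priority_pending(pending, priority_mask=0x80)
print(classify(33), classify(27), winner, blocked)
```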
The Fetch-Decode-Execute cycle maps directly to pipeline stages in a modern CPU. A 5-stage pipeline (Fetch, Decode, Execute, Memory, Writeback) runs five instructions simultaneously, one in each stage. Before entering the ISR, a pipelined CPU must resolve every in-flight instruction: older instructions are allowed to complete, and younger ones are flushed so they can be re-fetched after the ISR returns. This is the hardware equivalent of completing the current instruction before taking an interrupt, which is exactly what the state diagram shows.
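A small model makes the pipeline behaviour at an interrupt visible. It assumes an idealised 5-stage pipeline with no stalls, and a simple policy in which only instructions that have reached Writeback count as complete while everything younger is squashed; real processors choose the boundary in more varied ways.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def run_until_interrupt(program, irq_cycle):
    """Clock the pipeline for irq_cycle cycles, then take an interrupt.

    Returns (retired, squashed, resume_at):
      retired   : instructions that completed Writeback
      squashed  : in-flight instructions discarded, oldest first
      resume_at : instruction to re-fetch after the ISR returns
    """
    pipeline = [None] * len(STAGES)   # index 0 is IF, index 4 is WB
    pc = 0
    retired = []
    for _cycle in range(irq_cycle):
        fetched = program[pc] if pc < len(program) else None
        if fetched is not None:
            pc += 1
        pipeline = [fetched] + pipeline[:-1]   # every instruction moves one stage
        if pipeline[-1] is not None:
            retired.append(pipeline[-1])       # instruction in WB completes
    # Everything younger than the WB stage is discarded.
    squashed = [i for i in reversed(pipeline[:-1]) if i is not None]
    resume_at = squashed[0] if squashed else None
    return retired, squashed, resume_at

program = [f"I{n}" for n in range(1, 9)]
retired, squashed, resume_at = run_until_interrupt(program, irq_cycle=6)
print(retired, squashed, resume_at)
```

The saved return address is that of the oldest squashed instruction, so no instruction is lost or executed twice.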
This article introduced interrupt-driven I/O as an improvement over polling. But even interrupt-driven I/O requires the CPU to execute ISR code for every chunk of data transferred. The next step, covered in CA-09, is DMA (Direct Memory Access): a dedicated DMA controller takes over the data transfer entirely, moving data from I/O device to memory without CPU involvement. The CPU only receives a single interrupt when the entire transfer is complete.
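The saving can be estimated by counting interrupts. The function below is a back-of-the-envelope sketch: it assumes interrupt-driven I/O raises one interrupt per chunk and DMA raises exactly one at completion, and the transfer and chunk sizes are arbitrary example values.

```python
def interrupts_needed(total_bytes, chunk_bytes, use_dma):
    """Number of interrupts the CPU services for one transfer."""
    if use_dma:
        return 1   # single completion interrupt
    # One interrupt per chunk, rounding up for a partial final chunk.
    return (total_bytes + chunk_bytes - 1) // chunk_bytes

per_byte = interrupts_needed(4096, 1, use_dma=False)
per_fifo = interrupts_needed(4096, 16, use_dma=False)
with_dma = interrupts_needed(4096, 1, use_dma=True)
print(per_byte, per_fifo, with_dma)
```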