CA-06: Internal Memory - RAM, ROM, DRAM, SRAM, Flash & Cache - VLSI Trainers
Computer Architecture · Article 6 of 12

CA-06: Internal Memory

Memory characteristics, access methods, SRAM vs DRAM, the complete ROM family (mask ROM, PROM, EPROM, EEPROM, Flash), and the memory hierarchy from registers to magnetic disk. How cache fits into the picture and why it exists.

Memory Characteristics Overview

A computer’s memory system is not a single uniform thing: it is a carefully engineered hierarchy of different storage technologies, each making different trade-offs between speed, capacity, cost, and persistence. Understanding these trade-offs is fundamental to understanding why computers are designed the way they are.

Every memory system can be characterised along seven dimensions:

Characteristic | Options
Location | CPU internal (registers), internal/main (RAM, ROM, cache), external/secondary (disk, tape, optical)
Capacity | Word size (natural unit of the CPU), number of addressable words or bytes
Unit of Transfer | Word (for main memory), block (e.g. a 512-byte disk sector for secondary storage, or a 64-byte cache line between cache and main memory)
Access Method | Sequential, direct, random, associative (see the Access Methods section)
Performance | Access time, cycle time, transfer rate
Physical Type | Semiconductor (SRAM, DRAM, Flash), magnetic surface (HDD, tape), optical (CD, DVD, Blu-ray)
Physical Characteristics | Volatile/non-volatile, erasable/non-erasable

Location & Capacity

Memory location

Capacity terminology

Worked Example: Address space calculation

Given: A CPU has a 24-bit address bus and 8-bit (byte) addressable memory.

Addressable locations: 2²⁴ = 16,777,216 bytes = 16 MB (16 MiB)

Given: A 16 Mbit DRAM chip, organised as 1M × 16-bit words.

Chip capacity: 1,048,576 locations × 16 bits = 16,777,216 bits = 16 Mbits ✓

Address bits needed: log₂(1,048,576) = 20. DRAM uses row/column multiplexing, so the chip needs only 10 physical address pins: the 10 pins first carry the row address, then the same 10 pins are reused to carry the column address, giving 2¹⁰ × 2¹⁰ = 1M locations.
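The arithmetic in this worked example can be checked with a short script. This is a minimal sketch; the function names `addressable_bytes` and `dram_address_pins` are illustrative, not from any library, and the chip is assumed to have a power-of-two number of locations.

```python
def addressable_bytes(address_bits: int, bytes_per_location: int = 1) -> int:
    """Total bytes reachable with the given number of address bits."""
    return (2 ** address_bits) * bytes_per_location


def dram_address_pins(locations: int, multiplexed: bool = True) -> int:
    """Address pins for a chip with `locations` words (power of two assumed)."""
    bits = (locations - 1).bit_length()   # exact log2 for powers of two
    # With row/column multiplexing the same pins carry the row, then the column.
    return (bits + 1) // 2 if multiplexed else bits


# 24-bit address bus, byte-addressable memory
print(addressable_bytes(24) // 2**20, "MiB")
# 1M x 16-bit DRAM chip
print(1_048_576 * 16, "bits of capacity")
print(dram_address_pins(1_048_576, multiplexed=False), "address bits")
print(dram_address_pins(1_048_576), "multiplexed address pins")
```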

Access Methods

The most fundamental distinction between memory types is how data is accessed, that is, how the hardware reaches a specific stored value:

Figure 1: Four memory access methods compared
[Figure panels: Sequential (must scan from the start; variable access time, worst case; e.g. magnetic tape, punched paper tape) · Direct (seek to the vicinity, then scan; variable access time; e.g. HDD, optical disc) · Random (any address, any time; constant access time; e.g. SRAM, DRAM, ROM, Flash, registers) · Associative (search by content, not address, with all tags compared simultaneously; constant access time; e.g. cache tag array (CAM), TLB, set-associative cache)]

Four memory access methods. Sequential (tape) and direct (disk) have variable access times. Random access (DRAM, SRAM) achieves constant access time: any address is reached in the same time. Associative access (CAM) compares a tag against all stored values simultaneously and is used in cache tag arrays and TLBs.
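The difference between random and associative access can be sketched in a few lines. This is only a behavioural model under stated assumptions: `tag_array` and both function names are hypothetical, and the loop stands in for comparisons that a real CAM performs in parallel, so it models the result and not the timing.

```python
def random_access(memory: list, address: int):
    """Random access: the address selects the location directly."""
    return memory[address]


def associative_lookup(entries: list, tag: int):
    """Associative access: compare the tag against every stored tag.

    A hardware CAM performs all comparisons at once; this loop only
    reproduces the outcome (hit data or a miss).
    """
    for stored_tag, data in entries:
        if stored_tag == tag:
            return data   # hit
    return None           # miss


tag_array = [(0x1A, "line A"), (0x2B, "line B"), (0x3C, "line C")]
print(random_access([10, 20, 30], 1))
print(associative_lookup(tag_array, 0x1A))
print(associative_lookup(tag_array, 0x4D))
```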

Performance Parameters

Parameter | Definition | Applies to | Typical values
Access Time | Time from address presented to data available (RAM), or time to position the read/write head (non-RAM) | All memory types | Registers: <1 ns · L1 cache: 1–4 ns · DRAM: 50–100 ns · SSD: 50–100 µs · HDD: 5–15 ms
Cycle Time | Access time + recovery time before the next access can begin. For DRAM: includes precharge and refresh overhead. | Primarily random-access memory | DRAM cycle time ≈ 2× access time due to destructive read + restore
Transfer Rate | Rate at which data moves in or out. For RAM: 1/(cycle time). For non-RAM: T_N = T_A + N/R, where T_N is the average time to read or write N bits, T_A is the average access time, N is the number of bits, and R is the transfer rate in bits per second | All memory types | DDR5-6400: ~51 GB/s · SATA SSD: ~500 MB/s · HDD: ~200 MB/s
Why cycle time > access time for DRAM: DRAM cells store charge on capacitors. A read is destructive: the act of sensing the charge partially drains the capacitor. After every read, the controller must restore the charge. This restoration phase (precharge + restore) adds time after access before the next access can begin. SRAM uses flip-flops (non-destructive read) and has no restore overhead, so cycle time ≈ access time.
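The two transfer-rate formulas from the table can be evaluated directly. The sketch below assumes a hypothetical HDD (10 ms average access time, 200 MB/s sustained rate, one 4 KB block) and a hypothetical RAM with a 100 ns cycle time; the numbers are chosen from the typical ranges above purely for illustration.

```python
def transfer_time(access_time_s: float, n_bits: int, rate_bps: float) -> float:
    """Non-RAM transfer time: T_N = T_A + N / R."""
    return access_time_s + n_bits / rate_bps


def ram_transfer_rate(cycle_time_s: float) -> float:
    """RAM transfer rate in words per second: 1 / cycle time."""
    return 1.0 / cycle_time_s


# Hypothetical HDD: T_A = 10 ms, R = 200 MB/s, N = one 4 KB block
block_bits = 4096 * 8
hdd_time = transfer_time(10e-3, block_bits, 200e6 * 8)
print(f"HDD 4 KB read: {hdd_time * 1e3:.3f} ms (dominated by access time)")

# Hypothetical RAM with a 100 ns cycle time
print(f"RAM: {ram_transfer_rate(100e-9):.0f} words/s")
```

With these assumed values the positioning delay accounts for almost all of the HDD read time, which is why access time rather than bit rate dominates small disk transfers.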

Physical Types & Volatility

Property | Volatile | Non-Volatile
Definition | Information is lost when power is removed | Information persists without power
Semiconductor examples | SRAM, DRAM | ROM, PROM, EPROM, EEPROM, Flash, FeRAM
Other technology examples | none | Magnetic tape, HDD, optical disc
Typical use | Working memory (programs, data during execution) | Storage (firmware, OS, files, configuration)

Property | Erasable | Non-Erasable
Definition | Contents can be modified after manufacture | Contents are fixed at manufacture and cannot be changed
Examples | SRAM, DRAM, EEPROM, Flash | Mask ROM (traditional)
Implication | Software can be updated, bugs fixed in field | Any change requires replacing the chip

Semiconductor Memory: RAM

Despite the name, RAM (Random Access Memory) is used specifically to mean read-write, volatile semiconductor memory. Two types exist:

SRAM: Static RAM
  • Each bit stored as a bistable flip-flop (cross-coupled inverters)
  • Data held as long as power is applied; no refresh needed
  • Read is non-destructive: flip-flop state unchanged after read
  • Faster access time (1–10 ns)
  • Larger cell area: 6 transistors per bit
  • More expensive per bit
  • Lower density → used for small, fast cache memories
  • Digital storage (flip-flop is binary: cleanly 0 or 1)

DRAM: Dynamic RAM
  • Each bit stored as charge on a capacitor
  • Capacitor charge leaks → must be refreshed every ~64 ms
  • Read is destructive: capacitor partially discharged by sensing
  • Slower access (50–100 ns) and requires refresh overhead
  • Smaller cell area: 1 transistor + 1 capacitor per bit
  • Less expensive per bit → much higher density
  • Used for large main memory (GBs)
  • Analogue storage (charge level determines 0 or 1)
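To put rough numbers on the density and refresh trade-offs listed above, the following sketch counts devices per array and estimates how much time a DRAM spends refreshing. The 32 KB cache size, the 8192-row organisation, and the 100 ns per-row refresh time are illustrative assumptions, not figures from this article; only the 6T and 1T1C cell structures and the ~64 ms refresh interval come from the text.

```python
def sram_transistors(capacity_bytes: int) -> int:
    """6T cell: six transistors per stored bit (cell array only)."""
    return capacity_bytes * 8 * 6


def dram_devices(capacity_bytes: int) -> tuple[int, int]:
    """1T1C cell: one transistor and one capacitor per stored bit."""
    bits = capacity_bytes * 8
    return bits, bits   # (transistors, capacitors)


def refresh_overhead(rows: int, row_refresh_s: float,
                     interval_s: float = 64e-3) -> float:
    """Fraction of time spent refreshing if every row is refreshed once
    per interval (assumed values, see lead-in)."""
    return rows * row_refresh_s / interval_s


cache_bytes = 32 * 1024   # assumed 32 KB L1 cache
print("SRAM transistors:", sram_transistors(cache_bytes))
print("DRAM transistors, capacitors:", dram_devices(cache_bytes))
print(f"Refresh overhead: {refresh_overhead(8192, 100e-9):.2%}")
```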

SRAM vs DRAM: Side-by-Side

Figure 2: SRAM cell (6T flip-flop) vs DRAM cell (1T1C capacitor)
[Figure panels: SRAM cell with 6 transistors (6T), showing storage nodes Q and Q̄, word line WL, and bit lines BL and BL̄ · DRAM cell with 1 NMOS transistor + 1 capacitor (1T1C), showing word line, bit line, and stored charge Q leaking to GND, so the cell must be refreshed every ~64 ms]

SRAM cell (left): 6 transistors forming two cross-coupled inverters. Bistable: holds state indefinitely without refresh. BL and BL̄ are complementary bit lines; Word Line (WL) activates the access transistors for read/write. DRAM cell (right): 1 transistor + 1 capacitor. Charge on the capacitor represents the bit. Leakage requires periodic refresh. Read is destructive: charge must be restored after sensing.

Feature | SRAM | DRAM
Storage element | Flip-flop (cross-coupled inverters) | Capacitor + access transistor
Transistors per bit | 6 | 1 (+ 1 capacitor)
Refresh required? | No | Yes, every ~64 ms
Read destructive? | No | Yes, capacitor partially drained
Access time | 1–10 ns (faster) | 50–100 ns (slower)
Density | Low (large cell) | High (small cell)
Cost per bit | High | Low
Power | Low (static dissipation) | Higher (refresh + switching)
Primary use | Cache (L1/L2/L3), register files | Main memory (GB-scale)

ROM Family: Read-Only Memory Types

ROM (Read-Only Memory) is non-volatile semiconductor memory. The term covers a family of technologies ranging from mask-programmed (at manufacture) to electrically rewritable:

Figure 3: ROM family: programming method, erase method, and typical use
[Figure panels, all non-volatile semiconductor memory:
Mask ROM: programmed at manufacture, not erasable, cheapest in high volume. Uses: legacy BIOS, game cartridges, processor microcode, factory characterisation tables, high-volume fixed-content data.
PROM: programmable once by the user by blowing fuses/anti-fuses, irreversible. Uses: small-batch firmware, custom PLDs (now replaced by Flash/EEPROM). One-time programmable, so mistakes are permanent.
EPROM: erasable by UV light (≥20 min) through a quartz window on the chip package. Uses: development/prototyping, firmware testing. Reusable for up to ~1000 erase cycles, but needs an off-board UV process.
EEPROM / Flash: electrically erasable at byte or block level, in-system. Uses: modern UEFI/BIOS, SSD storage (NAND Flash), USB drives, SD cards, microcontroller program memory, OTA firmware updates.]

ROM family tree. All four types are non-volatile. Mask ROM is cheapest for high-volume but cannot be changed. PROM can be programmed once by the user. EPROM can be erased by UV light and reprogrammed. EEPROM/Flash can be erased and reprogrammed electrically, in-system, enabling firmware updates without removing the chip.

Type | Category | Programmed by | Erase method | Volatile?
RAM | Read-write | CPU (byte-level, in-system) | Electrically (byte-level) | Yes
Mask ROM | Read-only | Photolithography masks at fab | Not possible | No
PROM | Read-only after programming | PROM programmer (one time) | Not possible (fuse) | No
EPROM | Read-mostly | PROM programmer, multiple times | UV light (~20 min, whole chip) | No
Flash | Read-mostly | CPU / programmer (block-level) | Electrically (block/sector) | No
EEPROM | Read-mostly | CPU / programmer (byte-level) | Electrically (byte-level) | No
Flash vs EEPROM: Flash erases in blocks (sectors of 4 KB–64 KB) and is much denser and cheaper than EEPROM. EEPROM erases at individual byte level, which is more flexible but less dense. Modern microcontrollers (STM32, nRF52) use Flash for program storage (hundreds of KB to MBs) and EEPROM emulation in Flash for configuration data.
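The practical consequence of block-level erase can be illustrated with a toy model. This sketch assumes the behaviour typical of Flash devices: an erased byte reads 0xFF, and programming can only change bits from 1 to 0, so setting any bit back to 1 requires erasing the whole block. The class `FlashBlock`, its methods, and the 4 KB default block size are hypothetical and chosen only for illustration.

```python
ERASED = 0xFF   # assumed erased state: all bits 1


class FlashBlock:
    """Toy model of one Flash erase block (not a real device driver)."""

    def __init__(self, size: int = 4096):
        self.data = bytearray([ERASED] * size)
        self.erase_count = 0

    def erase(self) -> None:
        """Erase the whole block: every byte returns to 0xFF."""
        for i in range(len(self.data)):
            self.data[i] = ERASED
        self.erase_count += 1

    def program(self, offset: int, value: int) -> None:
        """Program one byte; only 1 -> 0 bit transitions are allowed."""
        if value & ~self.data[offset] & 0xFF:
            raise ValueError("cannot set bits 0 -> 1 without erasing the block")
        self.data[offset] = value

    def update_byte(self, offset: int, value: int) -> None:
        """Change one byte, falling back to read-erase-rewrite if needed."""
        try:
            self.program(offset, value)
        except ValueError:
            snapshot = bytes(self.data)          # save the block contents
            self.erase()                         # erase all bytes at once
            for i, old in enumerate(snapshot):   # write everything back
                self.program(i, value if i == offset else old)


blk = FlashBlock()
blk.update_byte(1, 0x55)   # clears bits only: no erase needed
blk.update_byte(0, 0x0F)   # clears bits only: no erase needed
blk.update_byte(0, 0xF0)   # needs bits set back to 1: whole block erased
print(hex(blk.data[0]), hex(blk.data[1]), blk.erase_count)
```

A byte-erasable EEPROM would handle the last update without touching its neighbours, which is the flexibility the comparison above refers to.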

Advanced DRAM: SDRAM, DDR, Burst Mode

Basic DRAM is asynchronous. Modern systems use Synchronous DRAM (SDRAM), which synchronises all operations to the system clock, enabling predictable timing and burst transfers:

DRAM type | Key advancement | Transfer rate
Basic DRAM | Asynchronous; CPU stalls waiting for data; no burst mode | ~100 MB/s
SDRAM | Synchronised to system clock; CPU knows when data arrives; burst mode | ~800 MB/s (PC100)
DDR SDRAM | Double Data Rate: transfers on both rising and falling clock edges | ~1.6 GB/s (DDR-200)
DDR4 | Lower voltage (1.2 V), higher density, higher frequency | ~17–25 GB/s
DDR5 | 64-bit channel split into two 32-bit sub-channels; on-die ECC; 1.1 V | ~25–51 GB/s
Burst mode: Once an initial address is presented, SDRAM can automatically increment the address and output consecutive words without new address cycles. A cache line fill (64 bytes = 8 × 64-bit words) with burst mode requires 1 address cycle + 8 data cycles = 9 cycles vs 16 cycles without burst (a 44% reduction).
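The burst-mode cycle count and the peak transfer rates in the table follow from simple arithmetic, sketched below. The cycle model is the simplified one used in the text (one address cycle plus one data cycle per word, ignoring real DRAM latencies), and the peak-bandwidth helper assumes a 64-bit (8-byte) data bus; both function names are illustrative.

```python
def fill_cycles(words: int, burst: bool) -> int:
    """Cycles to fetch `words` consecutive words in the simplified model:
    one address cycle per burst, or one address cycle per word without burst."""
    return 1 + words if burst else 2 * words


def peak_bandwidth_gb_s(transfers_per_s: float, bus_bytes: int = 8) -> float:
    """Peak bandwidth in GB/s for a given transfer rate and bus width."""
    return transfers_per_s * bus_bytes / 1e9


with_burst = fill_cycles(8, burst=True)
without_burst = fill_cycles(8, burst=False)
reduction = 1 - with_burst / without_burst
print(with_burst, without_burst, f"{reduction:.1%}")

for name, rate in [("PC100", 100e6), ("DDR-200", 200e6), ("DDR5-6400", 6400e6)]:
    print(name, peak_bandwidth_gb_s(rate), "GB/s")
```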

Memory Hierarchy & Cache

No single memory technology satisfies all requirements simultaneously. The solution is a memory hierarchy: multiple levels of storage, each faster but smaller and more expensive than the level below:

Figure 4: Memory hierarchy pyramid: speed, cost, and capacity
Level (top to bottom) | Access time | Capacity
Registers | <1 ns | ~KB
L1 Cache (SRAM) | 1–4 ns | 32–64 KB
L2 Cache (SRAM) | 4–10 ns | 256–512 KB
L3 Cache (SRAM, shared) | 10–40 ns | 4–32 MB
Main Memory (DRAM) | 50–100 ns | 4–128 GB
SSD / Flash Storage | 50–150 µs | 256 GB–4 TB
Magnetic Disk / Tape (secondary storage) | 5–15 ms | 1 TB–PB

Memory hierarchy pyramid. Speed decreases and capacity increases from top to bottom. The cache hierarchy bridges the gap: SRAM caches hold recently used data, so most accesses hit cache (fast) rather than going to DRAM (slow). A typical L1 cache hit rate is 90–99%.
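The effect of hit rate on overall speed is usually summarised by the average memory access time, AMAT = hit time + miss rate × miss penalty. The sketch below applies that standard formula to a single cache level in front of DRAM; the 2 ns hit time and 80 ns miss penalty are assumed values picked from the ranges in Figure 4, and real systems with L2/L3 caches would need a multi-level version.

```python
def amat_ns(hit_time_ns: float, hit_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time for one cache level in front of DRAM."""
    return hit_time_ns + (1.0 - hit_rate) * miss_penalty_ns


# Assumed values: 2 ns L1 hit time, 80 ns penalty for going to DRAM
for rate in (0.90, 0.95, 0.99):
    print(f"hit rate {rate:.0%}: AMAT = {amat_ns(2.0, rate, 80.0):.1f} ns")
```

Under these assumptions, moving from a 90% to a 99% hit rate cuts the average access time from about 10 ns to under 3 ns, which is why small improvements in hit rate matter so much.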

Why cache works: locality of reference

Cache line fetches exploit spatial locality by fetching 64 bytes at once even though only 4 bytes were requested, because the adjacent bytes will likely be needed next.
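A small simulation makes the benefit of 64-byte line fetches concrete. This is an idealised sketch: the cache is assumed to have unlimited capacity and no evictions, so it isolates the spatial-locality effect only, and the two access patterns (`sequential` and `strided`) are hypothetical workloads.

```python
LINE_BYTES = 64


def hit_rate(addresses: list, line_bytes: int = LINE_BYTES) -> float:
    """Fraction of accesses that find their cache line already fetched."""
    cached_lines = set()   # idealised: unlimited capacity, no evictions
    hits = 0
    for addr in addresses:
        line = addr // line_bytes
        if line in cached_lines:
            hits += 1
        else:
            cached_lines.add(line)   # miss: fetch the whole line
    return hits / len(addresses)


sequential = [4 * i for i in range(1024)]    # consecutive 4-byte words
strided = [64 * i for i in range(1024)]      # one word from each line

print("sequential:", hit_rate(sequential))
print("strided:   ", hit_rate(strided))
```

Sequential 4-byte accesses miss only on the first word of each line (1 in 16), while the strided pattern touches a new line every time and gains nothing from the larger fetch.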

Error Detection & Correction

Memory errors occur in two forms:

Hamming Error Correcting Code (ECC): Adds redundant check bits alongside data bits. For 8-bit data, 4 check bits allow correction of any single-bit error; adding a fifth, overall parity bit also allows detection (without correction) of any 2-bit error. Conventional 72-bit ECC memory modules store 8 check bits per 64-bit data word, and DDR5 additionally performs ECC on the DRAM die itself. ECC RAM is mandatory in servers, workstations, and safety-critical embedded systems.
Worked Example: Hamming distance and error detection

Principle: Hamming distance = number of bit positions where two codewords differ. To detect d errors, need Hamming distance ≥ d+1. To correct d errors, need distance ≥ 2d+1.

For SECDED (Single-Error Correction, Double-Error Detection): Need Hamming distance = 4. For 64 data bits, SECDED requires 8 check bits (72 bits total stored): 7 Hamming check bits plus 1 overall parity bit. Conventional 72-bit ECC memory modules use this word format.

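Check bit positions: Check bits occupy positions that are powers of 2 (1, 2, 4, 8, 16, 32, 64…). On a read, the check bits are recomputed from the data and XORed with the stored check bits; the result is the syndrome. A zero syndrome means no error was detected. A non-zero syndrome, read as a binary number, gives the exact position of the erroneous bit, enabling correction.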
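The check-bit placement and syndrome calculation can be sketched as a small single-error-correcting Hamming encoder and corrector. This is a minimal illustration, not a memory-controller implementation: it covers single-error correction only (the extra overall parity bit needed for SECDED is not modelled), codewords are stored as lists indexed from 1 so that list indices match bit positions, and all function names are hypothetical.

```python
def check_bits_needed(data_bits: int) -> int:
    """Smallest k with 2**k >= data_bits + k + 1 (single-error correction)."""
    k = 0
    while 2 ** k < data_bits + k + 1:
        k += 1
    return k


def compute_syndrome(word: list) -> int:
    """XOR of the positions of all 1 bits; zero for a valid codeword."""
    syndrome = 0
    for pos in range(1, len(word)):
        if word[pos]:
            syndrome ^= pos
    return syndrome


def encode(data: list) -> list:
    """Return a codeword indexed 1..n (index 0 unused)."""
    k = check_bits_needed(len(data))
    n = len(data) + k
    word = [0] * (n + 1)
    bits = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):            # not a power of two: data position
            word[pos] = next(bits)
    s = compute_syndrome(word)         # syndrome of the data bits alone
    for j in range(k):                 # set check bits so the total is zero
        word[1 << j] = (s >> j) & 1
    return word


def correct(word: list):
    """Return (syndrome, corrected copy) assuming at most one bit flipped."""
    s = compute_syndrome(word)
    fixed = list(word)
    if 0 < s < len(word):
        fixed[s] ^= 1                  # syndrome is the failing position
    return s, fixed


codeword = encode([1, 0, 1, 1, 0, 0, 1, 0])   # 8 data bits -> 12-bit codeword
damaged = list(codeword)
damaged[6] ^= 1                               # inject a single-bit error
syndrome, repaired = correct(damaged)
print(check_bits_needed(8), check_bits_needed(64), syndrome, repaired == codeword)
```

For 64 data bits the function reports 7 check bits; the eighth bit in a 72-bit SECDED word is the overall parity bit that this sketch leaves out.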

VLSI Connections

SRAM macros: the most common cell in every SoC

Every SoC contains dozens to hundreds of SRAM macros: register files, L1 instruction and data caches, L2 caches, shared L3 cache, TLB arrays, FIFO buffers, scratchpad memories. SRAM macros are generated by memory compilers (ARM SRAM Compiler, Faraday Memory, TSMC SRAM) that take capacity, word width, and read/write ports as inputs and produce verified GDS2, LEF, timing characterisation (.lib), and simulation models. During physical design, SRAM macros are hard IP blocks: their internal layout is fixed, and your job is to place them, manage power straps, and close timing on their input/output ports.

Flash in SoC: NOR Flash for code, NAND Flash for data

Embedded microcontrollers (ARM Cortex-M series) integrate NOR Flash directly on-die for program storage. NOR Flash allows random-access reads at byte granularity, so the CPU can execute code directly from NOR Flash (XIP, execute-in-place) without copying to RAM first. NAND Flash (the technology in SSDs and USB drives) only supports page-level random-access reads, so code stored in it must be copied to RAM before execution, but it achieves 10–100× higher density than NOR. When you do SoC integration on a Cortex-M design, you will connect the embedded NOR Flash macro to the instruction bus and the DRAM or SRAM to the data bus: a modified Harvard architecture in hardware.

ECC in VLSI: mandatory for safety-critical and server silicon

Every SRAM macro in a safety-critical SoC (automotive ASIL-D, avionics DO-254 Level A) must have ECC. The synthesised ECC logic (Hamming encoder on write, decoder + corrector on read) adds area, typically 12.5% overhead (8 ECC bits per 64 data bits). Automotive SoC designs for ASIL-D go further: SRAM is either implemented with ECC or with lockstep redundancy (two copies of the compute hardware running in parallel, outputs compared every cycle). ISO 26262 mandates ECC or lockstep for safety integrity.

Summary: CA-06 key points: Memory systems are characterised by location, capacity, unit of transfer, access method, performance, physical type, and volatility. Four access methods: sequential (tape), direct (disk), random (DRAM/SRAM, constant time), associative (CAM, parallel content search). Three performance parameters: access time, cycle time (≥ access time for DRAM), transfer rate. SRAM (6T flip-flop, fast, non-destructive, no refresh, expensive) is used for cache. DRAM (1T1C capacitor, slow, destructive read, needs refresh, cheap) is used for main memory. ROM family: Mask ROM (manufacture), PROM (one-time user), EPROM (UV erase), EEPROM (byte-level electrical erase), Flash (block-level electrical erase). Memory hierarchy exploits locality of reference. ECC detects and corrects single-bit errors using Hamming codes and is mandatory in servers and safety-critical systems.