© copyright University of Crete, Greece.
Dept. of Computer Science, University of Crete.
CS-534: Packet Switch Architecture

3. Buffer Memory Technologies and Architectures


Subsections in the current document:

3.1 On-Chip Memories
3.2 Off-Chip Memory Technologies
3.3 Communicating across Clock Domains and Elastic Buffers
3.4 FIFO Buffer Memories
3.5 Multi-Queue Data Structures


3.1 On-Chip Memories


On-Chip SRAM

The plots below show examples of cost (area, power) and performance (cycle time) for on-chip static RAM blocks, as functions of capacity, number of ports, and port width. These examples are inspired by and representative of various real 0.35-micron CMOS technologies of about 1998, but are not identical to any single such technology.

On-chip SRAM area

On-chip SRAM cycle time and power consumption

Examples of Cost-Performance of On-Chip SRAM Systems


3.2 Off-Chip Memory Technologies


Static RAM (SRAM) Chip Example (1999)

Micron Technology Inc.

Example: Micron's 8 Mbit, pipelined, "zero-bus-turnaround (ZBT)" SRAM.


DRAM Basics: Row Address, Column Address, Precharge
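
As a rough illustration of these three steps, here is a C sketch of a single DRAM bank (the cycle counts are hypothetical, chosen only for illustration): an access that hits the currently open row needs only a column access, while an access to a different row must first precharge the bank and then activate the new row.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Simplified model of one DRAM bank: a row must be "activated"
     * (opened) before its columns can be read or written; switching
     * to another row first requires a "precharge" of the bank.
     * The cycle counts below are hypothetical, for illustration only. */

    #define T_PRECHARGE 3   /* cycles to close (precharge) the open row */
    #define T_ACTIVATE  3   /* cycles to open (activate) a new row      */
    #define T_COLUMN    1   /* cycles for a column read/write           */

    typedef struct {
        bool     row_open;
        uint32_t open_row;
    } dram_bank;

    /* Returns the number of cycles this access costs. */
    static int dram_access(dram_bank *b, uint32_t row, uint32_t col)
    {
        (void)col;                       /* column only selects data within the row */
        int cycles = 0;
        if (!b->row_open || b->open_row != row) {
            if (b->row_open)
                cycles += T_PRECHARGE;   /* close the previously open row */
            cycles += T_ACTIVATE;        /* open the requested row        */
            b->row_open = true;
            b->open_row = row;
        }
        return cycles + T_COLUMN;        /* column access hits the open row */
    }

    int main(void)
    {
        dram_bank b = { false, 0 };
        printf("first access to row 5 : %d cycles\n", dram_access(&b, 5, 0)); /* activate + column */
        printf("second access, row 5  : %d cycles\n", dram_access(&b, 5, 1)); /* row hit: column only */
        printf("access to row 9       : %d cycles\n", dram_access(&b, 9, 0)); /* precharge + activate + column */
        return 0;
    }

Accesses that stay within the open row pay only the column latency; changing rows pays the full precharge-plus-activate penalty.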

Synchronous Dynamic RAM (SDRAM) Chip Example

Micron Technology Inc.

Example: Micron's 256 Mbit SDRAMs.


Rambus: High-Throughput I/Os for DRAM

Rambus Inc.

For more information on Rambus, you can start with the Technology Overview (PDF) document from the Development Support - Getting Started section.


3.3 Communicating across Clock Domains and Elastic Buffers


The need for cross-clock-domain communication

Metastability, synchronization delay

Was the signal sampled before or after its change?

Asynchronous sampling of multibit signals (almost impossible)

Elastic Buffer (2-asynchronous-port SRAM)

Empty and Full flags: asynchronous to either clock

Almost-Empty and Almost-Full flags, for higher throughput

Implementing the Almost-Empty and Almost-Full flags
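
Below is a minimal C sketch of the flag logic, assuming a circular buffer addressed by free-running write and read counters; Empty, Full, Almost-Empty, and Almost-Full are derived from the occupancy count, with a hypothetical margin of 4 words for the "almost" flags. In a real elastic buffer, each side sees only a synchronization-delayed (typically Gray-coded) copy of the other side's counter, which this single-threaded sketch does not model; the "almost" margins are what let each side keep streaming safely while working with such stale information.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define EB_SIZE        16u   /* buffer depth (power of two), hypothetical   */
    #define ALMOST_MARGIN   4u   /* margin for the "almost" flags, hypothetical */

    typedef struct {
        uint32_t data[EB_SIZE];
        uint32_t wr_cnt;   /* total words ever written (free-running) */
        uint32_t rd_cnt;   /* total words ever read    (free-running) */
    } elastic_buf;

    static uint32_t eb_occupancy(const elastic_buf *b)
    {
        return b->wr_cnt - b->rd_cnt;   /* wraps correctly with unsigned arithmetic */
    }

    static bool eb_empty(const elastic_buf *b)        { return eb_occupancy(b) == 0; }
    static bool eb_full(const elastic_buf *b)         { return eb_occupancy(b) == EB_SIZE; }
    static bool eb_almost_empty(const elastic_buf *b) { return eb_occupancy(b) <= ALMOST_MARGIN; }
    static bool eb_almost_full(const elastic_buf *b)  { return eb_occupancy(b) >= EB_SIZE - ALMOST_MARGIN; }

    /* Writer side: refuses to write when Full. */
    static bool eb_write(elastic_buf *b, uint32_t w)
    {
        if (eb_full(b)) return false;
        b->data[b->wr_cnt % EB_SIZE] = w;
        b->wr_cnt++;
        return true;
    }

    /* Reader side: refuses to read when Empty. */
    static bool eb_read(elastic_buf *b, uint32_t *w)
    {
        if (eb_empty(b)) return false;
        *w = b->data[b->rd_cnt % EB_SIZE];
        b->rd_cnt++;
        return true;
    }

    int main(void)
    {
        elastic_buf b = { {0}, 0, 0 };
        uint32_t w;
        for (uint32_t i = 0; i < 14; i++) eb_write(&b, i);
        printf("occupancy=%u almost_full=%d full=%d\n",
               eb_occupancy(&b), eb_almost_full(&b), eb_full(&b));
        while (eb_read(&b, &w)) { /* drain */ }
        printf("empty=%d almost_empty=%d\n", eb_empty(&b), eb_almost_empty(&b));
        return 0;
    }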


3.4 FIFO Buffer Memories


Single FIFO Queue in a Memory Block: Circular Buffer

Counters and address decoders can be replaced by shift registers

Multiple FIFO Queues with Statically Partitioned Space for each
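
A minimal sketch of static partitioning, with hypothetical sizes: the buffer memory is divided into equal regions and queue i uses only region i as its private circular buffer, so space left idle by a lightly loaded queue cannot be lent to a heavily loaded one.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define NUM_QUEUES   4u                       /* hypothetical */
    #define TOTAL_WORDS  1024u                    /* whole buffer memory, hypothetical */
    #define REGION_WORDS (TOTAL_WORDS / NUM_QUEUES)

    static uint32_t mem[TOTAL_WORDS];             /* one shared memory block */

    typedef struct {
        uint32_t head;   /* read index within the queue's region  */
        uint32_t tail;   /* write index within the queue's region */
        uint32_t count;  /* current occupancy of this queue       */
    } fifo_state;

    static fifo_state q[NUM_QUEUES];

    static bool enqueue(uint32_t qi, uint32_t w)
    {
        if (q[qi].count == REGION_WORDS) return false;   /* this queue's region is full */
        mem[qi * REGION_WORDS + q[qi].tail] = w;         /* region base + offset        */
        q[qi].tail = (q[qi].tail + 1) % REGION_WORDS;
        q[qi].count++;
        return true;
    }

    static bool dequeue(uint32_t qi, uint32_t *w)
    {
        if (q[qi].count == 0) return false;              /* this queue is empty */
        *w = mem[qi * REGION_WORDS + q[qi].head];
        q[qi].head = (q[qi].head + 1) % REGION_WORDS;
        q[qi].count--;
        return true;
    }

    int main(void)
    {
        uint32_t w;
        enqueue(0, 111);                     /* queue 0 writes into its own region */
        enqueue(3, 222);                     /* queue 3 uses a separate region     */
        if (dequeue(0, &w)) printf("queue 0 -> %u\n", w);
        if (!dequeue(1, &w)) printf("queue 1 is empty\n");
        return 0;
    }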

Multiple FIFO Queues with Shared Space 1: shifting (impractical)

Multiple FIFO Queues with Shared Space 2: linked lists of blocks


3.5 Multi-Queue Data Structures


Multi-queue buffer memory using linked lists of memory blocks

Enqueue/Dequeue: basic cases

Enqueue/Dequeue: exceptional cases
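
The following C sketch is a software analogue of such a multi-queue organization (not the exact memory arrangement of the slides): each queue keeps Head and Tail block pointers, a per-block NxtPtr entry links the blocks of a queue, and a free list supplies and recycles blocks. The exceptional cases are enqueueing into an empty queue and dequeueing the last block, since both must update Head and Tail together.

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_BLOCKS  1024u
    #define NUM_QUEUES    64u
    #define NIL        0xFFFFFFFFu     /* "null" block pointer */

    /* Per-block link memory and per-queue Head/Tail pointers.
     * In hardware these may be separate memories, or NxtPtr may
     * live inside the data memory itself (see below). */
    static uint32_t nxtptr[NUM_BLOCKS];
    static uint32_t head[NUM_QUEUES], tail[NUM_QUEUES];
    static uint32_t free_head = NIL;           /* head of the free-block list */

    static void init(void)
    {
        for (uint32_t q = 0; q < NUM_QUEUES; q++) head[q] = tail[q] = NIL;
        for (uint32_t b = 0; b < NUM_BLOCKS; b++) nxtptr[b] = b + 1;
        nxtptr[NUM_BLOCKS - 1] = NIL;
        free_head = 0;                         /* all blocks start on the free list */
    }

    /* Take one block from the free list; returns NIL if none left. */
    static uint32_t alloc_block(void)
    {
        uint32_t b = free_head;
        if (b != NIL) free_head = nxtptr[b];
        return b;
    }

    static void free_block(uint32_t b)
    {
        nxtptr[b] = free_head;
        free_head = b;
    }

    /* Enqueue one (already written) block b at the tail of queue q. */
    static void enqueue(uint32_t q, uint32_t b)
    {
        nxtptr[b] = NIL;
        if (head[q] == NIL)         /* exceptional case: queue was empty */
            head[q] = tail[q] = b;
        else {                      /* basic case: link behind current tail */
            nxtptr[tail[q]] = b;
            tail[q] = b;
        }
    }

    /* Dequeue the block at the head of queue q; returns NIL if empty. */
    static uint32_t dequeue(uint32_t q)
    {
        uint32_t b = head[q];
        if (b == NIL) return NIL;   /* exceptional case: queue is empty */
        head[q] = nxtptr[b];
        if (head[q] == NIL)         /* exceptional case: that was the last block */
            tail[q] = NIL;
        return b;
    }

    int main(void)
    {
        init();
        uint32_t b1 = alloc_block(), b2 = alloc_block();
        enqueue(7, b1);             /* queue 7 was empty: sets Head and Tail */
        enqueue(7, b2);             /* basic case: link behind the tail      */
        uint32_t d = dequeue(7);
        printf("dequeued block %u, head now %u\n", d, head[7]);
        free_block(d);              /* recycle the block onto the free list  */
        return 0;
    }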

Cost-performance tradeoffs

CORRECTION: line 4 (separate, off-chip Head, Tail, NxtPtr memories): the enqueue latency is 3, not 2. NOTE (b): to achieve peak throughput by overlapping successive queue operations, the latency of individual operations will often have to be increased.


NxtPtr inside data memory: free block preallocation

Packet size, block size, line rate, queue Op rate

Queue Op rate, free list rate, free list bypass
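
As a back-of-the-envelope illustration with hypothetical numbers (not figures from the lecture), the required queue-operation and free-list rates follow directly from the line rate and the worst-case minimum packet size:

    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical example values, for illustration only. */
        double   line_rate_bps = 10e9;   /* 10 Gbit/s line rate        */
        unsigned min_packet_B  = 40;     /* minimum packet size, bytes */
        unsigned block_B       = 64;     /* buffer block size, bytes   */

        /* Worst case: back-to-back minimum-size packets. */
        double   pkts_per_s     = line_rate_bps / (min_packet_B * 8.0);
        unsigned blocks_per_pkt = (min_packet_B + block_B - 1) / block_B;  /* = 1 here */
        double   blocks_per_s   = pkts_per_s * blocks_per_pkt;

        /* Each block is enqueued once and dequeued once, and its buffer  */
        /* block is allocated from and later returned to the free list,   */
        /* unless a free-list bypass hands a just-freed block directly    */
        /* to the next arrival.                                           */
        double queue_ops_per_s    = 2.0 * blocks_per_s;
        double freelist_ops_per_s = 2.0 * blocks_per_s;

        printf("packets/s  : %.1f M\n", pkts_per_s / 1e6);
        printf("queue ops/s: %.1f M\n", queue_ops_per_s / 1e6);
        printf("free-list ops/s (without bypass): %.1f M\n", freelist_ops_per_s / 1e6);
        return 0;
    }

With a free-list bypass, a block freed by a dequeue can be handed directly to a concurrent enqueue, so the free list itself sees fewer accesses than the figure printed above.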


Last updated: March 2000, by M. Katevenis.