

#### UNIVERSITAT POLITÈCNICA DE CATALUNYA BARCELONATECH

#### **Processor Design**

José María Arnau

# **Technology Scaling**

- Feature dimensions shrink by about 0.7x per generation
  - About 2x more functions per chip in each new generation
- However, engineering population does not double every two years
- We need better and more efficient design techniques
  - Take advantage of abstraction levels
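
The 2x figure follows from the 0.7x shrink: area scales with the square of the linear dimension, and 0.7² ≈ 0.49, so roughly twice as many devices fit in the same area. A minimal sketch of that arithmetic, assuming an ideal shrink in both dimensions (the function name is illustrative, not from the slides):

```python
def density_gain(linear_shrink=0.7, generations=1):
    """Relative number of devices per unit area after the given number of generations."""
    area_scale = linear_shrink ** 2           # area scales with the square of the linear dimension
    return (1.0 / area_scale) ** generations  # fewer area units per device means more devices

for g in range(1, 4):
    print(f"after {g} generation(s): {density_gain(generations=g):.2f}x devices per unit area")
```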

#### **Design Abstraction Levels**



# **VLSI Design Tools**

- VLSI system design is a complex process
  - Requires automation of the synthesis cycle
- EDA tools:
  - Synthesis
    - Convert HDL to a physical representation
  - Analysis and Verification
    - Guarantee correctness of the design
  - Testing
    - Check errors in the fabrication process

# **VLSI Design Tools**

- Synopsys
  - Design Compiler, IC Compiler, VCS...
- Cadence
  - Virtuoso, Incisive, SoC Encounter...
- Mentor
  - Questa/ModelSim, Calibre, Nitro-SoC...
- Qflow (Free and Open Source)
  - Yosys, Graywolf, Qrouter...

#### Design Domains (Gajski and Kuhn)



#### **Abstraction Levels and Synthesis**



Silicon compilation (not a big success)

# **System Specification**

- Define overall goals and requirements
  - Functionality
  - Performance/Power/Area
- Format
  - Textual description (English)
  - Flowcharts
  - State transition graphs
  - Timing charts
  - Executable specification (SystemC)



#### Architectural Design

- Determine the architecture to meet the specifications
  - Microarchitecture details
    - Branch predictor
    - Caches
    - Datapath and Control Unit
  - Usage of hard IP blocks
  - Choice of process technology
- Simulators used to validate architecture
  - Check functional correctness and perform design space exploration (DSE)
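
Design space exploration with a simulator can be sketched as a parameter sweep: run the same workload on several candidate configurations and compare a metric. The toy below is only an illustration under stated assumptions: a direct-mapped cache model, a synthetic address trace, and a 16-byte line size, none of which come from the slides.

```python
def miss_rate(trace, num_lines, line_bytes=16):
    """Simulate a direct-mapped cache and return the fraction of accesses that miss."""
    stored = [None] * num_lines              # block number currently held by each line
    misses = 0
    for addr in trace:
        block = addr // line_bytes
        index = block % num_lines
        if stored[index] != block:           # miss: fetch the block into this line
            stored[index] = block
            misses += 1
    return misses / len(trace)

# Synthetic trace (an assumption): sweep a 1 KiB array of 4-byte words ten times.
trace = [4 * i for i in range(256)] * 10

for num_lines in (16, 32, 64, 128):
    size = num_lines * 16
    print(f"{size:5d} B cache: miss rate {miss_rate(trace, num_lines):.3f}")
```

In this sweep the miss rate stops improving once the cache holds the whole array, which is the kind of knee a designer looks for when trading area against performance.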



# **Functional and Logic Design**

- System described using HDL
  - Programming language to describe hardware
  - Behavioral description
  - Register-Transfer Level (RTL) description
- Two most common HDLs
  - VHDL
  - Verilog

# Why HDL?

- Allows a write-run-debug cycle for hardware development
  - Similar to software development
- Higher productivity than using schematics
  - Easier to describe complex circuits with millions of gates
- Input to synthesis EDA tools
  - Only the synthesizable subset
- Design space exploration with simulation

# Why HDL?

- Why not use a general purpose language?
  - Support for structure and instantiation
  - Support for describing **bit-level** behavior
  - Support for **timing**
  - Support for **concurrency**

#### HDLs

- Verilog
  - Close to C
  - More than 60% of the world digital design market (larger share in US)
  - IEEE standard since 1995
  - Extensive support for Verification (SystemVerilog)
- VHDL
  - Close to Ada
  - Large standard developed by the US DoD



#### **Hardware Verification**

- Verify the functional correctness of the design
- Verification methodology
  - Prepare verification plan from specification
  - Prepare testbenches
    - Instantiate Design Under Test (DUT)
    - Provide stimuli to the DUT
    - Check DUT's output
  - Run testbenches using HDL simulator
    - Verilog: Icarus Verilog, Verilator
    - VHDL: GHDL
    - Commercial: VCS, ModelSim
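
The three testbench duties listed above (instantiate the DUT, provide stimuli, check the output) can be sketched in plain Python. This is not HDL and not a simulator API: `adder_dut` is a hypothetical stand-in for a 2-bit adder that a real testbench would drive through a simulator.

```python
def adder_dut(a, b):
    """Stand-in for the DUT: a 2-bit adder returning (sum, carry_out)."""
    total = (a & 0b11) + (b & 0b11)
    return total & 0b11, total >> 2

def run_testbench(dut, stimuli):
    """Drive each stimulus into the DUT and collect every mismatch."""
    failures = []
    for a, b in stimuli:
        c, cout = dut(a, b)                  # provide stimuli and sample the outputs
        expected = a + b                     # reference value from the specification
        if (cout << 2) | c != expected:      # check the DUT's output
            failures.append((a, b, c, cout))
    return failures

stimuli = [(a, b) for a in range(4) for b in range(4)]   # exhaustive for a 2-bit design
failures = run_testbench(adder_dut, stimuli)
print("PASS" if not failures else f"FAIL: {failures}")
```

Exhaustive stimuli are feasible only because the design is tiny.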













#### **Hardware Verification**

- Directed tests
  - "Manually" prepare input stimuli (test vectors)
  - Only finds the bugs you are looking for
- Constrained Random Verification
  - Randomize input stimuli with restrictions
  - How do we know which functionality has been tested?
- Coverage
  - Code coverage
  - Functional coverage
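
A minimal sketch of how constrained random stimuli and functional coverage fit together, written in plain Python rather than SystemVerilog or a verification library. The stand-in 2-bit adder, the constraint, and the coverage bins are all assumptions chosen for illustration.

```python
import random

def adder_model(a, b):
    """Stand-in DUT: 2-bit adder returning (sum, carry_out)."""
    total = a + b
    return total & 0b11, total >> 2

# Functional coverage bins: each name maps to a predicate over (a, b, sum, cout).
bins = {
    "no_carry":   lambda a, b, s, c: c == 0,
    "carry":      lambda a, b, s, c: c == 1,
    "zero_sum":   lambda a, b, s, c: s == 0,
    "max_inputs": lambda a, b, s, c: a == 3 and b == 3,
}
hits = {name: 0 for name in bins}

rng = random.Random(1)                       # fixed seed so a failing run can be reproduced
tests_run = 0
while not all(hits.values()) and tests_run < 1000:
    a = rng.randint(0, 3)
    b = rng.randint(0, 3)
    if a == 0 and b == 0:                    # constraint: the all-zero stimulus is excluded
        continue
    s, c = adder_model(a, b)
    assert (c << 2) | s == a + b             # self-checking against the specification
    for name, pred in bins.items():
        hits[name] += pred(a, b, s, c)       # record which bins this stimulus exercised
    tests_run += 1

coverage = 100 * sum(h > 0 for h in hits.values()) / len(bins)
print(f"{tests_run} random tests, functional coverage {coverage:.0f}%")
```

The coverage bins answer the question raised above: they record which functionality the random stimuli actually exercised, so the run can stop once every bin has been hit.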

#### **Verification tools**

- Universal Verification Methodology (UVM)
  - Class library written in SystemVerilog
  - Very powerful, but very complex
    - Over 300 classes in UVM!
  - Grad students unlikely to have prior experience with SystemVerilog
- Open Source VHDL Verification Methodology (OSVVM)
  - Library written in VHDL
  - Similar to UVM

# SystemVerilog Complexity



http://www.fivecomputers.com/language-specification-length.html

## Cocotb

- Coroutine Cosimulation TestBench
  - Write testbenches in Python!
  - VHDL and Verilog
  - Interface to RTL simulators
    - Icarus, GHDL, ModelSim, VCS...
- Cocotb-coverage
  - Constrained Random Verification
  - Functional Coverage
  - Regression Tests





# Logic Synthesis

- Converts RTL description to gate-level netlist
  - Automatically performed by EDA tool
    - Synopsys Design Compiler
    - Yosys
  - Standard cell library
    - Collection of low-level electronic logic functions
      - AND, OR, INVERT, flip-flops, latches...
      - Layout, schematic, timing, power...
    - Process Development Kit (PDK)

# Standard Cell Design

- Design circuits using standard cells
  - Cells: gates, latches...
- Technology mapping selects cells
- EDA tools
  - Place and route the cells
  - Wiring in channels
  - Minimize area, delay

# Qflow - Yosys

- Framework for Verilog RTL synthesis
- Implements multiple optimization passes
- Input:
  - Synthesizable Verilog code (Verilog-2005)
  - Standard cell library
- Output:
  - Gate-level netlist: BLIF, Verilog
  - Post-synthesis verification using same testbenches

RTL code

```
module adder(input [1:0] A, input [1:0] B, output [1:0] C, output cout);
  assign {cout, C} = A + B;
endmodule
```

- Yosys: read and process Verilog
- Yosys: map to internal technology library
- Yosys/ABC: map to standard cell library
- Yosys: gate-level netlist in Verilog

```
module adder(A, B, C, cout);
  wire _00_;
  wire _01_;
  wire _02_;
  wire _03_;
  input [1:0] A;
  input [1:0] B;
  output [1:0] C;
  output cout;
  NAND2X1 _04_ (.A(A[0]), .B(B[0]), .Y(_03_) );
  XOR2X1 _05_ (.A(A[0]), .B(B[0]), .Y(C[0]) );
  NAND2X1 _06_ (.A(A[1]), .B(B[1]), .Y(_00_) );
  NOR2X1 _07_ (.A(A[1]), .B(B[1]), .Y(_01_) );
  XOR2X1 _08_ (.A(A[1]), .B(B[1]), .Y(_02_));
  XNOR2X1 _09_ (.A(_03_), .B(_02_), .Y(C[1]) );
  OAI21X1 _10_ (.A(_03_), .B(_01_), .C(_00_), .Y(cout) );
endmodule
```
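
Post-synthesis verification checks that the netlist still computes `A + B`. The sketch below models each cell as a Boolean function and compares the netlist against the sum for all 16 input pairs. The cell behaviors are assumptions inferred from the cell names (in particular, `OAI21X1` is taken to be NOT((A OR B) AND C)); a real flow would rerun the HDL testbench against the netlist and the library's own cell models.

```python
from itertools import product

# Boolean models of the standard cells used in the netlist (single-bit values 0/1).
def nand2(a, b): return 1 - (a & b)
def nor2(a, b): return 1 - (a | b)
def xor2(a, b): return a ^ b
def xnor2(a, b): return 1 - (a ^ b)
def oai21(a, b, c): return 1 - ((a | b) & c)   # assumed OR-AND-INVERT behavior

def adder_netlist(A, B):
    """Evaluate the gate-level adder; A and B are 2-bit integers. Returns (C, cout)."""
    a0, a1 = A & 1, (A >> 1) & 1
    b0, b1 = B & 1, (B >> 1) & 1
    n03 = nand2(a0, b0)            # _04_
    c0 = xor2(a0, b0)              # _05_
    n00 = nand2(a1, b1)            # _06_
    n01 = nor2(a1, b1)             # _07_
    n02 = xor2(a1, b1)             # _08_
    c1 = xnor2(n03, n02)           # _09_
    cout = oai21(n03, n01, n00)    # _10_
    return (c1 << 1) | c0, cout

mismatches = []
for A, B in product(range(4), repeat=2):
    C, cout = adder_netlist(A, B)
    if (cout << 2) | C != A + B:   # compare against the RTL behavior
        mismatches.append((A, B))
print("netlist matches A + B" if not mismatches else f"mismatches: {mismatches}")
```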



# **Physical Design**

- Produce a geometric chip layout from an abstract circuit design
- Inputs:
  - Gate-level netlist
  - Standard cell library
  - Constraints
- Output:
  - Silicon layout (geometric description of the chip)

## **Physical Design - Steps**

- Partitioning
- Chip planning
- Placement
- Global routing
- Detailed routing

#### Partitioning

- Breaks up a large circuit into smaller subcircuits (blocks)
  - Number of blocks
  - Interconnection between blocks
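
A common partitioning objective is to minimize the number of connections that cross between blocks. The sketch below exhaustively searches balanced two-way partitions of a hypothetical six-cell circuit. Exhaustive search only works at this scale; practical tools rely on heuristics such as Kernighan-Lin or Fiduccia-Mattheyses.

```python
from itertools import combinations

def cut_size(edges, block_a):
    """Number of two-pin connections that cross between the two blocks."""
    return sum((u in block_a) != (v in block_a) for u, v in edges)

def best_bipartition(nodes, edges):
    """Try every balanced two-way split and keep the one with the smallest cut."""
    half = len(nodes) // 2
    best = None
    for combo in combinations(nodes, half):
        block_a = set(combo)
        cost = cut_size(edges, block_a)
        if best is None or cost < best[0]:
            best = (cost, block_a)
    return best

# Hypothetical circuit: two tightly connected triangles joined by one connection.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
cost, block_a = best_bipartition(nodes, edges)
print(f"cut size {cost}: {sorted(block_a)} | {sorted(set(nodes) - block_a)}")
```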



# Floorplanning

- Set up a plan for a good layout
- Place the blocks at an early stage, when details such as shape, area, and I/O pin positions are not yet fixed

![](_page_51_Figure_3.jpeg)

# Floorplanning

- Blocks are placed to minimize area and the length of the connections between them

![](_page_52_Figure_2.jpeg)

#### Placement

- Exact placement of the modules
  - Modules can be gates, standard cells...
- Details of design are known
  - Goal is to minimize total area and interconnect cost
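
Interconnect cost needs an estimate that can be computed before routing. One widely used estimate, assumed here rather than taken from the slides, is the half-perimeter wirelength (HPWL): the half-perimeter of the bounding box around each net's pins. The cell names, nets, and coordinates below are hypothetical.

```python
def hpwl(net_pins):
    """Half-perimeter of the bounding box enclosing all pins of one net."""
    xs = [x for x, _ in net_pins]
    ys = [y for _, y in net_pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_wirelength(placement, nets):
    """Sum the HPWL over all nets for a placement mapping cell name -> (x, y)."""
    return sum(hpwl([placement[cell] for cell in net]) for net in nets)

# Four cells, three nets, two candidate placements.
nets = [("u1", "u2"), ("u2", "u3", "u4"), ("u1", "u4")]
placement_a = {"u1": (0, 0), "u2": (1, 0), "u3": (2, 0), "u4": (1, 1)}   # connected cells kept close
placement_b = {"u1": (0, 0), "u2": (3, 3), "u3": (0, 3), "u4": (3, 0)}   # cells spread to the corners

print("placement A:", total_wirelength(placement_a, nets))
print("placement B:", total_wirelength(placement_b, nets))
```

A placer would evaluate a cost like this many times while moving cells, keeping the moves that lower it.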

![](_page_53_Figure_5.jpeg)

![](_page_53_Picture_6.jpeg)

#### Placement

- Blocks are placed so empty rectangular spaces are left between them
- These spaces are later used for the interconnections

![](_page_54_Figure_3.jpeg)

## Routing

- Completes the interconnections between modules
- Considers delay of critical path, wire spacing...
- Global routing and detailed routing

![](_page_55_Figure_4.jpeg)

- Feedthrough
  - Standard cell type 1
  - Standard cell type 2

## **Global Routing**

- Each connection runs from its source block, through the channels, to its destination block

![](_page_56_Figure_2.jpeg)

## **Global Routing**

- The length of the connections depends on where the blocks are placed rather than on how the routing is done

![](_page_57_Figure_2.jpeg)

- The space required for each channel depends on the complexity and density of the connections

![](_page_58_Figure_2.jpeg)

- Non-aligned connections make the routing more complex and increase the space required

![](_page_59_Figure_2.jpeg)

- The number of crossings determines the space required

![](_page_60_Figure_2.jpeg)

- Aligned connections dramatically reduce the channel area required to complete the routing

![](_page_61_Figure_2.jpeg)
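
The effect of alignment on channel space can be illustrated with channel density: the largest number of nets whose horizontal spans overlap at any column. In classical channel routing, the density is a lower bound on the number of horizontal tracks needed. The sketch assumes a simplified channel in which every net joins one pin on the top edge to one pin on the bottom edge; the pin columns are hypothetical.

```python
def channel_density(nets):
    """Maximum number of nets whose horizontal spans overlap at any column.

    Each net is (top_column, bottom_column). An aligned net has both pins in
    the same column and needs no horizontal segment, so it is skipped.
    """
    spans = [(min(t, b), max(t, b)) for t, b in nets if t != b]
    columns = {c for span in spans for c in span}    # the maximum overlap occurs at a span endpoint
    return max((sum(lo <= c <= hi for lo, hi in spans) for c in columns), default=0)

aligned = [(0, 0), (1, 1), (2, 2), (3, 3)]   # every pin directly above its partner
shifted = [(0, 2), (1, 3), (2, 4), (3, 5)]   # every connection displaced by two columns

print("aligned density:", channel_density(aligned))
print("shifted density:", channel_density(shifted))
```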

## **Clock Network Distribution**

- Determines the routing of the clock signal to meet the delay requirements

![](_page_62_Figure_2.jpeg)

#### **Power Distribution**

- Interlaced distribution minimizes the number of metal levels
- Thickness will vary according to power consumption

![](_page_63_Picture_3.jpeg)

![](_page_64_Figure_1.jpeg)

#### Testing

- Check the correctness of the layout
- Design rule checking (DRC)
  - Verifies that layout meets all technology-imposed constraints
- Layout vs schematic (LVS)
  - Verifies the functionality of the design
  - A netlist from the layout is derived and compared with original netlist
- Electrical rule checking (ERC)
  - Verifies the correctness of power and ground connections, signal transition times...
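
As an illustration of a technology-imposed constraint, the sketch below checks a minimum-spacing rule between rectangles on one layer. The rule value, the shapes, and the choice of Euclidean corner-to-corner distance are assumptions; real DRC decks contain many more rule types and come from the foundry's PDK.

```python
from itertools import combinations
from math import hypot

def spacing(r1, r2):
    """Distance between two axis-aligned rectangles (x1, y1, x2, y2); 0 if they touch or overlap."""
    dx = max(r2[0] - r1[2], r1[0] - r2[2], 0)   # horizontal gap, if any
    dy = max(r2[1] - r1[3], r1[1] - r2[3], 0)   # vertical gap, if any
    return hypot(dx, dy)

def spacing_violations(shapes, min_spacing):
    """Return pairs of shape names that are separated by less than min_spacing."""
    return [(n1, n2)
            for (n1, r1), (n2, r2) in combinations(shapes.items(), 2)
            if 0 < spacing(r1, r2) < min_spacing]   # touching shapes are treated as connected

# Hypothetical metal-1 rectangles and rule value, in arbitrary layout units.
metal1 = {
    "w1": (0, 0, 10, 2),
    "w2": (0, 5, 10, 7),     # 3 units above w1: meets the rule
    "w3": (11, 0, 20, 2),    # 1 unit to the right of w1: too close
}
print(spacing_violations(metal1, min_spacing=3))
```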

![](_page_66_Figure_1.jpeg)

#### Fabrication

- The final layout (GDSII Stream format) is sent to a dedicated silicon foundry (fab)
- Tapeout
  - Final result of the design process for integrated circuits before they are sent for manufacturing

![](_page_67_Figure_4.jpeg)

![](_page_68_Figure_1.jpeg)

![](_page_69_Figure_1.jpeg)

![](_page_70_Figure_1.jpeg)

![](_page_71_Figure_1.jpeg)

### **VLSI Design Flow**



#### **Cocotb - Verification Tool**




# **Qflow - Synthesis Tool**



http://opencircuitdesign.com/qflow/
