

Kevin K. Yee (Samsung), Fakhruddin Ali Bohra (Arm), Edson Gomersall (Cadence) Arm TechCon 2019 San Jose Oct. 9, 2019







# SAMSUNG FOUNDRY High-Performance 5LPE Implementation Next-Generation Arm "Hercules" CPU

Kevin K. Yee Marketing Director, IP & Ecosystem Samsung Foundry







## Cadence, Arm, Samsung Foundry Partnership









## Cadence, Arm, Samsung Foundry Collaboration

#### **Enabling the Next Level of Innovation**



**Delivering the Most Optimized High-Performance Solution!** 







## Samsung Foundry Technology Leadership

Leading the industry with many firsts







## Why 5LPE?

#### Full EUV technology – ready for production

- 2nd gen. EUV technology
- 7LPP compatibility: IP re-use, yield leverage, easy migration
- Enhanced features to 7LPP: MDB, SDB, single fin cell, CB on RX edge









#### **5LPE Process Overview**

#### Easy migration from 7LPP

#### IP re-usability with boosted transistor performance

- GR compatibility, same SRAM, same TR structure
- 1.11x performance boost from low-k spacer, DC enhancement, etc.

#### **Smart scaling with 6T SC offering**

0.70x block-level area due to 6T with SDB and CB on RX edge

#### Superior power efficiency by 6T SDB with 1-Fin

0.80x block-level power





#### **Migration Enhancement**



#### Superior power efficiency









## 5LPE Technology Enhancements

Enabling 36M2, SDB, 1FIN, CB on RX-edge & 6T







|                 | 7LPP                          | 5LPE            | Note       |
|-----------------|-------------------------------|-----------------|------------|
| FP              | 27FP                          | Same as 7LPP    | Compatible |
| CPP             | 54CPP/60CPP                   | Same as 7LPP    | Compatible |
| SRAM bitcell    | P026/P032                     | Same as 7LPP    | Compatible |
| Vop, Vth, Lgate | 0.75V, RVT/LVT/SLVT, 8nm/10nm | Same as 7LPP    | Compatible |
| M1/M3/M4        | 40nm/36nm/44nm                | Same as 7LPP    | Compatible |
| M2              | 48nm                          | + 36nm          | Enhance    |
| Diffusion Break | MDB                           | + SDB           | Enhance    |
| Min NFIN        | 2FIN                          | + 1FIN          | Enhance    |
| CB              | CB on STI                     | + CB on RX-edge | Enhance    |
| Standard Cell   | 7.5T                          | + 6T            | Enhance    |





## **5LPE Key Feature Comparison**

7.5T for performance, 6T for area & power

**7.5T:** Performance driven with 60CPP and MDB

**6T:** Area & power driven with CB on RX edge, SDB and 36M2

| Track            |              | 7.5T [HD] | 6T [UHD]      |
|------------------|--------------|-----------|---------------|
| Cell Height [nm] |              | 270       | 216           |
| Tech Definition  | FP [nm]      | 27        | 27            |
|                  | CPP [nm]     | 60        | 54            |
|                  | M1 [nm]      | 40Bi      | 40Uni         |
|                  | M2 [nm]      | 60        | 36            |
|                  | DB           | MDB       | SDB           |
|                  | CB           | CB on STI | CB on RX edge |
| Design Feature   | FIN#         | 3:3       | 2:2           |
|                  | Signal Track | 6         | 5             |
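The geometry in the comparison table can be cross-checked with a little arithmetic. The sketch below is only an illustration built on the table's numbers: it divides each cell height by its track count and by the fin pitch, then forms a first-order footprint ratio as (cell height × CPP). The function names are hypothetical, and the footprint ratio is a rough per-cell estimate, not the 0.70x block-level figure quoted earlier, which also reflects SDB and CB on RX edge.

```python
# Geometry copied from the 7.5T / 6T comparison table (all values in nm).
CELLS = {
    "7.5T": {"tracks": 7.5, "height_nm": 270, "fp_nm": 27, "cpp_nm": 60},
    "6T": {"tracks": 6.0, "height_nm": 216, "fp_nm": 27, "cpp_nm": 54},
}


def height_per_track(cell):
    """Cell height divided by the nominal track count."""
    return cell["height_nm"] / cell["tracks"]


def fin_pitches_per_cell(cell):
    """How many fin pitches fit in the cell height."""
    return cell["height_nm"] / cell["fp_nm"]


def footprint_ratio(new, old):
    """First-order area ratio: (height x CPP) of `new` over `old`.

    Assumption: a cell's footprint scales with height times CPP only.
    """
    new_area = new["height_nm"] * new["cpp_nm"]
    old_area = old["height_nm"] * old["cpp_nm"]
    return new_area / old_area


for name, cell in CELLS.items():
    print(name, height_per_track(cell), fin_pitches_per_cell(cell))

print(round(footprint_ratio(CELLS["6T"], CELLS["7.5T"]), 2))
```

Both libraries work out to 36 nm of height per track, the 7.5T cell spans 10 fin pitches against 8 for 6T, and the first-order footprint ratio is about 0.72.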







## **5LPE Broad Market Adoption**

#### 7.5T for performance, 6T for area & power

**4Q18:** 7nm production tapeout

**3Q19:** 5nm production tapeout

**2020:** Main node will be 5nm







## **5LPE** Availability

#### Key milestones

PDK v1.0: Released Jan'19

**L1+168H: Completed Apr. '19** 

**MPW MTO: Feb'19** 







# Samsung Advanced Foundry Ecosystem



# Join us!

9am – 7pm October 17, 2019 SAMSUNG@FIRST CAMPUS

**REGISTER NOW!:** 

https://events.samsungatfirst.com/foundry-safe-forum-2019/





Fakhruddin Ali Bohra Senior Principal Design Engineer Arm







## Arm Artisan Physical IP and Samsung Foundry Partnership

Over 3 billion chips shipped with Arm Artisan Physical IP










### Arm Artisan on Samsung Foundry 5LPE (EUV) Platform

#### **Standard Cell**



- Comprehensive cell set for broad application coverage
- 7.5-track (HD) and 6-track (UHD) libraries
- Power management & ECO



#### **Memory Compiler**

- 8 memory compilers
- Low VDD assist features
- Multi periphery Vt options
- Extensive feature set

#### Physical Design Group Products

**arm** Artisan




#### **GPIO**

- GPIO 1.8V
- GPIO 3.3V

#### Architect

- Logic libraries
- Memory compiler

#### POP<sup>TM</sup> IP

- Cortex-A76
- Cortex-A55
- Hercules
- All POP with DynamIQ™

#### **Complete Physical IP Platform**

- Foundry-sponsored foundation IP
- Supports a wide range of applications across mobile, consumer, and automotive
- POP IP for "Hercules" CPU, Arm® Cortex®-A76 and Cortex-A55 based computing systems
- Design kits available







## Technology Trends that Will Redefine All Industries









### AI is Everywhere

Industry landscape: What trends will drive the industry

- AI cuts across application spaces
- Projected continued growth across all applications



Source: Tractica (2018)







#### Breakthrough Performance for Always-On, Always-Connected

Continuing the trajectory of increased compute performance for AI, ML and premium mobile



Cortex-A76: High-performance implementation: 3+ GHz in 7nm

**Cortex-A77: Up to 20% improved IPC performance** 

Hercules: Continuing performance and efficiency leadership

Supports the flexibility of Arm DynamIQ big.LITTLE™







#### Translating Arm RTL Benefits into Silicon

**How to:**

- Translate the year-over-year performance improvements of >15% for compute through 2020 into silicon?
- Optimize implementation for new cores and advanced process nodes?
- Ensure **fast turn-around time** for increased productivity with new Arm cores?







## Arm POP IP is Optimized Arm Core Implementation

#### **POP IP Components**



**Comprehensive support and services** 







## Design Technology Co-Optimization (DTCO) Benefits the Ecosystem









© 2019 Arm Limited

## **Implementation Challenges**

Challenges present for advanced nodes and cores

#### **Latest Arm Cores**

- Concurrent configuration of CPU with DynamIQ Shared Unit (DSU)
- Asynchronous configuration
- Long channels between DSU and CPU slaves
- Private L2 cache
- Architectural clock gating

#### **5nm**

- Colorless routing
- Via ladders
- New placement rules
- Power grid
- Addressing variation challenges





#### **5LPE: Continued Enhancements for Artisan Logic IP**

#### Performance

- New compressor cells
- New flip-flops: 30% performance gain at the cell level

#### Power

- New multi-bit functional cells
- Low power flip-flops

#### Density

- Key combinatorial cells: 5-10% area gains at the cell level
- Flip-flops: 10+% area gains at the cell level

#### Implementation

- Cells crafted to enable flexible cell placement while maintaining a tight and regular lower grid
- Utilities included to ease power grid generation
- Post-route opportunistic M2 stitching supported to improve EM/IR drop
- Large clock drivers for ideal H-Tree implementation
- Power gate design ensures no break in grid regularity



Dependencies exist between standard cell architecture and power grid; a proper grid is required for optimization







#### 5LPE: Continued Enhancements for Artisan Memory IP



#### Performance

- Special optimization for key compilers
- Bitcell selection to match 5LPE target markets
- Multiple level shifting options for optimized DVFS at SOC level
- Careful wire planning to mitigate RC

#### Power

- Compile-time options for core and periphery separation, control, retention and power down
- Multiple architecture and microarchitecture changes

#### Density

- Extensive MUX, bank and slice options
- Innovative assist scheme
- Newer topological options

#### Implementation

- Instance layout minimizes placement and abutment restrictions









Edson Gomersall Product Engineering Architect Cadence







## Cadence, Arm, and Samsung Foundry Collaboration

Worldwide engineering relationship





- Collaboration enables a high-performance flow to be delivered
- Early collaboration ensures tool support is in place







## Arm CPU 5LPE Digital and Signoff Flow









## Arm CPU 5LPE Design Configuration

#### Process

- Samsung Foundry 5LPE
- 13-layer metal stack

#### CPU Core Overview

- Arm CPU 5LPE POP 7.5T standard cells
- Arm CPU 5LPE POP FCI memories
- Arm CPU floorplan
- Multiple voltage islands









## Cadence and Samsung Foundry 5LPE Co-Optimization

#### **5LPE Process and Tool Co-Optimization**



One simple option to enable all 5LPE process support:

`setDesignMode -process 5 -node S5`

| Metric      | After eGR tuning |
|-------------|------------------|
| Frequency   | 14% improvement  |
| Utilization | 7% improvement   |

Early Global Route Tuning

|               | M4   | D5   | D6   | D7   | D10  |
|---------------|------|------|------|------|------|
| Before tuning | 80.5 | 77.9 | 84.2 | 51.5 | 43.2 |
| After tuning  | 81.4 | 79.7 | 85.5 | 52.7 | 61.3 |

Layer Promotion Adherence
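To see where tuning moved layer promotion adherence the most, the per-layer change can be computed directly from the table. The short sketch below does only that subtraction; the slide gives no unit for these values, so they are treated as plain numbers, and the function name is hypothetical.

```python
# Values copied from the layer promotion adherence table above.
LAYERS = ["M4", "D5", "D6", "D7", "D10"]
BEFORE = [80.5, 77.9, 84.2, 51.5, 43.2]
AFTER = [81.4, 79.7, 85.5, 52.7, 61.3]


def adherence_deltas(layers, before, after):
    """Return {layer: after - before}, rounded to one decimal place."""
    return {
        layer: round(a - b, 1)
        for layer, b, a in zip(layers, before, after)
    }


deltas = adherence_deltas(LAYERS, BEFORE, AFTER)
for layer, delta in deltas.items():
    print(f"{layer}: {delta:+.1f}")

# Layer with the largest change after tuning.
best = max(deltas, key=deltas.get)
print("largest change:", best)
```

Every layer improves, with by far the largest change on D10 (+18.1); the other layers move by roughly one to two points.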

**Out-of-Box Better Full Flow PPA** 







#### 5LPE and CPU Characteristic Driven Features

| 5LPE-driven features      | CPU-driven features        |
|---------------------------|----------------------------|
| Early global route tuning | Floorplan regions/guides   |
| Layer promotion           | ICG placement optimization |
| NDRs for specific layers  | Custom cost groups         |
| Cell legalization         | Strategic pre-skewing      |
| Statistical via support   | Clock tuning               |









## Cadence Digital Flow Benefits

- Combined Genus<sup>™</sup> and Innovus<sup>™</sup> solution with common optimization
  - iSpatial technology
  - 2x TAT improvement, 5% PPA benefit
- IR drop-aware flow
  - Integrated IR drop repair flow
- True signoff
  - Completely integrated with industry-leading Innovus Implementation System
  - Tempus™ and Voltus™ solutions combined deliver true signoff
  - Via variation-aware timing signoff
- Stylus common UI
  - Usability and productivity enabler













## Arm CPU 5LPE iSpatial High-Performance Flow

Turn-around time and predictability benefits



Mesh and Flex-H clock distribution are both supported; the choice can influence design performance







## 5LPE Statistical Via Timing Analysis and Optimization



- Exclusive feature used in 5LPE signoff flow
- Global Via resistance variation modelled in extraction technology files
- Local Via resistance variation is modelled statistically by the foundry
  - Samsung Foundry provides mean shift and standard deviation of via cut sizes
  - Quantus<sup>™</sup> Extraction provides via resistance for each via
- Tempus<sup>™</sup> delay calculation engine uses both global and local resistance variation to ensure timing convergence during the flow
  - Part of the Samsung Foundry 5LPE signoff requirements
- Tempus ECO optimization accounts for all Via resistance effects
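The slides do not spell out the statistical model, so the following is only a sketch of why treating local via variation statistically can be less pessimistic than stacking worst cases. It assumes that each via on a path already has a mean resistance and a standard deviation (for example, derived from the foundry's cut-size statistics), and that local variations are independent, so their standard deviations add in quadrature. The function name and all numbers are hypothetical; this is not the Tempus or Quantus algorithm.

```python
import math


def path_via_resistance(vias, n_sigma=3.0):
    """Compare statistical and worst-case series via resistance.

    vias: list of (mean_ohm, sigma_ohm) tuples, one per via on the path.
    Assumption: local variations are independent, so the path-level
    sigma is the root-sum-square of the per-via sigmas.
    Returns (statistical, worst_case) resistance at n_sigma.
    """
    mean_total = sum(mean for mean, _ in vias)
    sigma_rss = math.sqrt(sum(sigma ** 2 for _, sigma in vias))
    statistical = mean_total + n_sigma * sigma_rss
    # Worst case: every via sits at its own n_sigma corner simultaneously.
    worst_case = sum(mean + n_sigma * sigma for mean, sigma in vias)
    return statistical, worst_case


# Hypothetical path: four identical vias, 20 ohm mean, 2 ohm sigma each.
vias = [(20.0, 2.0)] * 4
stat, worst = path_via_resistance(vias)
print(stat, worst)
```

For this made-up path the statistical 3-sigma resistance is 92 ohm against 104 ohm for worst-case stacking, and the gap widens as more vias are placed in series.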







## Concurrent Timing, Power, and IR Drop Optimization

Innovus, Tempus, Voltus single data model integration

**Multi-Stage IR Correction**

Flow: RTL Compile → **iSpatial** → Incremental PlaceOpt → CCOpt™ → Route Opt → Quantus™, Tempus™, Voltus™

- Early Rail Analysis
- IR Drop-Aware Placement
- Clock Useful Skew for Peak Power
- Power Routing and Via Optimization
- Timing and IR Drop-Aware ECO









## Full-Flow Unified Interface and Flow for Digital Implementation

#### Common User Interface



#### **Unified Metrics**





Consistent PPA reporting across the whole digital flow

#### Flow Capture and Automation







and deploy to users





Improved ease of use and designer productivity

## Arm CPU 5LPE High-Performance Rapid Adoption Kit



- Complete Cadence RTL-to-GDS digital implementation flow
  - Example flow scripts
  - Example floorplan
  - Application note explaining how to set up the RAK
  - Application notes showing how the flow works
- Customized for latest advanced Arm CPU and Samsung Foundry 5LPE process
- Available to customers enabling fast implementation of latest Arm CPU







## Arm Advanced CPU 5LPE High-Performance RAK Results



- Arm, Samsung Foundry, and Cadence working together have enabled over 30% performance improvements
- More to come as collaboration continues
- Customers have immediate access to improved results using RAK

Over 30% performance improvement achieved so far, with more to come







## Delivering Design Excellence for 5LPE Arm CPU Implementation



- Complete Cadence high-performance digital implementation and signoff flow
- Fully qualified for Samsung Foundry 5LPE
  - 5LPE process and tool co-optimization
  - One simple command to enable 5LPE set-up
  - 5LPE Statistical Via Timing Analysis and Optimization
- Tuned and proven on Arm high-performance CPU and POP library for outstanding PPA
- Common user interface and FlowKit included as part of Rapid Adoption Kit








© 2019 Cadence Design Systems, Inc. All rights reserved worldwide. Cadence, the Cadence logo, and the other Cadence marks found at www.cadence.com/go/trademarks are trademarks or registered trademarks of Cadence Design Systems, Inc. Accellera and SystemC are trademarks of Accellera Systems Initiative Inc. All Arm products are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All MIPI specifications are registered trademarks or service marks owned by MIPI Alliance. All PCI-SIG specifications are registered trademarks or trademarks of PCI-SIG. All other trademarks are the property of their respective owners.