## Page 3 Investments Workshop

Part of the Electronics Resurgence Initiative

July 18, 2017

Reference herein to any specific commercial product, process, or service by trade name, trademark or other trade name, manufacturer or otherwise, does not necessarily constitute or imply endorsement by DARPA, the Defense Department or the U.S. government, and shall not be used for advertising or product endorsement purposes.



Nytimes.com

"The purpose of this directive is to provide within the Department of Defense an agency for the direction and performance of certain advanced research and development projects."

> February 7, 1958 NUMBER 5105.15



- The Agency is authorized to direct such research and development projects being performed within the Department of Defense as the Secretary of Defense may designate.
- The Agency is authorized to arrange for the performance of research and development work by other agencies of Government, including the military departments, as may be necessary to accomplish its mission in relation to projects assigned.

#### How do we operate?








#### DARPA has evolved to using challenges











#### The miracle of Moore's Law has taken us incredibly far...



- AME – Advanced Microelectronics
- FCRP – Focus Center Research Program
- FinFET – Fin-Shaped Field Effect Transistor
- SEMATECH – Semiconductor Manufacturing Technology
- MOSIS – Metal Oxide Semiconductor Implementation Service
- \*Microelectronics Manufacturing Science and Technology (MMST)

#### Page 2 set us on a 50 year journey



P.1

P.2

"...The complexity for minimum component costs has increased at a rate of roughly a factor of two per year (see graph)..."
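To make the quoted rate concrete, the sketch below compounds a doubling per year. The function name and the starting count of 64 components are illustrative assumptions, not figures taken from the slide.

```python
def projected_complexity(start_components, years, factor_per_year=2.0):
    """Components per chip after `years` of growth at `factor_per_year`."""
    return start_components * factor_per_year ** years

# Assumed starting point: 64 components, projected ten years ahead.
print(projected_complexity(64, 10))  # 65536.0
```

Ten doublings multiply the starting count by 1,024, which is why a rate that looks modest year to year compounds into the 50-year journey the slide title refers to.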

#### ... but nothing lasts forever



"The total cost of making a particular system function must be minimized"

- Gordon Moore


#### We need to turn the page



## Architecture

Maximizing specialized functions

VIII. DAY OF RECKONING

Clearly, we will be able to build such component-crammed equipment. Next, we ask under what circumstances we should do it. The total cost of making a particular system function must be minimized. To do so, we could amortize the engineering over several identical items, or evolve flexible techniques for the engineering of large functions so that no disproportionate expense need be borne by a particular array. Perhaps newly devised design automation procedures could translate from logic diagram to technological realization without any special engineering.

It may prove to be more economical to build large systems out of smaller functions, which are separately packaged and interconnected. The availability of large functions, combined with functional design and construction, should allow the manufacturer of large systems to design and construct a considerable variety of equipment both rapidly and economically.

### Materials & Integration

Adding separately packaged novel materials and using integration to provide specialized computing

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.

### Design

Quickly enabling specialization

#### What are the national capabilities we should be investing in next?



Sources: Intel; press reports; Bob Colwell; Linley Group; IB Consulting; The Economist


"The total cost of making a particular system function must be minimized"

- Gordon Moore

#### Pseudolithic Integration



#### Specialized Hardware Blocks



#### Software Hardware Co-design



# Where are we heading?

Sowing the seeds for a revolution in processing



## What is the initiative?

Program managers hired directly from the electronics community...

Aligning incentives as we both stare at an uncertain future

Co-developing electronics to manage the coming inflection to support both a national electronics base and national defense

#### MTO ELECTRONICS RESURGENCE INITIATIVE TIMELINE







#### NATIONAL ELECTRONICS CAPABILITY





#### **Traditional Programs Currently Funded**

- **JUMP** Joint University Microelectronics Program
- **CHIPS** Common Heterogeneous Integration and IP Reuse Strategies
- **HIVE** Hierarchical Identify Verify Exploit
- **L2M** Lifelong Learning Machines
- **N-ZERO** Near-Zero Power Radio Frequency Receivers
- **CRAFT** Circuit Realization at Faster Time Scales
- **SSITH** System Security Integrated Through Hardware and firmware


#### Joint University Microelectronics Program (JUMP)





#### **MTO Electronics Timeline**



#### The goal of the Electronics Resurgence investment today is to reach a national capability between 2025 and 2030



"The total cost of making a particular system function must be minimized"

- Gordon Moore

# So how do you get involved?

Timeline and structure

#### MTO ELECTRONICS PAGE 3 INVESTMENTS TIMELINE



## Dan Green

Materials

Steering the science of materials to commercial product lines



## Tom Rondeau

Architectures

The intersection of connectivity and computation



## Andreas Olofsson

Designs

From Kickstarter to Supercomputer





## ENSURING LONG-TERM U.S. LEADERSHIP IN SEMICONDUCTORS

**ELECTRONICS RESURGENCE INITIATIVE WORKSHOP** 

FAIRMONT SAN JOSE, 170 S MARKET ST, SAN JOSE, CA 95113

JULY 18 – JULY 19, 2017

CRAIG MUNDIE


#### **REPORT TO THE PRESIDENT** Ensuring Long-Term U.S. Leadership in Semiconductors

Executive Office of the President President's Council of Advisors on Science and Technology

January 2017



## PCAST WORKING GROUP

#### **Co-Chairs**

| Name | Role |
|------|------|
| John Holdren\* | Assistant to the President for Science and Technology & Director, OSTP |
| Paul Otellini | Former President and CEO, Intel |

#### **Industry Working Group Members**

| Name | Role |
|------|------|
| Richard Beyer | Former Chairman and CEO, Freescale Semiconductor |
| Ajit Manocha | Former CEO, Global Foundries |
| Wes Bush | Chairman, CEO, and President, Northrop Grumman |
| Jami Miscik | Co-CEO and Vice Chairman, Kissinger Associates |
| Diana Farrell | President and CEO, JP Morgan Chase Institute |
| Craig Mundie\* | President, Mundie & Associates |
| John Hennessy | President Emeritus, Stanford University |
| Mike Splinter | Former CEO and Chairman, Applied Materials |
| Paul Jacobs | Executive Chairman, Qualcomm |
| Laura Tyson | Distinguished Professor, Graduate School, UC Berkeley |

## CHALLENGES AND OPPORTUNITIES

- TECHNOLOGICAL BARRIERS TO LOWER-POWER AND SCALING
- RAPIDLY SHIFTING GLOBAL MARKETS
- STRATEGIC POLICY AND FINANCIAL INVESTMENTS OUTSIDE USA
- MARKET ACCESS CONSTRAINTS
- UNEVEN INTELLECTUAL PROPERTY ENFORCEMENT
- FAB CAPACITY IN USA NOW LESS THAN 13%
- DESIGN COMPLEXITY AND DEGREE OF SPECIALIZATION INCREASING

### WE'VE SEEN THIS MOVIE BEFORE...

- IN THE 1980s, JAPAN WAS OVERTAKING THE U.S. IN MEMORY CIRCUITS
- BUT, THE MARKET WAS SHIFTING, DRIVEN BY MICROPROCESSOR ADVANCES
- THE USA POLICY AND INDUSTRY FOCUS WAS ON SPEED IMPROVEMENT AND TECHNOLOGY FUNDAMENTALS, AND THE JAPANESE FELL BEHIND
- KOREA, MORE RECENTLY, HAS MADE BIG INVESTMENTS
- CHINA IS INVESTING STRATEGICALLY
- A SUCCESSFUL U.S. STRATEGY TODAY MUST BE DIFFERENT...

# WIN THE RACE BY RUNNING FASTER!

- PICK FOCUS AREAS (MOONSHOTS)
- APPLICATIONS-DRIVEN APPROACH
- TEN-YEAR TIME HORIZON
- GOVERNMENT INVESTMENT SHOULD COMPLEMENT NATURAL INDUSTRY INVESTMENT AREAS
- REDUCE DESIGN COSTS WITH RADICAL ADVANCES IN DESIGN TOOLS AND REUSABILITY – GOAL SHOULD BE 10X TO 100X REDUCTIONS IN TIME AND COSTS

# BUT IT'S LIKE PLAYING 3D CHESS...

- THE RULES OF THE GAME ARE DETERMINED BY THE APPLICATION DOMAIN
- THE PLANES OF THE GAME INCLUDE:
  - Computing Modalities
  - Computing Architectures
  - Component Technologies

# APPLICATION DOMAIN LEADERSHIP & SUPPORT ROLES

### STRONG TECH INDUSTRY INTEREST (GOVERNMENT SUPPORT)

- Big Data Analytics: Local real-time data analysis and visualization enabled by advances in security, low-power computation, and processor specialization.
- Artificial Intelligence and Machine Learning: Supervised and unsupervised machine learning enabled by new processors, including low-power processors, graphics processing units, and quantum computers.
- Biotechnologies, Human Health Technologies: Medical implants that are capable of ultra-low power processing, communications, and wireless charging.
- Robotics, Autonomous Systems: Speech and image recognition for mobile computing.
- Telepresence, Virtual Reality, Mixed Reality: Local real-time sensory input, such as video and graphics.
- Machine Vision: Imaging-based automatic inspection and analysis for applications such as process control and robot guidance.
- Speech Recognition and Synthesis: Portable systems enabling recognition and artificial production of human speech.
- Nanoscale Systems and Manufacturing: Democratized, small-batch fabrication of structures at the nanoscale using a variety of material classes. Nanoscale 3D printers will provide desktop fab capabilities for rapid prototyping and additive manufacturing, moving beyond silicon and interfacing with soft matter.
- Ultra-High Performance Wireless: Wireless systems with very low latency and extremely reliable communications, for example, between autonomous vehicles.
- Holistic Secure Systems: Hardware-based defense in depth, such as tamper-resistant hardware that electronically authenticates software integrity.

### WEAKER TECH INDUSTRY INTEREST (GOVERNMENT LEADERSHIP)

- Computational Chemistry: Design of novel solutions for catalysis, low-temperature nitrogen fixation, etc.
- Advanced Materials Science and Manufacturing: Simulation of solid state materials, etc.
- Modeling and Simulation: Efficient exascale computing to enable advanced earthquake prediction (CMOS-based high-performance computing capable of 1-10 exaflops), high-fidelity weather modeling (superconducting-based hyperscale computing capable of 10-100 exaflops), and optimization problems (quantum computing).
- Space Technologies: Radiation hardness through circuit design and technologies (e.g., wide-bandgap electronics) rather than special manufacturing processes (e.g., insulating substrates or shielding).

# TAKING A FULL-STACK APPROACH DOMAIN BY DOMAIN...

1. ULTIMATE SOFTWARE APPLICATION

2. APPLICATION PROGRAMMING MODEL

3. PLATFORM SOFTWARE SERVICES

4. PLATFORM PROGRAMMING MODEL

5. OPERATING SYSTEMS SERVICES

6. COMPUTER SYSTEM ARCHITECTURES (PROCESSING, STORAGE, AND INTERCONNECT AT EVERY SCALE)

7. COMPONENT TECHNOLOGIES

# COMPUTING MODALITIES

**EMBEDDED SYSTEMS:** SPECIALIZED SEMICONDUCTORS, RANGING FROM HIGH-VOLUME/LOW-COST FOR APPLICATIONS LIKE INTERNET OF THINGS (IOT) DEVICES TO LOW-VOLUME/HIGH-COST SEMICONDUCTORS FOR ROBOTICS OR DEFENSE SYSTEMS. POWER EFFICIENCY REQUIREMENTS WILL VARY BY APPLICATION (HARVESTING ENERGY FROM THE AMBIENT ENVIRONMENT VERSUS DEDICATED POWER SOURCES, RESPECTIVELY). FLEXIBILITY AND AGILITY IN FABRICATION AND DESIGN WILL BE NEEDED TO MAINTAIN PROFITABILITY.

**PERSONAL/PORTABLE SYSTEMS:** DESKTOP, MOBILE, AND WEARABLE COMPUTING DEVICES. THESE ARE FREQUENTLY BATTERY-POWERED COMPUTATIONAL DEVICES, WHICH WILL BE OPTIMIZED FOR PERFORMANCE, PRICE, AND POWER EFFICIENCY. GENERAL PURPOSE COMPUTING WILL BE AUGMENTED BY ACCELERATORS, SENSOR ADD-ONS, AND OTHER FUNCTION-AUGMENTING ICT'S.

**HYPERSCALE SYSTEMS:** SUPERCOMPUTING DEVICES FOR "REMOTE" COMPUTATION THAT WILL BE AGGREGATED TO FORM THE MOST POWERFUL SYSTEMS THAT CAN BE PRODUCED IN EACH ARCHITECTURAL CLASS. THESE SYSTEMS ARE EXPECTED TO SOLVE OTHERWISE INTRACTABLE PROBLEMS; OR, FOR CLASSICAL ARCHITECTURES, TO MAXIMIZE PERFORMANCE WITHIN PRACTICAL POWER CONSTRAINTS. EMERGING ARCHITECTURES PROVIDING NEW CAPABILITIES AND DOMAIN-SPECIFIC OPTIMIZATIONS WILL BECOME INCREASINGLY IMPORTANT AS PERFORMANCE INCREASES LAG AND PRACTICAL POWER LIMITS ARE REACHED IN TRADITIONAL COMPUTING ARCHITECTURES.

# COMPUTER SYSTEM ARCHITECTURES

**VON NEUMANN:** CHANGES IN TECHNOLOGY TO ACCOMMODATE POST-MOORE'S LAW REALITIES, SUCH AS MULTI-CORE CPUS WITH DIFFERENT, COMPLEX MEMORY HIERARCHIES, WILL DEMAND NEW ENGINEERING PARADIGMS ACROSS THE EXISTING RANGE OF TRADITIONAL VON NEUMANN ARCHITECTURES FOR DIGITAL COMPUTATION.

**QUANTUM:** QUANTUM COMPUTING HAS THE POTENTIAL TO SUBSTANTIALLY ADVANCE OUR COMPUTE CAPABILITIES AND SOLVE CURRENTLY INTRACTABLE PROBLEMS. THERE ARE SEVERAL QUANTUM ARCHITECTURAL APPROACHES WHICH MAY SUPPORT DIFFERENT STRATEGIC DOMAINS, AND ALONG DIFFERENT TIMELINES. THESE APPROACHES, IN ROUGH ORDER OF LIKELY DEPLOYMENT, ARE: ANALOG QUANTUM SIMULATION; ADIABATIC QUANTUM ANNEALING; AND CIRCUIT-BASED QUANTUM COMPUTING.

**BIO/NEURO-INSPIRED (NEUROMORPHIC COMPUTING):** BIOLOGICALLY-INSPIRED POWER CONSUMPTION AND "TOPOLOGY" OF THE CIRCUITRY (USING THREE DIMENSIONS, MORE LIKE THE BRAIN), ANALOGOUS TO HOW RADIO NETWORKS ARE NOW DESIGNED IN THE POST-SHANNON LIMIT ERA.

**ANALOG COMPUTING:** ANALOG COMPUTING APPROACHES PREDATE DIGITAL COMPUTING AND IN THEORY CAN SOLVE SOME PROBLEMS THAT ARE INTRACTABLE ON DIGITAL COMPUTERS. IN PRACTICE, DIGITAL COMPUTING TECHNIQUES HAVE OVERTAKEN ANALOG COMPUTING, BUT ADVANCES IN NOISE MINIMIZATION COULD ALLOW SOLUTIONS IN SOME AREAS.

**SPECIAL PURPOSE ARCHITECTURES:** FIELD-PROGRAMMABLE GATE ARRAYS, GRAPHICS PROCESSING UNITS, AND DEEP LEARNING/MACHINE LEARNING ACCELERATORS, INCLUDING FOR EDGE COMPUTING.

**APPROXIMATE COMPUTING:** PERFORMING BOUNDED APPROXIMATION INSTEAD OF EXACT CALCULATIONS FOR ERROR-TOLERANT TASKS (SUCH AS MULTIMEDIA PROCESSING, MACHINE LEARNING, AND SIGNAL PROCESSING), SIGNIFICANTLY INCREASING EFFICIENCY AND REDUCING ENERGY CONSUMPTION.
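As a minimal sketch of what "bounded approximation" can mean, the example below rounds the inputs of a dot product to a coarse grid and computes a worst-case error bound alongside the approximate result. The quantization scheme, the step size, and all function names are assumptions chosen for illustration; the slides do not prescribe a particular technique.

```python
def quantize(values, step):
    """Round each value to the nearest multiple of `step` (error <= step / 2)."""
    return [round(v / step) * step for v in values]

def dot(a, b):
    """Exact dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def error_bound(a, b, step):
    """Worst-case |exact - approximate| when both inputs are quantized.

    With per-element error e = step / 2:
    |a'b' - ab| <= |a| * e + |b| * e + e * e for each pair.
    """
    e = step / 2
    return sum(abs(x) * e + abs(y) * e + e * e for x, y in zip(a, b))

# Hypothetical inputs and grid size.
a = [0.9, -1.3, 2.2]
b = [0.4, 0.7, -0.1]
step = 0.25

exact = dot(a, b)
approx = dot(quantize(a, step), quantize(b, step))
print(abs(exact - approx) <= error_bound(a, b, step))  # True
```

The point of the bound is that an error-tolerant task can decide ahead of time how coarse a grid (and therefore how little hardware precision) it can accept.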

# COMPONENT TECHNOLOGY VECTORS AND TIMELINES

### 1 TO 4 YEARS

- Neuromorphic
- Photonics
- Advanced and Quantum Sensors
- CMOS Sub 7nm and 3D structures
- Magnetic Flash and DRAM Memories
- 3D Wafer Stacking
- 5G wireless technologies

### 5 TO 7 YEARS

- Magnetic SRAM
- 3D Die-to-Wafer Stacking
- 3D Monolithic Fab
- Advanced non-volatile SRAM
- Carbon Nanotubes
- Phase Change Materials
- Biotech-to-electronic interfaces
- Superconducting Logic, Interconnects and Storage

### 7 TO 10+ YEARS

- 6G wireless technologies
- Quantum Computers
- DNA Storage

# WE HAVE MORE THAN ENOUGH TECHNOLOGIES

WE JUST HAVE TO PICK A FEW BIG PROBLEMS TO DRIVE THEM INTO COMMERCIALIZATION

## Data, Computation, and Electronics

### Wade Shen

DARPA Program Manager





- I2O = Information Innovation Office @ DARPA
- Data analysis and machine learning for national security:
  - Detecting ceasefire violations in Yemen
  - Finding human traffickers from their online ads
  - Machine learning that patches bugs in real-time
  - Tracking targets at the speed of a bullet
  - Machine learning that builds machine learning
- Why do we need better compute capabilities?



 Data: Publicly available social media + seismic activity data from WWSSN

**1.** Anomaly detection finds events via social media and seismic data



**2.** Image understanding helps characterize the event



- asphalt: 0.76836807
- flooring: 0.65001416
- rubble: 0.62622625
- construction: 0.61434084
- vehicle: 0.85797721
- sahara: 0.56392485
- military vehicle: 0.56342363

WWSSN – World-Wide Standardized Seismograph Network



#### Ads and reviews posted online



Author networks help discover latent trafficking rings



- 1 in 6 missing persons become sex trafficking victims [National Center for Missing & Exploited Children (NCMEC)]
- MMPW continuously monitors online prostitution ads for missing persons
  - Compares ad photos vs. missing person photos
  - Alerts when missing person emerges in online advert
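The compare-and-alert loop described above can be sketched as follows. This is an assumption-laden illustration, not the MMPW implementation: it supposes that each face has already been reduced to a numeric embedding vector by some upstream model, and the threshold value, identifiers, and function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def find_alerts(ad_faces, missing_faces, threshold=0.9):
    """Return (ad_id, person_id, score) for every pair at or above `threshold`."""
    alerts = []
    for ad_id, ad_vec in ad_faces.items():
        for person_id, person_vec in missing_faces.items():
            score = cosine_similarity(ad_vec, person_vec)
            if score >= threshold:
                alerts.append((ad_id, person_id, score))
    return alerts

# Toy two-dimensional embeddings; real embeddings would be much longer.
ads = {"ad-1": [1.0, 0.0], "ad-2": [0.0, 1.0]}
missing = {"person-1": [0.9, 0.1]}
for ad_id, person_id, score in find_alerts(ads, missing):
    print(ad_id, person_id, round(score, 3))
```

The nested loop compares every ad face against every missing-person face, so the work grows with the product of the two collections. That scaling is one reason continuous monitoring is a compute problem.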



#### NCMEC Photo

MEMEX Ad

### MMPW automatically discovered 4 missing persons searching 17M faces/day



### Prevalence estimation for sex trafficking







Image source: https://www.youtube.com/watch?v=v5ghK6yUJv4





Source youtube: https://www.youtube.com/watch?v=YoOaJclkSZg



## D<sup>3</sup>M: Data-Driven Discovery of Models



- 538 election model
- NCAR arctic sea ice model
- N7 IED explosion predictor
- Manual process: 10-1000s of person-years
- Teams of experts required to develop the model
- Automatically select problem-specific model primitives
  - Extend the library of modeling primitives
- Automatically compose complex models from primitives
- Facilitate user interaction with composed models
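The idea of selecting primitives and composing them automatically can be sketched with a toy search. Everything here is hypothetical: the two-entry primitive libraries, the exhaustive search, and the function names are illustrative assumptions and do not describe how D3M systems are actually built.

```python
from itertools import product

# Hypothetical library of feature-transform primitives.
TRANSFORMS = {"identity": lambda x: x, "square": lambda x: x * x}

def fit_mean(xs, ys):
    """Model primitive: always predict the mean of the training targets."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def fit_line(xs, ys):
    """Model primitive: one-variable least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

MODELS = {"mean": fit_mean, "line": fit_line}

def search(train, valid):
    """Try every transform + model composition; return the best (error, transform, model)."""
    results = []
    for (t_name, t), (m_name, fit) in product(TRANSFORMS.items(), MODELS.items()):
        predict = fit([t(x) for x, _ in train], [y for _, y in train])
        error = sum((predict(t(x)) - y) ** 2 for x, y in valid) / len(valid)
        results.append((error, t_name, m_name))
    return min(results)

# Toy data generated from y = 3x^2 + 1.
train = [(x, 3 * x * x + 1) for x in (-2, -1, 0, 1, 2)]
valid = [(x, 3 * x * x + 1) for x in (-3, 0.5, 3)]
print(search(train, valid)[1:])  # ('square', 'line')
```

Extending the library, as the nested bullet suggests, amounts to adding entries to `TRANSFORMS` or `MODELS`. The search space grows multiplicatively, which ties this slide back to the compute question.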



- Prior state of the art: Google/Microsoft DNN



DNN – Deep neural network



## The compute problem

It takes this...



... to protect this





Required 7,000+ compute hours to beat humans

The scope that tracks this...



has 30 minutes of battery life



### Case study: deep neural networks









### Data are vectors and matrices





## Machine learning is projection
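One way to read this slide title, sketched under assumptions: a single neural-network layer multiplies an input vector by a weight matrix, which projects the data onto a set of learned directions, and then applies a simple nonlinearity. The weights, the input, and the function names below are hypothetical.

```python
def project(weights, x):
    """Multiply the matrix `weights` (one row per output) by the vector `x`."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def relu(v):
    """Keep positive components, zero out the rest."""
    return [max(0.0, a) for a in v]

# Hypothetical 2x3 weight matrix: maps a 3-D input onto 2 directions.
W = [[0.5, -1.0, 0.0],
     [1.0, 1.0, 1.0]]
x = [2.0, 1.0, 3.0]
print(relu(project(W, x)))  # [0.0, 6.0]
```

Each output value costs one multiply and one add per weight, so layer cost scales with the size of the matrix.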





## Compute enables machine learning; partially











## Electronics Resurgence Initiative: Materials and Integration Thrust

Daniel S. Green

DARPA Program Manager

18 July 2017



## Motivating Materials for Beyond Moore's Law Scaling

A compute problem





What is a transistor: The World of Modern Electrons; Sam Sattel



Applied Physics: Feb 2012; Experimental realization of superconducting quantum interference devices with topological insulator junctions. M. Veldhorst et al.

### ...and continue to present opportunities





...and allowed a faster, flexible mix of materials







Accelerating Materials Discovery



DARPA Grant: Metalorganic Chemical Vapour Deposition (MOCVD) Growth Process ~1989



DARPA Wide Bandgap Semiconductor – Radio Frequency (WBGS-RF) Program, 2000s

Commercial Material: Gallium Nitride (GaN), 1990s







Sources: DARPA, HRL, Solid State Technology





Sources: HRL, Solid State Technology



- Focus on enabling Beyond Moore's Law Scaling
  - Not an RF component initiative
  - Not a Moore's Law Scaling Initiative
- Big Question:
  - Can we develop processes to integrate (and identify) new materials quickly?

**Accelerating Materials Discovery Panel**

| Panelist | Affiliation |
|----------|-------------|
| Stephen Bedell | IBM T. J. Watson Research Center |
| Joy Watanabe | Intermolecular, Inc. |
| Michael Kozicki | Arizona State |
| Subu Iyer | UCLA |
| Joseph Geddes | Photia Incorporated |



**Emerging Materials and Devices** 









| Technology | MPW0 | MPW1 | MPW2 | MPW3 | Future MPWs |
|------------|------|------|------|------|-------------|
| CMOS | IBM 65 nm | GF 45 nm | GF 45 nm | GF 45 nm | GF 45 nm |
| InP HBT | TF4 (2 metals) | TF4 (3 metals)<br>TF5 (3 metals) | TF4 (4 metals)<br>TF5 (4 metals) | TF4 (4 metals)<br>TF5 (4 metals) | TF4 (4 metals)<br>TF5 (4 metals) |
| InP Varactor Diode | | | | | AD1 |
| GaN HEMT | GaN20<br>T3 (HRL) | GaN20<br>T3 (HRL) | GaN20<br>T3 (HRL) | GaN20<br>T3 (HRL) | GaN20<br>T3 (HRL) |
| GaAs HEMT | | | | P3K6 | P3K6 |
| Passive Components | | PolyStrata (Nuvotronics) | PolyStrata (Nuvotronics) | PolyStrata (Nuvotronics) | PolyStrata (Nuvotronics) |
| Base Substrate | CMOS | CMOS | CMOS | CMOS<br>SiC Interposer (IWP5) | CMOS<br>SiC Interposer (IWP5) |
| High-Q Polydown<br>Pasion<br>Cispat<br>CMOS | | | In test | In fab | |

Sources: DARPA, Northrop Grumman



### Unconventional Processing of Signals for Intelligent Data Exploitation (UPSIDE) Program



Video surveillance collection and analysis significantly exceed current embedded computing capability

#### Today: Digital Signal Processing

- Current approaches require compute-intensive, exact, sequential operations over all pixels to detect features, objects and tracks.
- Large images require Tera-Ops/sec



**Detected Salient Pixels** 

#### **Unconventional Analog Processing**

- UPSIDE replaces compute-intensive, exact Boolean operations with probabilistic best-match operations for significantly better power efficiency
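The contrast between exact matching and best matching can be illustrated in software, with the caveat that UPSIDE targets analog hardware and the slide does not specify an algorithm. The bit-pattern templates, the similarity measure, and the function names below are assumptions made for this sketch.

```python
def exact_match(patch, templates):
    """Boolean-style lookup: only a bit-for-bit identical template matches."""
    for name, template in templates.items():
        if template == patch:
            return name
    return None

def best_match(patch, templates):
    """Return the closest template and the fraction of positions that agree."""
    def similarity(template):
        same = sum(1 for a, b in zip(patch, template) if a == b)
        return same / len(patch)
    name = max(templates, key=lambda n: similarity(templates[n]))
    return name, similarity(templates[name])

# Hypothetical 4-pixel templates and a noisy observation of "edge".
templates = {"edge": [1, 1, 0, 0], "flat": [0, 0, 0, 0]}
noisy_patch = [1, 1, 0, 1]

print(exact_match(noisy_patch, templates))  # None
print(best_match(noisy_patch, templates))   # ('edge', 0.75)
```

A single flipped pixel defeats the exact lookup, while the best-match version still returns a usable answer together with a confidence-like score.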





- Focus on enabling Beyond Moore's Law Scaling
  - Not a conventional logic / memory device initiative
- Big Question:
  - What are the NEW materials or devices (and their functions) that should be added to the toolbox?

**Emerging Materials and Devices Panel**

| Panelist | Affiliation |
|----------|-------------|
| Jian-Ping Wang | University of Minnesota |
| Sayeef Salahuddin | UC Berkeley/EECS |
| Arijit Raychowdhury | Georgia Tech |
| Vladimir Stojanovic | University of California, Berkeley |
| Noah Sturcken | Ferric, Inc. |



**Integrated Processes** 




### "Page 3" materials and integration






GP - General Purpose






Metal Oxide Semiconductor

Memory

CNN – Convolutional Neural Network

- Most of the problem is memory bandwidth and latency
- Even 2D CMOS ML accelerators aren't addressing the memory problem
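A back-of-the-envelope estimate shows why memory can dominate. The sketch below compares the time to do the arithmetic of one matrix-vector product with the time to move its data, in the style of a roofline model. The accelerator figures (10 TFLOP/s peak, 100 GB/s memory bandwidth) and the layer size are assumed for illustration and are not taken from the slides or from the cited simulation data.

```python
def matvec_roofline(rows, cols, peak_flops, bandwidth, bytes_per_value=4):
    """Return (compute_seconds, memory_seconds) for one matrix-vector product."""
    flops = 2 * rows * cols                                   # multiply + add per weight
    traffic = bytes_per_value * (rows * cols + rows + cols)   # weights + input + output
    return flops / peak_flops, traffic / bandwidth

# Assumed hardware: 1e13 FLOP/s peak compute, 1e11 bytes/s memory bandwidth.
compute_s, memory_s = matvec_roofline(4096, 4096, peak_flops=1e13, bandwidth=1e11)
print(f"compute: {compute_s:.2e} s, memory: {memory_s:.2e} s")
print("memory-bound" if memory_s > compute_s else "compute-bound")
```

Under these assumptions, moving the weights takes roughly 200 times longer than the arithmetic, so adding more arithmetic units alone would not speed up this layer.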



Simulation data from S. Mitra at Stanford



### Potential solutions

"Bring memory in the compute" Monolithic 3D SoC



Initial simulations

- Up to 1000X improvement in Energy\*time for memory-intensive applications at a common node
- Up to 100X improvement in Energy\*time when comparing 3D SoC @ 90nm with 2D at 7nm
- Less cost per area than 2D 14nm fabrication with up to 4GB of on-chip memory storage
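The Energy\*time figure of merit used above multiplies the energy a task consumes by the time it takes, so a design can improve it by saving energy, by saving time, or both. The sketch below shows the arithmetic; the (energy, time) pairs are hypothetical round numbers, not simulation results.

```python
def edp(energy_joules, time_seconds):
    """Energy-delay product: lower is better."""
    return energy_joules * time_seconds

def edp_improvement(baseline, candidate):
    """How many times lower the candidate's Energy*time is than the baseline's."""
    return edp(*baseline) / edp(*candidate)

# Hypothetical (energy in J, time in s) pairs chosen only for round numbers.
design_2d = (200.0, 50.0)
design_3d = (10.0, 1.0)   # 20x less energy and 50x less time
print(edp_improvement(design_2d, design_3d))  # 1000.0
```

Because the two factors multiply, a 1000X gain in the product does not require a 1000X gain in either energy or time alone.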

Critical needs

- Low-temperature logic device fabrication (< 450 °C)
- Low-temperature, dense NVM cell fabrication (< 450 °C)

"Bring the compute in memory" DNN Dot Product calculation



Initial simulations

Initial simulation shows strong improvement to Energy\*time for DNN core computation

Critical needs

- Full system simulations
- Optimal memory unit cell
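One common way to compute a DNN dot product inside a memory array, assumed here because the slide does not name a mechanism, is a resistive crossbar: weights are stored as cell conductances, inputs are applied as row voltages, and each column wire sums the resulting currents. The sketch below models only that ideal behavior; the values and the function name are hypothetical.

```python
def crossbar_currents(conductances, voltages):
    """Ideal column currents of a crossbar: I[j] = sum over i of G[i][j] * V[i]."""
    columns = len(conductances[0])
    return [sum(row[j] * v for row, v in zip(conductances, voltages))
            for j in range(columns)]

# Hypothetical 2x2 array: conductances in siemens, voltages in volts.
G = [[1.0, 2.0],
     [3.0, 4.0]]
V = [1.0, 0.5]
print(crossbar_currents(G, V))  # [2.5, 4.0]
```

In this idealized picture the multiply-accumulate happens where the weights are stored, so the weights never cross a memory bus. Noise, wire resistance, and converter costs are ignored here, which is part of why the slide lists full system simulations as a critical need.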



- Focus on enabling Beyond Moore's Law Scaling
  - Not just the 3DIC challenge with conventional architectures
  - Seek to overcome the memory bottleneck
- Big Question:
  - Can we use integrated processes to realize new architectures unavailable today?

**Integrated Processes Panel**

| Panelist | Affiliation |
|----------|-------------|
| Max Shulaker | MIT |
| Bruce Taol | Micron |
| Wilfried Haensch | IBM T. J. Watson Research Center |
| Qiangfei Xia | UMass Amherst |
| Zvi Or-Bach | MonolithIC 3D, Inc. |



### Electronics Resurgence Initiative: Architectures

#### **Tom Rondeau**

DARPA Program Manager





# Previous project lead for GNU Radio






### Streaming data across multiple processing elements





- Processor design trades
  - Math/logic resources
  - Memory (cache vs. register vs. shared)
  - Address computation
  - Data access and flow
- Processor choice depends on:
  - Memory requirements (small vs. large) x (random vs. linear)
  - Computation requirements



The problem: Can we find optimal hardware configuration across algorithms?
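The question can be framed as a small optimization problem. In the sketch below, the runtime table is entirely made up, and the hardware and algorithm names are placeholders. It compares the single configuration that is best on aggregate with the configuration that is best for each algorithm individually.

```python
# Hypothetical runtimes in milliseconds; none of these numbers come from the slides.
RUNTIME_MS = {
    "cpu":  {"fft": 8.0, "sparse_graph": 5.0, "convolution": 40.0},
    "gpu":  {"fft": 2.0, "sparse_graph": 20.0, "convolution": 4.0},
    "fpga": {"fft": 3.0, "sparse_graph": 9.0, "convolution": 6.0},
}

def best_single_config(runtimes):
    """Hardware with the lowest total runtime across all algorithms."""
    return min(runtimes, key=lambda hw: sum(runtimes[hw].values()))

def best_per_algorithm(runtimes):
    """Fastest hardware for each algorithm taken on its own."""
    algorithms = next(iter(runtimes.values())).keys()
    return {alg: min(runtimes, key=lambda hw: runtimes[hw][alg])
            for alg in algorithms}

print(best_single_config(RUNTIME_MS))   # fpga
print(best_per_algorithm(RUNTIME_MS))
```

With these made-up numbers the best overall choice is not the fastest option for any individual algorithm, which is the tension between specialization and flexibility in miniature.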





## Managing specialization & flexibility



#### **Specialization**

- Performance has come at the cost of usability
- Difficulty in programming and system integration

### **Flexibility**

- Productivity has come at the cost of compute efficiency
- Abstraction tends to ignore the underlying hardware





### It's not just the processor



- GPUs do better at computing convolutions (dense matrix multiplies)
- Cost of data transfer means sometimes the CPU is more efficient
- Resource optimization for multiple applications
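The trade-off between faster arithmetic and the cost of moving data can be captured in a simple cost model. The throughput and link-bandwidth figures below are assumed values for a hypothetical system, and the model ignores overlap between transfer and compute.

```python
def choose_device(flops, bytes_moved, cpu_flops, gpu_flops, link_bandwidth):
    """Pick the device with the lower estimated time, counting transfer for the GPU."""
    cpu_time = flops / cpu_flops
    gpu_time = bytes_moved / link_bandwidth + flops / gpu_flops
    return ("gpu", gpu_time) if gpu_time < cpu_time else ("cpu", cpu_time)

# Assumed system: 1e11 FLOP/s CPU, 1e13 FLOP/s GPU, 1e10 bytes/s host-to-GPU link.
system = dict(cpu_flops=1e11, gpu_flops=1e13, link_bandwidth=1e10)

small_job = choose_device(flops=1e8, bytes_moved=1e8, **system)
large_job = choose_device(flops=1e12, bytes_moved=1e8, **system)
print(small_job[0])  # cpu
print(large_job[0])  # gpu
```

Both jobs move the same amount of data. Only the job with enough arithmetic per byte amortizes the transfer, which is why the right processor depends on the workload as well as on peak throughput.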



### Today's model

Single Processor: Significant prior work

- High-level languages, compilers, libraries, tools

System of Processors: Basic tools but significant difficulties

- Middleware, busses/networking, data management

# **Opportunities**

- Full understanding of the processing elements
- Performance monitoring and online updates
- Managing data movement (memory, I/O)
- Better representations of the problems
- Faster time to integration



Build new compute engines and processors that solve the significant computing needs of today's and tomorrow's applications.



But a chip that can't be used, integrated, and programmed is called sand

This list of processors suggests that solutions exist. So why are we here?

### Parallel Processors

| Adapteva<br>Analog Devices-BlackFin<br>Altair<br>Altera<br>Ambric<br>AMD-APU<br>ARM-MP/Neon<br>ARM-Mali<br>Asocs<br>Aspex<br>AxisSemi<br>BOPS<br>Boston Circuits<br>Brightscale<br>Calxeda<br>Cavium<br>CEVA<br>Chameleon<br>Cognimem | Cognivue<br>Cognovo<br>Coherent Logix<br>CoreSonic<br>CPUTech<br>Cradle<br>Cswitch<br>DesignArt<br>ElementCXI<br>EZChip<br>Freescale<br>Greenarrays<br>HP<br>IBM-Cell<br>IBM-Cyclopse<br>Icera-PowerVR<br>Imagination-PowerVR<br>Imec<br>Inmos-Transputer<br>Intel-Larrabee | Paneve<br>Picochip<br>Quicksilver | Rapport<br>Raytheon-Monarch<br>Recore<br>Sandbridge<br>SiByte<br>SiCortex<br>Silicon Hive<br>Silicon Spice<br>Singular Computing<br>Sound Design<br>SpiralGateway<br>Stream Processors<br>Stretch<br>Tabula<br>Thinking Machines<br>TI<br>Tilera<br>TOPS<br>Venray<br>Xilinx | XMOS<br>Ziilabs |
|---|---|---|---|---|

http://www.adapteva.com/andreas-blog/the-siren-song-of-parallel-computing/



### Benefits of a rich development ecosystem



https://www.gitbook.com/book/tra38/essential-copying-and-pasting-from-stack-overflow/details



### Managing specialization & flexibility

- Are flexibility and specialization inherently opposite?
  - Eat your cake and have it, too
- New approaches to processor/SoC designs that change how we specialize?
  - Potential new accelerators and flexible processors that change to meet data needs?

### Building a Development Ecosystem

- How do we understand processing needs/capabilities?
  - Cataloged by the math (e.g., dense vs. sparse)?
- Are there better tools to manage the system of processors?
  - Intelligent agents, smart compilers, others?



https://en.wikipedia.org/wiki/Architec



https://commons.wikimedia.org/wiki/File:Chocolate\_Fondant.jpg

