

### Opportunities and Challenges for Photonics in Next Generation Data Centers

### **Clint Schow**

Department of Electrical and Computer Engineering, The University of California, Santa Barbara

schow@ece.ucsb.edu



## Outline

- Background: Interconnects and Technologies
- Optics In Today's Data Centers
  - Point-to-point interconnects
  - Multiple technologies for multiple purposes
- Opportunities
  - Photonic I/O
  - Photonic routing and switching
- Path Forward: Large-Scale Electronic/Photonic Integration
- Closing Thoughts



### Historically Two Fiber Optics Camps: Datacom and Telecom



October 22, 2015

### Technologies for Data Center Interconnects

**VCSELs** (Vertical Cavity Surface Emitting Lasers): Multi-mode fiber (MMF), Polymer Waveguides, Multicore fiber









### **Si Photonics**

Single-mode fiber (SMF), Wavelength Division Multiplexing (WDM)




C. Schow, UCSB



## The Old Days: 2012



## Trickle-Down: HPC drives development of highest performance components that are later picked up by commercial servers

### The Promise of A Machine with 2,000,000 VCSELs



IBM, public presentation 2010

• A. Benner, "Optical Interconnect Opportunities in Supercomputers and High End Computing," OFC Tutorial 2012.

## Fiber Backplane Density Pushed to the Limit

#### IBM Power 775: Pushing the Limits



#### Need more BW/fiber: WDM, multicore fiber


IBM, public presentation 2010


A. Benner, "Optical Interconnect Opportunities in Supercomputers and High End Computing," OFC Tutorial 2012.

## Optics in HPC: IBM Sequoia

96 IBM Blue Gene/Q Racks

#### 20.013 Pflops Peak ... 1.572M Compute Cores ... ~8MW ... 2026 Mflops/Watt



• HPC requires technologies optimized for short reach ~50m



### VCSELs: Efficient and Fast



• J. E. Proesel et al., "35-Gb/s VCSEL-based optical link using 32-nm SOI CMOS circuits," OFC 2013, Paper OM2H2, Mar. 2013.

### Rethinking Equalization: Optimizing the Performance of Complete Links



#### Next opportunity: Si photonic WDM links

- A. V. Rylyakov et al., "Transmitter Pre-Distortion for Simultaneous Improvements in Bit-Rate, Sensitivity, Jitter, and Power Efficiency in 20 Gb/s CMOS-driven VCSEL Links," JLT 2012.

- A. V. Rylyakov et al., "A 40-Gb/s, 850-nm VCSEL-based full optical link," OFC 2012.
- D. Kuchta et al., "A 71-Gb/s NRZ Modulated 850-nm VCSEL-Based Optical Link," PTL, March 2015.

## And Then The Cloud Rolled In



View of Goleta Beach from UCSB, source www.theinertia.com



### Growth in Cloud Data Centers

Figure 3. Total Data Center Traffic Growth



#### Cloud is the growth market

Source: Cisco Global Cloud Index, 2013-2018

#### Figure 2. Global Data Center Traffic by Destination



Source: Cisco Global Cloud Index, 2013-2018

#### 75% of traffic is within the data center

## Huge Growth in the Cloud (Microsoft)



Source: D. Maltz, Microsoft, OFC 2014

## Traditional Data Center Network Hierarchy

Traditional hierarchical networks grew out of campus/WAN installations

Good for North-South traffic



See L. A. Barroso and U. Hölzle, "The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines."



### Data Center Hardware



Photos of Facebook data centers found on Google images

### Cloud Data Centers (Microsoft)




## Largest Data Centers, 2015

Source: Computer Business Review (cbronline.com)

**1. Range International Information Group** (Langfang, China) Area: 6,300,000 Sq. Ft.

**2. Switch SuperNAP** (Nevada, USA) Area: 3,500,000 Sq. Ft.

**3. DuPont Fabros Technology** (Virginia, USA) Area: 1,600,000 Sq. Ft.

**4. Utah Data Centre** (Utah, USA) Area: 1,500,000 Sq. Ft.

**5. Microsoft Data Centre** (Iowa, USA) Area: 1,200,000 Sq. Ft.

**6. Lakeside Technology Centre** (Chicago, USA) Area: 1,100,000 Sq. Ft.

**7. Tulip Data Centre** (Bangalore, India) Area: 1,000,000 Sq. Ft.

**8. QTS Metro Data Centre** (Atlanta, USA) Area: 990,000 Sq. Ft.

**9. Next Generation Data Europe** (Wales, UK) Area: 750,000 Sq. Ft.

**10. NAP of the Americas** (Miami, USA) Area: 750,000 Sq. Ft.






#### Huge facilities need lots of longer-distance links: 2km becoming the magic number for data centers

## Toward More Connected Data Centers

| Ports per switch | Nodes connected (two level) | Nodes connected (three level) | Nodes connected (four level) | Nodes connected (five level) |
|------------------|-----------------------------|-------------------------------|------------------------------|------------------------------|
| 8                  | 32                  | 128         | 512        | 2048       |
| 16                 | 128                 | 1024        | 8192       | 65536      |
| 32                 | 512                 | 8192        | 131072     | 2.10E+06   |
| 64                 | 2048                | 65536       | 2.10E+06   | 6.71E+07   |
| 128                | 8192                | 524288      | 3.36E+07   | 2.15E+09   |
| 256                | 32768               | 4.19E+06    | 5.37E+08   | 6.87E+10   |
| 512                | 131072              | 3.36E+07    | 8.59E+09   | 2.20E+12   |
| 1024               | 524288              | 2.68E+08    | 1.37E+11   | 7.04E+13   |
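The tabulated values are consistent with the usual folded-Clos (fat-tree) scaling rule, in which a network of `L` levels built from `k`-port switches connects `2 * (k/2)**L` nodes. The slide does not state this formula; it is an assumption that happens to reproduce every entry above. A minimal sketch (the function name `nodes_connected` is illustrative):

```python
def nodes_connected(ports_per_switch: int, levels: int) -> int:
    """Nodes reachable in an L-level folded-Clos network of k-port switches.

    Assumed formula (not stated on the slide): N = 2 * (k / 2) ** L.
    """
    if ports_per_switch < 2 or ports_per_switch % 2 or levels < 1:
        raise ValueError("expected an even port count >= 2 and at least one level")
    return 2 * (ports_per_switch // 2) ** levels


# Rebuild the table rows for the port counts listed on the slide.
for k in (8, 16, 32, 64, 128, 256, 512, 1024):
    row = [nodes_connected(k, levels) for levels in (2, 3, 4, 5)]
    print(k, row)
```

Under this rule, doubling the radix multiplies the node count by `2**L`, which is why higher-radix switches allow the same number of nodes to be reached with fewer levels.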



### 1) Flatten the Network:

Higher radix (larger port count) switches, electrical or optical cores

### 2) Change the Network:

Photonic switching



## Photonic I/O: Optics to the Chip

### Shared Visions: Photonically Connected Chips

#### **Today: Electrical Chip Packaging**



- IC packages with coarse BGA/LGA electrical connectors
- Poor scalability, signal integrity
- Reduced system performance
- Reduced system efficiency



#### **Future: Photonic Packages**



#### Photonic integration must provide more I/O bandwidth at better efficiency

## Limitations of Electrical Switches

### Mellanox

| Ordering Part Number | Description                                           | Typical Power |
|----------------------|-------------------------------------------------------|---------------|
| MT52236A0-FDCR-E     | Switch-IB, 36 Port EDR InfiniBand Switch IC           | 83W           |
| MT52132A0-FCCR-C     | Spectrum, 32 Port Ethernet 100GbE Switch IC (RoHS R6) | 135W          |

Source: mellanox.com

### Broadcom



Source: broadcom.com

### Tomahawk

128 x 25Gb/s SerDes = 32 100GbE ports

7 Billion transistors

## Two primary limitations

- Power (mostly electrical I/O)
- Density
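To put the power limitation in per-bit terms, the typical-power figures quoted for the two Mellanox switch ICs can be divided by their aggregate port bandwidth. This is a rough sketch under stated assumptions: throughput is counted once per port (one direction), an EDR InfiniBand port is taken as 100 Gb/s, and the whole chip power is charged to the traffic even though only part of it is I/O. The function name is illustrative.

```python
def energy_per_bit_pj(power_w: float, ports: int, gbps_per_port: float) -> float:
    """Chip power divided by aggregate one-direction throughput, in pJ/bit."""
    throughput_bps = ports * gbps_per_port * 1e9
    return power_w / throughput_bps * 1e12


# Figures from the table above; 100 Gb/s per EDR port is an assumption.
switch_ib = energy_per_bit_pj(power_w=83.0, ports=36, gbps_per_port=100.0)
spectrum = energy_per_bit_pj(power_w=135.0, ports=32, gbps_per_port=100.0)

print(f"Switch-IB: {switch_ib:.1f} pJ/bit")
print(f"Spectrum:  {spectrum:.1f} pJ/bit")
```

With these assumptions both parts land in the tens of pJ/bit, which is the scale that photonic I/O and switching would need to undercut.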

## Density Limits Imposed by Packages, Not ICs



### Higher BW Density: Demands Integration at (First-Level) Chip Package

#### **Cross-sectional view of chip module on board**



• D. Kam et al., "Is 25 Gb/s On-Board Signaling Viable?," IEEE Adv. Packag., 2009.



### Electrical I/O: Burning Power to Overcome Packaging Limitations

### Electrical Link Example: Backplane



Courtesy A. Rylyakov, OFC Short Course #357

## Integration to Maximize BW and Efficiency



# From Discrete Modules to Integrated Chip I/O (Luxtera)

### **High-Speed Optical Interconnect Evolution II**

| CONTEMPORARY - Today | EMERGING - 2014/15 | STRATEGIC DIRECTION - 2018+ |
|----------------------|--------------------|-----------------------------|
| <ul><li>Traditional MSA compliant pluggable modules and AOCs on card edge</li><li>Considerable SI issues (electrical connectors, long traces on host PCBA) require re-timers</li><li>Front panel interconnect density limited by module size (physical implementation + module power dissipation)</li></ul> | <ul><li>Embedded optical transceivers located closer around ASIC</li><li>Shorter traces on PCB alleviate SI issues</li><li>Optical fibers bring IOs to optical connectors on front panel</li><li>Front panel interconnect density limited by size of optical connectors</li><li>Very high reliability/quality required</li></ul> | <ul><li>Optical transceivers co-packaged w/ ASIC</li><li>Minimized electrical interconnect eliminates SI issues</li><li>Optical fibers bring IOs to optical connectors on front panel</li><li>Lowest system power dissipation</li><li>Highest front panel density and smallest potential system form factor</li><li>Very high reliability required</li></ul> |
| Diagram: switch ASIC, re-timer, optical module | Diagram: switch ASIC, optical module, fiber | Diagram: switch ASIC w/ photonics, fiber |



### Highly-Integrated Photonic I/O to Improve Power Efficiency (Luxtera)

### **Reducing system power dissipation by integration**



## Toward All-Photonic WDM I/O (Oracle)



### Hybrid Integration Platform for Large-Scale Photonic I/O



Courtesy of A. Krishnamoorthy, Oracle

## 32 nm SOI CMOS-Driven Link (Aurrion/IBM)

Current practice: Low integration level, inefficient 50 Ω interfaces



#### Maximum efficiency: directly drive the EAM



Aurrion's heterogeneous integration platform for photonics

**30 Gb/s** TX out



RX out



- Adds III-V functionality to Si Photonics
- 3 pJ/bit at 30 Gb/s (not including laser power)

- No measured penalty for 10 km transmission at 25 Gb/s

Integration of lasers, modulators, & detectors on the same wafer
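The quoted efficiency converts directly to link power, since 1 pJ/bit at 1 Gb/s is 1 mW. The sketch below applies that to the 3 pJ/bit, 30 Gb/s figure from this slide; laser power is excluded, as on the slide. The aggregate case (128 × 25 Gb/s of I/O running at the same 3 pJ/bit) is a hypothetical extrapolation for scale, not a measured result.

```python
def power_mw(energy_pj_per_bit: float, rate_gbps: float) -> float:
    """Link power in mW: 1 pJ/bit * 1 Gb/s = 1e-12 J/bit * 1e9 bit/s = 1 mW."""
    return energy_pj_per_bit * rate_gbps


# Single 30 Gb/s channel at 3 pJ/bit (figures from the slide, laser excluded).
per_channel_mw = power_mw(3.0, 30.0)

# Hypothetical: 128 x 25 Gb/s of chip I/O at the same energy per bit.
aggregate_mw = power_mw(3.0, 128 * 25.0)

print(f"per channel: {per_channel_mw:.0f} mW")
print(f"aggregate:   {aggregate_mw / 1000:.1f} W")
```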

• N. Dupuis et al., "30Gbps Optical Link Utilizing Heterogeneously Integrated III-V/Si Photonics and CMOS Circuits," OFC 2014 (post deadline).

• N. Dupuis et al., "30Gbps Optical Link Combining Heterogeneously Integrated III-V/Si Photonics with 32nm CMOS Circuits," JLT, 2015.

## Circuit/Photonics/Packaging Co-Design

### Electronic and photonic chip integration



#### Demonstrated in hardware





• B. G. Lee et al., "A WDM-Compatible 4 × 32-Gb/s CMOS-Driven Electro-Absorption Modulator Array," OFC 2015.

### Wall-Plug WDM links (Aurrion/IBM)







Low modulation power → directly driving EAM, no 50 Ω interfaces

• A. Ramaswamy *et al.*, "A WDM 4x28Gbps Integrated Silicon Photonic Transmitter driven by 32nm CMOS driver ICs," *OFC 2015* (post deadline).



## **Photonic Switching**



### What do we mean by "fast" optical switching?







#### ms-scale

Mice flows over packet switch, elephant flows over OCS

**Promise:**

- Lower cost/power, fewer cables, software control (SDN)

**Challenges:**

- Scalability of software scheduler
- Slow reconfiguration time may limit applications

#### μs-scale

OCS at first switch level, hybrid but much faster than 3D-MEMS

**Promise:**

- Reconfigure network at flow-level, based on workloads
- Hardware control (FPGA)

**Challenges:**

- Custom NICs
- Scalability: switch radix

#### ns-scale

All-optical switching, electronic buffering at end points

**Promise:**

- Switching times ~ packet durations
- More power-efficient than electrical networks of equal BW

**Challenges:**

- Switch hardware and fast synchronizing links
- Scalability: switch radix, losses, fast control plane, flow control




### Hybrid Networks

#### Hybrid Electrical/Optical Networks

Figure: hosts in each rack connect through top-of-rack switches to two networks. The electrical packet-switched network (EPS) uses transceivers and an electrical cross-connect. The optical circuit-switched network (OCS) uses an optical cross-connect and optical ports without transceivers, and its interfaces run at a higher rate than the EPS interfaces.

#### Circuit switching

- Decouples line rate from speed of control plane
- Used for persistent high-data-rate traffic, which must be scheduled

#### Packet switching

- Handles 'tail' of traffic demand
- Can correct for errors in the circuit schedule




Courtesy of Prof. G. Papen and G. Porter, UCSD



## Calient: 3-D MEMS OCS







#### Figure 4: Hybrid Packet-OCS Datacenter Network Architecture

"The Software Defined Hybrid Packet Optical Datacenter Network" whitepaper available at www.calient.net

#### High port count (320), low insertion loss, low crosstalk, <50ms reconfiguration



### Photonic Switches in Data Centers

### UCSD Hybrid Networking Research



Courtesy of Prof. G. Papen and G. Porter, UCSD



ToR photonic switches:

- Fast reconfiguration (dynamic traffic)
- High-radix
- Low cost

• N. Farrington et al., "Helios: a hybrid electrical/optical switch architecture for modular data centers," SIGCOMM, 2011.

• R. Aguinaldo et al., "Energy-efficient, digitally-driven "fat pipe" silicon photonic circuit switch in the UCSD MORDIA data-center network," CLEO 2014.

• H. Liu et al., "REACTOR: A reconfigurable packet and circuit ToR switch," Photonics Society Summer Topicals, 2013.

## Switch/Driver Integration (IBM)

Monolithically integrated switch + driver chip (IBM 90nm photonics-enabled CMOS)



#### Integrated digital switch drivers



#### Fast reconfiguration: 4 ns



#### **Broad spectral bandwidth:** Routing many wavelength channels

- <-20dB crosstalk over 60 nm BW
- 32 channels at 200GHz spacing
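As a consistency check on these two bullets, the optical band occupied by a channel grid can be converted from frequency to wavelength with the first-order relation Δλ ≈ λ²·Δf / c. The slide does not give the center wavelength, so both values tried below (1550 nm and 1310 nm) are assumptions, and the grid is counted as channels × spacing.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def wdm_span_nm(channels: int, spacing_ghz: float, center_nm: float) -> float:
    """Approximate wavelength span of a WDM grid: delta_lambda = lambda^2 * delta_f / c."""
    total_hz = channels * spacing_ghz * 1e9
    center_m = center_nm * 1e-9
    return center_m ** 2 * total_hz / C_M_PER_S * 1e9  # metres -> nanometres


# 32 channels at 200 GHz spacing occupy 6.4 THz.
for center_nm in (1550.0, 1310.0):  # assumed center wavelengths
    span = wdm_span_nm(32, 200.0, center_nm)
    print(f"center {center_nm:.0f} nm: span about {span:.1f} nm")
```

At either assumed center wavelength the 32-channel grid fits inside the 60 nm band over which the crosstalk figure is quoted.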



Losses are too high, and significant feedback and control are needed to manage crosstalk.

A high level of electronic/photonic integration is demanded.

- N. Dupuis et al., "Modeling and Characterization of a Non-Blocking 4 × 4 Mach-Zehnder Silicon Photonic Switch Fabric," JLT 2015.
- N. Dupuis et al., "Design and Fabrication of Low-Insertion-Loss and Low-Crosstalk Broadband 2 × 2 Mach–Zehnder Silicon Photonic Switches," JLT 2015.
- B. G. Lee et al., "Monolithic Silicon Integration of Scaled Photonic Switch Fabrics, CMOS Logic, and Device Driver Circuits," JLT 2014.
- B. G. Lee et al., "Silicon Photonic Switch Fabrics in Computer Communications Systems," JLT 2015.

## High Port Count MEMS Switches (UC Berkeley)





# Wavelength Routing: AWGR







Courtesy of Professor S. J. Ben Yoo, UC Davis

## Integrating Optical Gain for Scalability



### Hybrid Approach (IBM):



• R. A. Budd et al., "Semiconductor Optical Amplifier (SOA) Packaging for Scalable and Gain-Integrated Silicon Photonic Switching Platforms," ECTC 2015.

• L. Schares et al., "Etched-Facet Semiconductor Optical Amplifiers for Gain-Integrated Photonic Switch Fabrics," ECOC 2015.



### **Challenging Packaging**



• Precise alignment required for each SOA chip

R. A. Budd et al., "Semiconductor Optical Amplifier (SOA) Packaging for Scalable and Gain-Integrated Silicon Photonic Switching Platforms," ECTC 2015.




## New Electronic Capabilities Needed for Photonic Switching



- Optical circuits configured through a switch or fabric to connect TX<sub>i</sub> with RX<sub>j</sub>
- Switching time defines context of usage
- But, switching time is not all hardware reconfiguration
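The last point can be made concrete with a simple overhead model: each reconfiguration costs the switch settling time plus the time the receiver needs to lock onto the new burst, and only the remainder carries payload. The sketch below uses figures quoted elsewhere in this talk (4 ns switch reconfiguration, 31 ns burst-mode receiver lock, 25 Gb/s line rate). The per-packet reconfiguration model and the payload sizes are assumptions chosen for illustration.

```python
def link_efficiency(payload_bytes: int, line_rate_gbps: float,
                    switch_ns: float, lock_ns: float) -> float:
    """Fraction of time spent sending payload when every burst pays
    switch reconfiguration time plus receiver lock time."""
    payload_ns = payload_bytes * 8 / line_rate_gbps  # bits / (Gb/s) = ns
    return payload_ns / (payload_ns + switch_ns + lock_ns)


# Assumed payload sizes: a 1500-byte packet and a 64-byte packet.
for size in (1500, 64):
    eff = link_efficiency(size, line_rate_gbps=25.0, switch_ns=4.0, lock_ns=31.0)
    print(f"{size:5d} B payload: {eff:.1%} efficient")

# Same 1500-byte payload behind a 50 ms (3D-MEMS class) reconfiguration.
slow = link_efficiency(1500, line_rate_gbps=25.0, switch_ns=50e6, lock_ns=31.0)
print(f"50 ms switch: {slow:.4%} efficient")
```

In this model the receiver lock time, not the switch itself, dominates the overhead at nanosecond scale, while a millisecond-scale switch is only useful when circuits are held for far longer than a single packet.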



### Microsecond-scale



- e.g. Finisar: LCOS [OFC 2006, OTuF2]
- Hybrid networks
- Reconfigure at flow-level
- Hardware control (FPGA)
- Scalable (10's to 100 ports)



Courtesy of B. Lee, IBM



## Complete Burst-Mode Receiver (31ns lock, 4 pJ/bit)

#### Measured Dynamics of Burst-Mode Receiver



IBM

32nm SOI CMOS (1.3 mm × 0.9 mm)



Fast photonic routing and switching fabrics must have companion electronics

- Needs more research in the community
- Enables fine-grained power management

• A. V. Rylyakov et al., "A 25 Gb/s Burst-Mode Receiver for Low Latency Photonic Switch Networks," OFC 2015.

• A. V. Rylyakov et al., "A 25 Gb/s Burst-Mode Receiver for Low Latency Photonic Switch Networks," JSSC 2015.



# Path Forward: Large-Scale Electronic/Photonic Integration

## Si Photonics to Enable Next Gen Data Centers



**Si Photonics:** integrating photonic devices into Si platforms to leverage the huge Si electronics manufacturing infrastructure

- Very large scale integration
- Tight process control
- Wafer-level testing
- High-yield and low cost

### Advantages for the data center:

- Single-mode → multi-kilometer links
- Wavelength-division multiplexing (WDM) → high BW/fiber
- High integration level → many devices needed for switching and high-speed interconnects



### Need for Many Functions: Heterogeneous Integration of 6 Photonic Platforms

Figure panels (labels recovered): GaAs, LiNbO<sub>3</sub>, InP, SiN/SiON/SiO2

Heck et al., JSTQE 2013

Slide courtesy of Prof. J. Bowers, UCSB



### Large Scale Integration for Switching

### Large-scale photonic integration

- Switching
- Amplification
- Control



Hybrid III-V on Si SOAs Demonstrated in Multiple Platforms



1. H. Park et al., "A Hybrid AlGaInAs-Silicon Evanescent Amplifier," PTL 2007.

2. S. Keyvaninia et al., "A Highly Efficient Electrically Pumped Optical Amplifier Integrated on a SOI Waveguide Circuit," Group IV Photonics 2012.

3. G.-H. Duan et al., "New Advances on Heterogeneous Integration of III-V on Silicon," JLT 2015.

## Large Scale Integration for Photonic I/O

#### Photonics must deliver system advantages

- More I/O bandwidth at less power for processors
- Larger port switches to enable flatter networks
- Higher integration levels: processor, memory, network



### Hundreds of photonic interfaces: all off-module high-speed I/O

- High BW/interface → WDM
- Low cost → compatibility with high-volume electronics manufacturing
- Low loss → maximize link power efficiency by minimizing required laser power

### Thousands of electrical interfaces: connecting electronics to photonics

- High density, high-speed, low-power
- New functions: rapid synchronization for low latency switching, power management
- Specialized chip I/O co-designed with and only for photonics, no general purpose cores

Reliability: components either don't fail or can be spared (also for yield)

- Large-scale integration is fundamentally required
- Holistic design of photonics, electronics, packaging and assembly



### Potential: More highly connected systems

### **Photonic Switching**

• Routing tens of Tb/s at sub-pJ/bit power

### Photonic I/O

- Higher radix electrical switches, flatter networks
- More processor/memory bandwidth
- Multiple photonic technologies for multiple purposes
  - VCSEL, PSM, WDM point-to-point, WDM switched





### **Challenge: Integrating large-scale electronics with large-scale photonics**

### Re-thinking manufacturability and supply chain

- Moving from electronic to photonic-centric packaging
- Electronics/Photonics/Package co-design required
- What are the highest-value components and what assembly flow makes sense?
- Who does what?



# Thank You!

### schow@ece.ucsb.edu

### Acknowledgements

IBM Research Colleagues: C. Baks, R. Budd, F. Doany, N. Dupuis, D. Kuchta, B. Lee, J. Proesel, P. Pepeljugoski, A. Rylyakov, L. Schares, M. Taubenblatt, and many others...

### Slides:

M. Tan (HP), P. De Dobbelaere (Luxtera), D. Maltz (Microsoft),

M. Laor (Compass Networks), A. Krishnamoorthy (Oracle), G. Papen (UCSD),

M. Wu (UC Berkeley), S. J. B. Yoo (UC Davis), J. Bowers (UCSB)