

#### VLSI Technology Update

Dr. Douglas Sheldon, JPL Parts Engineering

Jet Propulsion Laboratory, California Institute of Technology. Copyright 2012 California Institute of Technology. Government sponsorship acknowledged.

#### Overview



- The relentless scaling of integrated circuits continues unabated.
- The space community remains many generations behind, held back by radiation and reliability requirements and by legacy architectures and implementations.
- Performance and cost pressures on new space missions will drive infusion of highly scaled technologies at an ever-increasing rate.
- Methodologies to evaluate assurance requirements for space applications must continue to evolve at the same pace in order to support projects.
- Test complexity has increased dramatically.
- Assurance for flight has become intimately interwoven with application and system specific requirements and performance.

#### Transistor dimensions continue to scale

- Improve *Performance*
- Reduce *Power*
- Reduce *Cost* per Transistor



M. Bohr, "Scaling Trends", ISSCC 2009

To be presented at the 3rd NASA Electronic Parts and Packaging (NEPP) Program Electronic Technology Workshop June 11-13, 2012, NASA GSFC, Greenbelt, MD.

# Space VLSI products remain generations behind state of the art

| Device    | >250nm            | 250nm        | 180/130nm      | 90nm    | 65nm |
|-----------|-------------------|--------------|----------------|---------|------|
| FPGA      |                   | RTSX         | RTAX/Virtex II | XQ4V    | V5QV |
| Processor | RAD6000<br>UT1750 | RAD750/UT699 |                |         |      |
| SRAM      |                   | UT8Q512      | UT8R512K8      |         |      |
| SDRAM     |                   |              |                | EDS5104 |      |
| ASIC      | UT0.6u/HX3000     | UT0.25       | UT130n/HX5000  | UT90n   |      |

#### JPL NEPP Tasks – VLSI Related



| Reliability      | Radiation    | Packaging                            |
|------------------|--------------|--------------------------------------|
| Flash based FPGA | <65nm FPGA   | Non Hermetic, Flip Chip, CGA packages |
| Flash memory     | Flash memory | Through Silicon Via                  |
| MRAM memory      | SOC devices  | Class Y Technology Demonstrators     |
| 45nm reliability |              | CGA/LGA/HDI package & board          |
| 65nm FPGA        |              | 3D Xray BGA/CGA Workmanship          |
| DDR2 reliability |              |                                      |

- Laying foundation for mission use of many processes/technologies that are much closer to state of the art.
- Packaging and reliability issues offer significant growth areas for qualification and validations for space applications.

#### **Historical Transistor Scaling**





#### Starting at the 90nm node, strained silicon processing is required. It provides increased drive current, making up for the lack of gate oxide scaling

Figure: transistor cross-sections.

- NMOS: SiN cap layer, tensile channel strain
- PMOS: SiGe source-drain, compressive channel strain
- 65 nm transistor: SiO<sub>2</sub> dielectric, polysilicon gate electrode
- 45 nm HK+MG: hafnium-based dielectric, metal gate electrode


M. Bohr, "Scaling Trends", ISSCC 2009

#### Leakage and repeatability become major technology concerns




T. Mak, "Is CMOS more reliable with scaling?" Intel 2008

#### Interconnect Scaling



- The transition to copper interconnect and to multiple types of inter-layer dielectric oxides has occurred.
- Markedly different failure mechanisms with new materials.
- Additional testing and screening requirements.





http://www.design-reuse.com/news/14648/chipworks-inside-xilinx-virtex-5-fpga-65nm-twins.html


### VLSI Technology Reliability Issues

- Time Dependent Dielectric Breakdown (TDDB)
- Hot Carrier Injection (HCI)
- Negative Bias Temperature Instability (NBTI)
- Electromigration (EM)
- Stress Migration (SM)
- ESD/Latchup
- These mechanisms are addressed at the wafer level in terms of acceleration and definition of failure.
- Results of the wafer-level tests are turned into design rules and models that allow for worst case design.
- Finished parts have almost no margin left for accelerated testing.
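
The "acceleration" used in wafer-level testing is commonly expressed as an acceleration factor between stress and use conditions. The following is a minimal sketch, assuming a simple Arrhenius temperature model; the activation energy and temperatures are illustrative placeholders, not values from this presentation, and real mechanisms such as TDDB or EM also depend on voltage or current density.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Acceleration factor of a stress temperature relative to a use temperature."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))


# Hypothetical example: Ea = 0.7 eV, 55 C use temperature, 125 C stress temperature
af = arrhenius_acceleration(0.7, 55.0, 125.0)
print(f"acceleration factor: {af:.1f}")
```

A small acceleration factor at the highest temperature a finished part tolerates is one way to see why part-level accelerated testing has so little margin.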

## How Scaling Impacts SRAM cells



Scaling trend:

- Increased gate leakage and degraded I<sub>ON</sub>/I<sub>OFF</sub> ratio
- Lower V<sub>DD</sub> during standby
- PMOS load devices must compensate for leakage

- SRAM use is ubiquitous in all modern devices
- Degradation and ageing effects in SRAM can dominate failure mechanisms in this broad spectrum of devices.
- Transistors within the SRAM cells are typically amongst the smallest within a technology
- SRAM Static Noise Margin (SNM) is highly sensitive to device mismatch.
- The scaling of SRAM memory arrays has increased the sensitivity to NBTI-induced transistor Vt mismatch, which can degrade Vmin characteristics over time.

bwrc.eecs.berkeley.edu/classes/icdesign/ee241.../Lecture9-SRAM.pdf

#### Failure types in SRAMs



- Hard fails are being reduced as processes have reduced defect densities
- Soft (voltage sensitive) failures increase due to reduced margins
  - Failure to write the cell
  - Signal margin fails which are driven primarily by slow bits that are otherwise stable and have write margin
  - Read stability fails for cells in the read or half-selected state
  - Retention fails.
- Manufacturers have to add new read and write assist circuit designs and innovative redundancy schemes to address performance issues.

#### **NBTI** Overview





- In the high state of the flip-flop (node BIT is high), T2 is under NBTI stress; in the other state (BIT low), T4 is stressed.
- NBTI related lifetime degradation is a strong function of application conditions.
- NBTI due to interface traps: the R-D model (diffusion of molecular H<sub>2</sub>) can explain many observed features.
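
One feature usually attributed to the R-D model with molecular H<sub>2</sub> diffusion is a power-law threshold shift, ΔVt ∝ t^n, with n close to 1/6. The sketch below assumes that form under constant DC stress and ignores recovery; the prefactor is a hypothetical fitting constant, not a measured value.

```python
def nbti_delta_vt(t_seconds, prefactor_mv=5.0, exponent=1.0 / 6.0):
    """Threshold-voltage shift in mV under constant stress: dVt = A * t**n."""
    return prefactor_mv * t_seconds ** exponent


def time_to_shift(limit_mv, prefactor_mv=5.0, exponent=1.0 / 6.0):
    """Stress time in seconds at which the shift reaches limit_mv (inverse of the power law)."""
    return (limit_mv / prefactor_mv) ** (1.0 / exponent)


one_year = 365.25 * 24 * 3600
for years in (1, 5, 10):
    print(years, "yr:", round(nbti_delta_vt(years * one_year), 1), "mV")
```

Because the exponent is small, most of the shift accumulates early, which is why the stress duty cycle of the application matters so much for lifetime.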



T. Pan and C. Liu, *Electrochemical and Solid-State Letters*, **8** (12), G348-G351, 2005

R. Wittman, "Impact of Random Bit Values on NBTI Lifetime of


#### How NBTI affects delay times



- Vendors have to improve design process:
  - Assume all PMOS devices will have NBTI
  - Model circuits with End of Life conditions
  - Adjust test bins to reflect worst case modeled degradation
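
The bin adjustment step can be illustrated with a small guard-band calculation. This is a minimal sketch under stated assumptions: the delay increase is a single modeled worst-case fraction, and the frequencies and bin values are hypothetical.

```python
def eol_fmax_mhz(fresh_fmax_mhz, delay_increase_fraction):
    """Maximum frequency after the critical-path delay grows by the given fraction."""
    return fresh_fmax_mhz / (1.0 + delay_increase_fraction)


def assign_bin(fresh_fmax_mhz, delay_increase_fraction, bins_mhz):
    """Fastest speed bin the part still meets at end of life, or None if it meets none."""
    eol = eol_fmax_mhz(fresh_fmax_mhz, delay_increase_fraction)
    passing = [b for b in bins_mhz if b <= eol]
    return max(passing) if passing else None


# Hypothetical part: 420 MHz when fresh, 8% modeled worst-case delay increase
print(assign_bin(420.0, 0.08, [300.0, 350.0, 400.0]))
```

In this example a part that would be binned at 400 MHz when fresh is binned one step lower once end-of-life degradation is included.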



- A 2005 Xilinx application note describes a decrease in the maximum operating frequency of the DCM (Digital Clock Manager) if it is held in a persistent static condition for an extended time.
- The effect is a possible loss of the ability to achieve lock at the maximum specified frequency.
- Variety of solutions:
  - Null design for extended time applications
  - New macros to support longer duration of input clock stopping or reset hold times
  - Unused DCMs are automatically configured into continuous calibration mode.
  - Final solution is a design fix

## Recent Commercial VLSI Failures

- January 2011 Intel Cougar Point SATA 3 Gb/s chipset (32nm CMOS)
- Intel's Steve Smith, vice president and director of PC client operations and enabling at Intel, says:
  - The specific problem occurs over time, and is affected by temperature and voltage.
  - More likely to manifest in configurations with lots of data being moved across the SATA 3 Gb/s ports
- Design issue that required a metal layer fix
- "In some cases, the Serial-ATA (SATA) ports within the chipsets may degrade over time, potentially impacting the performance or functionality of SATA-linked devices such as hard disk drives and DVD-drives."
- "Total cost to repair and replace affected materials and systems in the market is estimated to be *\$700 million*."
- Suspected cause: NBTI effect

#### Recent VLSI Parts Issues at JPL



- SRAM
  - Temperature dependent bit failures that were a function of READ timing and data pattern.
- SDRAM
  - Single Bit Errors (non-radiation related) that required extended rewrite to repair.
  - Identical SEFI Signature failure on several missions.
- Processor
  - Invalid instructions/data generation

#### Themes

- Operational/Application/Technology related.
- Not manufacturing defect/screening issue.
- Long term reliability appears adequate so far.



- The bit line sensing voltage level of Data "0" is lower than that of Data "1" so that Data "0" is more susceptible to possible ground related voltage noise.
- Data "0" can have a higher possibility of failure than Data "1".

### Display of a SEFI



- Actual flight SEFI
- Only the SEFI rectangle is shown
- Left and right side show regions with two different underlying patterns
- Only addresses with 2-bit errors are shown
- Graphical/statistical analysis tools are required to determine patterns
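
As an illustration of the kind of tooling this implies, the sketch below flags addresses whose readback differs from the written pattern in exactly two bits and then counts them per row. It is not the analysis used in the cited work; the word width, row mapping, and data are hypothetical.

```python
from collections import Counter


def bit_errors(expected, observed):
    """Number of differing bits between two words."""
    return bin(expected ^ observed).count("1")


def two_bit_error_addresses(expected_words, observed_words):
    """Addresses whose readback differs from the written pattern in exactly two bits."""
    return [addr for addr, (e, o) in enumerate(zip(expected_words, observed_words))
            if bit_errors(e, o) == 2]


def histogram_by_row(addresses, words_per_row):
    """Count flagged addresses per row to expose row-aligned structure."""
    return Counter(addr // words_per_row for addr in addresses)


# Hypothetical readback of an eight-word region written with 0xAAAA
expected = [0xAAAA] * 8
observed = [0xAAAA, 0xAAA9, 0xAAAA, 0xAAAB, 0xAAAA, 0x2AAB, 0xAAA9, 0xAAAA]
flagged = two_bit_error_addresses(expected, observed)
rows = histogram_by_row(flagged, words_per_row=4)
print(flagged, dict(rows))
```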



S. Guertin, "Analysis of SDRAM SEFIs", 2012



### Testing Scaled VLSI parts

- Single greatest "critical path" for proper VLSI part assurance.
- All parts contain embedded memory, sensitive clock circuitry, sophisticated repair and redundancy schemes.
- Capital and human investment is substantial (>>\$1M).
- Recreating this test capability at NASA would have limited cost/benefit.
  - Auditing and close technical relationship required with vendor
- What should NASA be concerned about?
  - Application specific degradation
  - Having the ability to run specific tests to validate reliability and use conditions

# Example - Degradation Evaluation of FPGA



- Stress => Extra Delay => Failure
- General requirements:
  - Generate test vectors
  - Generate clocks and associated timing control
  - Analyze response and ability to compare to standard




#### Delay Testing of Arbitrary Sub Circuit Blocks

- Technique to determine max operating frequency => worst case delay.
- Measure the output transition probability as a function of clock frequency; the onset of jitter-induced failures is used to infer delay.
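
The idea can be sketched with a small simulation. This is a minimal illustration under stated assumptions, not the measurement circuit itself: clock jitter is modeled as Gaussian, a capture succeeds when the jittered clock period exceeds the path delay, and the delay is taken as the period at which the success probability crosses 0.5. All numeric values are hypothetical.

```python
import random


def transition_probability(freq_mhz, delay_ns, jitter_ns, trials, rng):
    """Fraction of clock cycles in which the output transition is captured correctly."""
    period_ns = 1000.0 / freq_mhz
    ok = sum(1 for _ in range(trials)
             if period_ns + rng.gauss(0.0, jitter_ns) > delay_ns)
    return ok / trials


def infer_delay_ns(freqs_mhz, probs):
    """Interpolate the frequency where the probability crosses 0.5; return 1/f in ns."""
    for i in range(len(freqs_mhz) - 1):
        p0, p1 = probs[i], probs[i + 1]
        if p0 >= 0.5 > p1:
            f0, f1 = freqs_mhz[i], freqs_mhz[i + 1]
            f_half = f0 + (p0 - 0.5) * (f1 - f0) / (p0 - p1)
            return 1000.0 / f_half
    return None


rng = random.Random(1)  # fixed seed so the sweep is repeatable
true_delay_ns = 4.0     # hypothetical path delay
freqs = [200.0 + 2.0 * i for i in range(51)]  # sweep 200 to 300 MHz
probs = [transition_probability(f, true_delay_ns, 0.05, 2000, rng) for f in freqs]
estimate = infer_delay_ns(freqs, probs)
print("inferred delay (ns):", estimate)
```

Repeating the sweep before and after stress would show degradation as a shift in the inferred delay.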




#### Summary – Where to from here

- Space Vendors:
  - Pattern and timing sensitivity/interactions need to be tested *exhaustively* over temperature and voltage range, not merely sampled or optimized for test time/cost.
  - Vendors should maintain 'golden units' for customer support questions.
  - Vendors should invest in detailed reverse engineering to provide precise circuit information to user community
  - Often these parts are designed for end use systems that are different than space systems (noise, power, stability, etc.). Vendor application support needs to provide detailed technical briefs for practical customer implementation.
- What can NASA measure that is useful?
  - Statistical analysis of data sheet parameters of incoming samples.
  - Custom evaluation methodologies for reliability and radiation degradation
  - Spend lots of test time on the few parts on hand
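
A statistical look at an incoming sample can be as simple as comparing the measured distribution with the data sheet limits. The sketch below uses the process capability index Cpk as one possible summary; that choice, the parameter, the limits, and the measurements are all hypothetical.

```python
import statistics


def summarize(samples, lower_limit, upper_limit):
    """Mean, sample standard deviation, and Cpk of a parameter against data sheet limits."""
    mean = statistics.mean(samples)
    sigma = statistics.stdev(samples)
    cpk = min(upper_limit - mean, mean - lower_limit) / (3.0 * sigma)
    return mean, sigma, cpk


# Hypothetical access-time measurements in ns, with hypothetical limits of 8 to 12 ns
access_ns = [9.8, 10.1, 10.0, 9.9, 10.3, 10.2, 9.7, 10.0]
mean, sigma, cpk = summarize(access_ns, 8.0, 12.0)
print(f"mean={mean:.2f} ns sigma={sigma:.3f} ns Cpk={cpk:.2f}")
```

With only a few parts on hand the estimate of sigma is itself uncertain, so repeated measurements over temperature and voltage carry more weight than a single pass.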