#### **Lecture 03: ISA Introduction**

**CSE 564 Computer Architecture Summer 2017** 

Department of Computer Science and Engineering Yonghong Yan <u>yan@oakland.edu</u> www.secs.oakland.edu/~yan

#### Contents

- **1.** Introduction
- **2.** Classifying Instruction Set Architectures
- 3. Memory Addressing
- 4. Type and Size of Operands
- **5.** Operations in the Instruction Set
- 6. Instructions for Control Flow
- 7. Encoding an Instruction Set
- 8. Crosscutting Issues: The Role of Compilers
- 9. RISC-V ISA
- Supplement
  - MIPS ISA
  - RISC vs CISC
  - Compiler compilation stages
  - ISA Historical
    - Appendix L
  - Comparison of ISA
    - Appendix K

## The MIPS Instruction Set

- Used as the example as introduction
- Stanford MIPS commercialized by MIPS Technologies (<u>www.mips.com</u>)
- Large share of embedded core market
  - Applications in consumer electronics, network/storage equipment, cameras, printers, ...
  - Closed one to RISC-V
- Typical of many modern ISAs
  - See MIPS Reference Data tear-out card, and Appendixes
     B and E in the reference textbook

## **Arithmetic Operations**

- Add and subtract, three operands
  - Two sources and one destination

add a, b, c # a gets b + c

- All arithmetic operations have this form
- Design Principle 1: Simplicity favours regularity
  - Regularity makes implementation simpler
  - Simplicity enables higher performance at lower cost

### **Arithmetic Example**

• C code:

$$f = (g + h) - (i + j);$$

• Compiled MIPS code:

add t0, g, h # temp t0 = g + h add t1, i, j # temp t1 = i + j sub f, t0, t1 # f = t0 - t1

## **Register Operands**

- Arithmetic instructions use register operands
- MIPS has a 32 × 32-bit register file
  - Use for frequently accessed data
  - Numbered 0 to 31
  - 32-bit data called a "word"
- Assembler names
  - \$t0, \$t1, ..., \$t9 for temporary values
  - \$s0, \$s1, ..., \$s7 for saved variables
- Design Principle 2: Smaller is faster
  - c.f. main memory: millions of locations

## **Register Operand Example**

• C code:

$$f = (g + h) - (i + j);$$
  
- f, ..., j in \$s0, ..., \$s4

Compiled MIPS code:
 add \$t0, \$s1, \$s2
 add \$t1, \$s3, \$s4
 sub \$s0, \$t0, \$t1

# **Memory Operands**

- Main memory used for composite data

   Arrays, structures, dynamic data
- To apply arithmetic operations
  - Load values from memory into registers
  - Store result from register to memory
- Memory is byte addressed
  - Each address identifies an 8-bit byte
- Words are aligned in memory
  - Address must be a multiple of 4
- MIPS is Big Endian
  - Most-significant byte at least address of a word
  - c.f. Little Endian: least-significant byte at least address

## Memory Operand Example 1

- C code:
  - g = h + A[8];
  - g in \$s1, h in \$s2, **base address** of A in \$s3
- Compiled MIPS code:
  - Index 8 requires offset of 32
    - 4 bytes per word

## Memory Operand Example 2

• C code:

$$A[12] = h + A[8];$$

- h in \$s2, base address of A in \$s3
- Compiled MIPS code:
  - Index 8 requires offset of 32

### **Registers vs. Memory**

- Registers are faster to access than memory
- Operating on memory data requires loads and stores
   More instructions to be executed
- Compiler must use registers for variables as much as possible
  - Only **spill** to memory for less frequently used variables
  - Register optimization is important!

## **Immediate Operands**

- Constant data specified in an instruction addi \$s3, \$s3, 4
- No subtract immediate instruction
  - Just use a negative constant addi \$s2, \$s1, -1
- Design Principle 3: Make the common case fast
  - Small constants are common
  - Immediate operand avoids a load instruction

## The Constant Zero

- MIPS/RISC-V register 0 (\$zero) is the constant 0
  - Cannot be overwritten
- Useful for common operations
  - E.g., move between registers add \$t2, \$s1, \$zero

## **Representing Instructions**

- Instructions are encoded in binary
  - Called machine code
- MIPS instructions
  - Encoded as 32-bit instruction words
  - Small number of formats encoding operation code (opcode), register numbers, …
  - Regularity!
- Register numbers
  - \$t0 \$t7 are reg's 8 15
  - \$t8 \$t9 are reg's 24 25
  - \$s0 \$s7 are reg's 16 23

## **MIPS R-format Instructions**

| ор     | rs     | rt     | rd     | shamt  | funct  |
|--------|--------|--------|--------|--------|--------|
| 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |

- Instruction fields
  - op: operation code (opcode)
  - rs: first source register number
  - rt: second source register number
  - rd: destination register number
  - shamt: shift amount (00000 for now)
  - funct: function code (extends opcode)

#### **R-format Example**

| ор     | rs     | rt     | rd     | shamt  | funct  |
|--------|--------|--------|--------|--------|--------|
| 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |

add \$t0, \$s1, \$s2

| special | \$s1  | \$s2  | \$tO  | 0     | add    |
|---------|-------|-------|-------|-------|--------|
| 0       | 17    | 18    | 8     | 0     | 32     |
| 000000  | 10001 | 10010 | 01000 | 00000 | 100000 |

 $0000010001100100100000010000_2 = 02324020_{16}$ 

### Hexadecimal

- Base 16
  - Compact representation of bit strings
  - 4 bits per hex digit

| 0 | 0000 | 4 | 0100 | 8 | 1000 | С | 1100 |
|---|------|---|------|---|------|---|------|
| 1 | 0001 | 5 | 0101 | 9 | 1001 | d | 1101 |
| 2 | 0010 | 6 | 0110 | а | 1010 | e | 1110 |
| 3 | 0011 | 7 | 0111 | b | 1011 | f | 1111 |

- Example: eca8 6420
  - 1110 1100 1010 1000 0110 0100 0010 0000

## **MIPS I-format Instructions**

| ор     | rs     | rt     | constant or address |
|--------|--------|--------|---------------------|
| 6 bits | 5 bits | 5 bits | 16 bits             |

- Immediate arithmetic and load/store instructions
  - rt: destination or source register number
  - Constant:  $-2^{15}$  to  $+2^{15}$  1
  - Address: offset added to base address in rs
- Design Principle 4: Good design demands good compromises
  - Different formats complicate decoding, but allow 32-bit instructions uniformly
  - Keep formats as similar as possible

## **Stored Program Computers**



- Instructions represented in binary, just like data
- Instructions and data stored in memory
- Programs can operate on programs
  - e.g., compilers, linkers, ...
- Binary compatibility allows compiled programs to work on different computers
  - Standardized ISAs

# **Logical Operations**

Instructions for bitwise manipulation

| Operation   | С  | Java | MIPS      |
|-------------|----|------|-----------|
| Shift left  | << | <<   | s11       |
| Shift right | >> | >>>  | srl       |
| Bitwise AND | &  | &    | and, andi |
| Bitwise OR  |    |      | or, ori   |
| Bitwise NOT | ~  | ~    | nor       |

 Useful for extracting and inserting groups of bits in a word

## **Shift Operations**

| ор     | rs     | rt     | rd     | shamt  | funct  |
|--------|--------|--------|--------|--------|--------|
| 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |

- shamt: how many positions to shift
- Shift left logical
  - Shift left and fill with 0 bits
  - s11 by *i* bits multiplies by  $2^i$
- Shift right logical
  - Shift right and fill with 0 bits
  - -srl by i bits divides by  $2^i$  (unsigned only)

### **AND Operations**

Useful to mask bits in a word
 Select some bits, clear others to 0

and \$t0, \$t1, \$t2

| \$t2 | 0000 0000 0000 0000 00 <mark>00 11</mark> 01 1100 0000 |
|------|--------------------------------------------------------|
| \$t1 | 0000 0000 0000 0000 00 11 11 00 0000 0000              |
| \$t0 | 0000 0000 0000 0000 0000 11 00 0000 0000               |

## **OR Operations**

Useful to include bits in a word
– Set some bits to 1, leave others unchanged

or \$t0, \$t1, \$t2

| \$t2 | 0000 0000 0000 0000 0000 11 | 01 1100 0000 |
|------|-----------------------------|--------------|
| \$t1 | 0000 0000 0000 0000 0011 11 | 00 0000 0000 |
| \$t0 | 0000 0000 0000 0000 0011 11 | 01 1100 0000 |

## **NOT Operations**

- Useful to invert bits in a word
  - Change 0 to 1, and 1 to 0
- MIPS has NOR 3-operand instruction
  - a NOR b == NOT ( a OR b )



# **Conditional Operations**

- Branch to a labeled instruction if a condition is true
   Otherwise, continue sequentially
- beq rs, rt, L1

- if (rs == rt) branch to instruction labeled L1;

- bne rs, rt, L1
  - if (rs != rt) branch to instruction labeled L1;
- j L1
  - unconditional jump to instruction labeled L1

## **Compiling If Statements**

• C code:

- f, g, ... in \$s0, \$s1, ...
- Compiled MIPS code:





## **Compiling Loop Statements**

• C code:

while (save[i] == k) i += 1;

- i in \$s3, k in \$s5, address of save in \$s6
- Compiled MIPS code:

## **More Conditional Operations**

- Set result to 1 if a condition is true
  - Otherwise, set to 0
- slt rd, rs, rt
  - if (rs < rt) rd = 1; else rd = 0;</pre>
- slti rt, rs, constant
  - if (rs < constant) rt = 1; else rt = 0;
- Use in combination with beq, bne

slt \$t0, \$s1, \$s2 # if (\$s1 < \$s2) bne \$t0, \$zero, L # branch to L

## **Branch Addressing**

- Branch instructions specify
  - Opcode, two registers, target address
- Most branch targets are near branch
  - Forward or backward

| ор     | rs     | rt     | constant or address |
|--------|--------|--------|---------------------|
| 6 bits | 5 bits | 5 bits | 16 bits             |

- PC-relative addressing
  - Target address = PC + offset × 4
  - PC already incremented by 4 by this time

# **Jump Addressing**

- Jump (j and jal) targets could be anywhere in text segment
  - Encode full address in instruction

| ор     | address |
|--------|---------|
| 6 bits | 26 bits |

- (Pseudo)Direct jump addressing
  - Target address = PC<sub>31...28</sub> : (address × 4)

## **Target Addressing Example**

Loop code from earlier example
 Assume Loop at location 80000



#### **Addressing Mode Summary**

#### 1. Immediate addressing

| op rs rt Immediate |  |
|--------------------|--|
|--------------------|--|

#### 2. Register addressing



#### 3. Base addressing



#### 4. PC-relative addressing



5. Pseudodirect addressing



# C Sort Example

- Illustrates use of assembly instructions for a C bubble sort function
- Swap procedure (leaf)
   void swap(int v[], int k)
   {
   int temp;
   temp = v[k];
   v[k] = v[k+1];
   v[k+1] = temp;
   }
   v in \$a0, k in \$a1, temp in \$t0

#### **The Procedure Swap**

```
void swap(int v[], int k)
{
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}
```

## The Sort Procedure in C

```
    Non-leaf (calls swap)

    void sort (int v[], int n)
    {
      int i, j;
      for (i = 0; i < n; i += 1) {
         for (j = i - 1;
    j >= 0 && v[j] > v[j + 1];
    j -= 1) {
            swap(v,j);
         }
      }
 – v in $a0, k in $a1, i in $s0, j in $s1
```

#### **The Procedure Body**

| for1tst: | move<br>move<br>slt<br>beq           | \$s3,<br>\$s0,<br>\$t0,<br>\$t0,                                   | \$a1<br>\$zero<br>\$s0, \$s3      | #<br>#<br># | <pre>save \$a0 into \$s2<br/>save \$a1 into \$s3<br/>i = 0<br/>\$t0 = 0 if \$s0 <math>\geq</math> \$s3 (i <math>\geq</math> n)<br/>go to exit1 if \$s0 <math>\geq</math> \$s3 (i <math>\geq</math> n)<br/>j = i - 1</pre> | Move<br>params<br>Outer loop           |
|----------|--------------------------------------|--------------------------------------------------------------------|-----------------------------------|-------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|
| for2tst: | bne<br>sll<br>add<br>lw<br>lw<br>slt | <pre>\$t0,<br/>\$t1,<br/>\$t2,<br/>\$t3,<br/>\$t4,<br/>\$t0,</pre> | 0(\$t2)<br>4(\$t2)<br>\$t4, \$t3  | # # # # #   | <pre>\$t0 = 1 if \$s1 &lt; 0 (j &lt; 0) go to exit2 if \$s1 &lt; 0 (j &lt; 0) \$t1 = j * 4 \$t2 = v + (j * 4) \$t3 = v[j] \$t4 = v[j + 1] \$t0 = 0 if \$t4 ≥ \$t3 sector exit2 if \$t4 &gt; \$t3</pre>                    | Inner loop                             |
| exit2:   | move<br>move<br>jal<br>addi<br>j     | <pre>\$a0, \$a1, swap \$s1, for2t</pre>                            | \$s1<br>\$s1, -1<br>st<br>\$s0, 1 | # # # # # # | go to exit2 if $t4 \ge t3$<br>1st param of swap is v (old \$a0)<br>2nd param of swap is j<br>call swap procedure<br>j -= 1<br>jump to test of inner loop<br>i += 1<br>jump to test of outer loop                          | Pass<br>params<br>& call<br>Inner loop |

Outer loop

#### **The Full Procedure**

addi \$sp,\$sp, -20 # make room on stack for 5 registers sort: sw \$ra, 16(\$sp) # save \$ra on stack sw \$s3,12(\$sp) # save \$s3 on stack sw \$s2, 8(\$sp) # save \$s2 on stack sw \$s1, 4(\$sp) # save \$s1 on stack sw \$s0, 0(\$sp) # save \$s0 on stack # procedure body ... ... exit1: lw \$s0, 0(\$sp) # restore \$s0 from stack # restore \$s1 from stack lw \$s1, 4(\$sp) lw \$s2, 8(\$sp) # restore \$s2 from stack lw \$s3,12(\$sp) # restore \$s3 from stack lw \$ra,16(\$sp) # restore \$ra from stack addi \$sp,\$sp, 20 # restore stack pointer jr \$ra # return to calling routine

### Arrays vs. Pointers

- Array indexing involves
  - Multiplying index by element size
  - Adding to array base address
- Pointers correspond directly to memory addresses
  - Can avoid indexing complexity

#### **Example: Clearing and Array**

| <pre>clear1(int array[], int size) {     int i;     for (i = 0; i &lt; size; i += 1)         array[i] = 0; }</pre> | <pre>clear2(int *array, int size) {     int *p;     for (p = &amp;array[0]; p &lt; &amp;array[size];         p = p + 1)         *p = 0; }</pre> |
|--------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------|
| <pre>move \$t0,\$zero # i = 0 loop1: sll \$t1,\$t0,2 # \$t1 = i * 4 add \$t2,\$a0,\$t1 # \$t2 =</pre>              | <pre>move \$t0,\$a0  # p = &amp; array[0] sll \$t1,\$a1,2  # \$t1 = size * 4 add \$t2,\$a0,\$t1 # \$t2 =</pre>                                  |
| <pre># (i &lt; size) bne \$t3,\$zero,loop1 # if () # goto loop1</pre>                                              | #(p<&array[size])<br>bne \$t3,\$zero,loop2 # if (…)<br># goto loop2                                                                             |

## **Comparison of Array vs. Ptr**

- Multiply "strength reduced" to shift
- Array version requires shift to be inside loop
  - Part of index calculation for incremented i
  - c.f. incrementing pointer
- Compiler can achieve same effect as manual use of pointers
  - Induction variable elimination
  - Better to make program clearer and safer

### Summary

| Instruction class | MIPS examples                        | SPEC2006 Int | SPEC2006 FP |
|-------------------|--------------------------------------|--------------|-------------|
| Arithmetic        | add, sub, addi                       | 16%          | 48%         |
| Data transfer     | lw, sw, lb, lbu,<br>lh, lhu, sb, lui | 35%          | 36%         |
| Logical           | and, or, nor, andi,<br>ori, sll, srl | 12%          | 4%          |
| Cond. Branch      | beq, bne, slt,<br>slti, sltiu        | 34%          | 8%          |
| Jump              | j, jr, jal                           | 2%           | 0%          |

#### Backup

# **ARM & MIPS Similarities**

- ARM: the most popular embedded core
- Similar basic set of instructions to MIPS

|                       | ARM              | MIPS             |  |  |
|-----------------------|------------------|------------------|--|--|
| Date announced        | 1985             | 1985             |  |  |
| Instruction size      | 32 bits          | 32 bits          |  |  |
| Address space         | 32-bit flat      | 32-bit flat      |  |  |
| Data alignment        | Aligned          | Aligned          |  |  |
| Data addressing modes | 9                | 3                |  |  |
| Registers             | 15 × 32-bit      | 31 × 32-bit      |  |  |
| Input/output          | Memory<br>mapped | Memory<br>mapped |  |  |

# **Compare and Branch in ARM**

- Uses condition codes for result of an arithmetic/logical instruction
  - Negative, zero, carry, overflow
  - Compare instructions to set condition codes without keeping the result
- Each instruction can be conditional
  - Top 4 bits of instruction word: condition value
  - Can avoid branches over single instructions

#### **Instruction Encoding**

|                   | ARM  | 31 | 28<br>Opx <sup>4</sup> | 27 | Op <sup>8</sup>        |     | 20  | 19<br>Rs1 |                      | 15   | 12<br>Rd <sup>4</sup> | 11              | Opx <sup>8</sup>    |                    | 4 3 | B<br>Rs2 <sup>4</sup> | 0 |
|-------------------|------|----|------------------------|----|------------------------|-----|-----|-----------|----------------------|------|-----------------------|-----------------|---------------------|--------------------|-----|-----------------------|---|
| Register-register |      | 31 |                        | 26 |                        | 21  | 20  |           |                      | 15   |                       | 11              |                     | 65                 |     |                       | 0 |
|                   | MIPS |    | Op <sup>6</sup>        |    | Rs1⁵                   |     |     | Rs2⁵      |                      |      | Rd⁵                   |                 | Const⁵              |                    | Op  | ox <sup>6</sup>       |   |
|                   |      |    |                        |    |                        |     |     |           |                      |      |                       |                 |                     |                    |     |                       |   |
|                   | ARM  | 31 | 28<br>Opx <sup>4</sup> | 27 | Op <sup>8</sup>        |     | 20  | 19<br>Rs1 |                      | 15   | 12<br>Rd <sup>4</sup> | 11              | Co                  | onst <sup>12</sup> |     |                       | 0 |
| Data transfer     |      | 31 |                        | 26 | 25                     | 21  | 20  |           | 16                   | 15   |                       |                 |                     |                    |     |                       | 0 |
|                   | MIPS |    | $Op^6$                 |    | Rs1⁵                   |     |     | Rd⁵       |                      |      |                       |                 | Const <sup>16</sup> |                    |     |                       |   |
|                   |      |    |                        |    |                        |     |     |           |                      |      |                       |                 |                     |                    |     |                       |   |
|                   | ARM  | 31 | 28<br>Opx <sup>4</sup> | 27 | 24 23                  |     |     |           |                      |      | Cons                  | t <sup>24</sup> |                     | _                  |     |                       | 0 |
| Branch            |      |    |                        |    |                        |     |     |           |                      |      |                       |                 |                     |                    |     |                       | _ |
|                   | MIPS | 31 | Op <sup>6</sup>        | 26 | 25<br>Rs1 <sup>5</sup> |     | 20  | Dpx⁵/Rs   | 16<br>2 <sup>5</sup> | 15   |                       |                 | Const <sup>16</sup> |                    |     |                       | 0 |
|                   |      |    |                        |    |                        |     |     |           |                      |      |                       | _               |                     |                    |     |                       |   |
|                   |      | 31 |                        | 27 | 24 23                  |     |     |           |                      |      |                       |                 |                     |                    |     |                       | 0 |
|                   | ARM  |    | Opx <sup>4</sup>       | O  | p <sup>4</sup>         |     |     |           |                      |      | Cons                  | t <sup>24</sup> |                     |                    |     |                       |   |
| Jump/Call         |      | 31 | - 6                    | 26 | 25                     |     |     |           |                      |      | - 26                  |                 |                     |                    |     |                       | 0 |
|                   | MIPS |    | Op <sup>6</sup>        |    |                        |     |     |           |                      |      | Const <sup>26</sup>   |                 |                     |                    |     |                       |   |
|                   |      |    |                        |    |                        | ∎ C | рсо | de 🗆      | Regi                 | ster | 🛛 Con                 | stan            | ıt                  |                    |     |                       |   |

#### The Intel x86 ISA

- Evolution with backward compatibility
  - 8080 (1974): 8-bit microprocessor
    - Accumulator, plus 3 index-register pairs
  - 8086 (1978): 16-bit extension to 8080
    - Complex instruction set (CISC)
  - 8087 (1980): floating-point coprocessor
    - Adds FP instructions and register stack
  - 80286 (1982): 24-bit addresses, MMU
    - Segmented memory mapping and protection
  - 80386 (1985): 32-bit extension (now IA-32)
    - Additional addressing modes and operations
    - Paged memory mapping as well as segments

## The Intel x86 ISA

- Further evolution...
  - i486 (1989): pipelined, on-chip caches and FPU
    - Compatible competitors: AMD, Cyrix, ...
  - Pentium (1993): superscalar, 64-bit datapath
    - Later versions added MMX (Multi-Media eXtension) instructions
    - The infamous FDIV bug
  - Pentium Pro (1995), Pentium II (1997)
    - New microarchitecture (see Colwell, The Pentium Chronicles)
  - Pentium III (1999)
    - Added SSE (Streaming SIMD Extensions) and associated registers
  - Pentium 4 (2001)
    - New microarchitecture
    - Added SSE2 instructions

# The Intel x86 ISA

- And further...
  - AMD64 (2003): extended architecture to 64 bits
  - EM64T Extended Memory 64 Technology (2004)
    - AMD64 adopted by Intel (with refinements)
    - Added SSE3 instructions
  - Intel Core (2006)
    - Added SSE4 instructions, virtual machine support
  - AMD64 (announced 2007): SSE5 instructions
    - Intel declined to follow, instead...
  - Advanced Vector Extension (announced 2008)
    - Longer SSE registers, more instructions
- If Intel didn't extend with compatibility, its competitors would!
  - Technical elegance ≠ market success

#### **Basic x86 Registers**



## **Basic x86 Addressing Modes**

#### • Two operands per instruction

| Source/dest operand | Second source operand |
|---------------------|-----------------------|
| Register            | Register              |
| Register            | Immediate             |
| Register            | Memory                |
| Memory              | Register              |
| Memory              | Immediate             |

- Memory addressing modes
  - Address in register
  - Address = R<sub>base</sub> + displacement
  - Address =  $R_{base}$  +  $2^{scale} \times R_{index}$  (scale = 0, 1, 2, or 3)
  - Address =  $R_{base} + 2^{scale} \times R_{index} + displacement$

## x86 Instruction Encoding

| a. JE EIP ·        | + disp             | lacement                           |                   |     |  |  |  |  |
|--------------------|--------------------|------------------------------------|-------------------|-----|--|--|--|--|
| 4                  | 4                  | 8                                  |                   |     |  |  |  |  |
|                    | ondi-<br>tion      | Displacement                       |                   |     |  |  |  |  |
| b. CALL            |                    |                                    | 32                |     |  |  |  |  |
| 8                  |                    |                                    |                   |     |  |  |  |  |
| CALL               | -                  |                                    | Offs              | set |  |  |  |  |
| c. MOV<br>6<br>MOV | EBX,<br>1 1<br>d w | [EDI + 45]<br>8<br>r/m<br>Postbyte | 8<br>Displacement |     |  |  |  |  |
| d. PUSH E          | SI                 |                                    |                   |     |  |  |  |  |
| 5                  | 5 3                |                                    |                   |     |  |  |  |  |
| PUSH Reg           |                    |                                    |                   |     |  |  |  |  |
|                    | e. ADD EAX, #6765  |                                    |                   |     |  |  |  |  |
| 4 ;                | 3 1                |                                    | 32                | 2   |  |  |  |  |

| 4   | 3   | 1 | 52        |
|-----|-----|---|-----------|
| ADD | Rea | w | Immediate |
|     |     |   |           |

f. TEST EDX, #42

| <br>7 | 1 | 8        | 32        |
|-------|---|----------|-----------|
| TEST  | w | Postbyte | Immediate |

- Variable length encoding
  - Postfix bytes specify addressing mode
  - Prefix bytes modify operation
    - Operand length, repetition, locking, ...

# **Implementing IA-32**

- Complex instruction set makes implementation difficult
  - Hardware translates instructions to simpler microoperations
    - Simple instructions: 1–1
    - Complex instructions: 1–many
  - Microengine similar to RISC
  - Market share makes this economically viable
- Comparable performance to RISC
  - Compilers avoid complex instructions