# BLM6112 Advanced Computer Architecture

Instruction Set Architecture

#### **Prof. Dr. Nizamettin AYDIN**

#### naydin@yildiz.edu.tr

http://www3.yildiz.edu.tr/~naydin



## **Instruction Set**

- Instruction: Language of the machine
- Instruction set: Vocabulary of the language (collection of instructions that are understood by a CPU)
  - lda, sta, brp, jmp, nop, ...
- Machine code
  - Machine readable
  - Binary (example: 1000110010100000)
- Usually represented by assembly codes
  - Human readable
  - Example: adding a number entered from keyboard and a number in memory location 40

| Address | Instruction | Operand |
|---------|-------------|---------|
| 0       | in          |         |
| 1       | sta         | 30      |
| 2       | add         | 40      |
| 3       | sta         | 50      |
| 4       | hlt         |         |
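
The program above can be traced with a small simulation. This is a minimal sketch, assuming that `in` places the keyboard value in an accumulator, `sta` stores the accumulator to memory, and `add` adds a memory word to the accumulator. The function `run`, the dictionary used as memory, and the sample values 5 and 7 are illustrative choices, not part of the slides.

```python
def run(program, memory, inputs):
    """Execute (mnemonic, operand) pairs on a hypothetical one-accumulator machine."""
    acc = 0
    pc = 0
    pending = list(inputs)           # values "typed" at the keyboard
    while True:
        op, arg = program[pc]
        pc += 1                      # PC now points to the next instruction
        if op == "in":
            acc = pending.pop(0)     # assumed: keyboard value goes to the accumulator
        elif op == "sta":
            memory[arg] = acc        # store accumulator to memory
        elif op == "add":
            acc += memory[arg]       # add memory word to accumulator
        elif op == "hlt":
            return memory
        else:
            raise ValueError(f"unknown mnemonic: {op}")


program = [("in", None), ("sta", 30), ("add", 40), ("sta", 50), ("hlt", None)]
memory = {40: 7}                     # sample content of location 40
result = run(program, memory, inputs=[5])
print(result[30], result[50])        # 5 12
```

With 5 entered from the keyboard and 7 stored in location 40, location 30 holds the entered number and location 50 holds the sum.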

## **Instruction Types**

- Data processing – ADD, SUB
- Data storage (main memory) - LDA, STA
- Data movement (I/O) - IN, OUT
- Program flow control - BRZ

## **Elements of an Instruction**

- Operation code (Op-code)
  - Do this
  - Example: ADD 30
- Source Operand reference
  - To this
  - Example: LDA 50
- Result Operand reference
  - Put the result here
  - Example: STA 60
- Next Instruction Reference
  - When you have done that, do this...
  - PC points to the next instruction

## **Source and Result Operands**

- Source and Result Operands can be in one of the following areas:
  - Main memory
  - Virtual memory
  - Cache
  - CPU register
  - I/O device

## **Instruction Representation**

- In machine code each instruction has a unique bit pattern
- For human consumption a symbolic representation is used (assembly language)
- Opcodes are represented by abbreviations, called mnemonics, that indicate the operation
  - ADD, SUB, LDA, BRP, ...
- In an assembly language, operands can also be represented symbolically, as follows
  - ADD A,B (add contents of B and A and save the result into A)

## **Simple Instruction Format**

- The following is a 16-bit instruction format



- So...
  - What is the maximum number of instructions in this processor?
  - What is the maximum directly addressable memory size?
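
Both questions follow from the field widths: an opcode field of k bits allows 2^k distinct instructions, and an address field of m bits can directly address 2^m locations. The sketch below assumes, purely for illustration, a 4-bit opcode followed by a 12-bit address (the widths in the actual format may differ). The helper names `encode` and `decode` are hypothetical.

```python
OPCODE_BITS = 4      # assumed width, for illustration only
ADDRESS_BITS = 12    # assumed width, for illustration only


def encode(opcode, address):
    """Pack an opcode and an address into one 16-bit instruction word."""
    if not 0 <= opcode < 2 ** OPCODE_BITS:
        raise ValueError("opcode does not fit in the opcode field")
    if not 0 <= address < 2 ** ADDRESS_BITS:
        raise ValueError("address does not fit in the address field")
    return (opcode << ADDRESS_BITS) | address


def decode(word):
    """Split a 16-bit instruction word into (opcode, address)."""
    return word >> ADDRESS_BITS, word & (2 ** ADDRESS_BITS - 1)


max_instructions = 2 ** OPCODE_BITS    # 16 distinct opcodes
max_locations = 2 ** ADDRESS_BITS      # 4096 directly addressable locations

word = encode(0b1000, 0b110010100000)
print(f"{word:016b}")                  # 1000110010100000
print(decode(word))                    # (8, 3232)
```

Widening the opcode field by one bit doubles the number of possible instructions but halves the directly addressable memory, since the word length is fixed.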

## **Instruction Set Classification**

- One way for classification:
  - Number of operands for typical arithmetic instruction
- Will use this C statement as an example:
  - a = b + c;
  - Assume **a**, **b** and **c** are in memory

## **Zero Address Machine**

- a.k.a. Stack Machines
- Example: a = b + c;

| Instruction | Comment                                            |
|-------------|----------------------------------------------------|
| PUSH b      | # Push b onto stack                                |
| PUSH c      | # Push c onto stack                                |
| ADD         | # Add top two items on stack and replace with sum  |
| POP a       | # Remove top of stack and store in a               |
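
The four-instruction sequence above can be traced with a small stack-machine simulation. This is a minimal sketch: the function `run_stack_machine`, the dictionary used as memory, and the sample values of b and c are illustrative assumptions.

```python
def run_stack_machine(program, memory):
    """Execute PUSH/ADD/POP instructions on a hypothetical zero-address machine."""
    stack = []
    for instruction in program:
        op, *args = instruction.split()
        if op == "PUSH":
            stack.append(memory[args[0]])      # push memory operand onto stack
        elif op == "ADD":
            top = stack.pop()
            second = stack.pop()
            stack.append(second + top)         # replace top two items with their sum
        elif op == "POP":
            memory[args[0]] = stack.pop()      # remove top of stack and store it
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


memory = {"a": 0, "b": 2, "c": 3}
run_stack_machine(["PUSH b", "PUSH c", "ADD", "POP a"], memory)
print(memory["a"])                             # 5
```

Note that ADD names no operands at all: both sources and the destination are implied by the stack.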

## **One Address Machine**

- a.k.a. Accumulator Machine
- One operand is implicitly the accumulator
- Example: a = b + c;

| Instruction | Comment           |
|-------------|-------------------|
| LOAD b      | # ACC ← b         |
| ADD c       | # ACC ← ACC + c   |
| STORE a     | # a ← ACC         |

## **Two Address Machine (1)**

- a.k.a. Register-Memory Instruction Set
- One operand may be a value from memory
- Machine has n general purpose registers
  - \$0 through \$n-1
- Example: a = b + c;

| Instruction  | Comment              |
|--------------|----------------------|
| LOAD \$1, b  | # \$1 ← M[b]         |
| ADD \$1, c   | # \$1 ← \$1 + M[c]   |
| STORE \$1, a | # M[a] ← \$1         |

## **Two Address Machine (2)**

- a.k.a. Memory-Memory Machine
- Another possibility: do stuff in memory!
- These machines have registers used to compute memory addresses
- 2 addresses (one address doubles as operand and result)
- Example: a = b + c;

| Instruction | Comment                 |
|-------------|-------------------------|
| MOVE a, b   | # M[a] ← M[b]           |
| ADD a, c    | # M[a] ← M[a] + M[c]    |

## **Two Address Machine (3)**

- a.k.a. Load-Store Instruction Set or Register-Register Instruction Set
- Typically can only access memory using load/store instructions
- Example: a = b + c;

| Instruction  | Comment             |
|--------------|---------------------|
| LOAD \$1, b  | # \$1 ← M[b]        |
| LOAD \$2, c  | # \$2 ← M[c]        |
| ADD \$1, \$2 | # \$1 ← \$1 + \$2   |
| STORE \$1, a | # M[a] ← \$1        |

## **Three Address Machine**

- a.k.a. Load-Store Instruction Set or Register-Register Instruction Set
- Typically can only access memory using load/store instructions
- 3 addresses (Operand 1, Operand 2, Result)
  - May be a fourth: next instruction (usually implicit)
- Needs very long words to hold everything
- Example: a = b + c;

| Instruction        | Comment             |
|--------------------|---------------------|
| LOAD \$1, b        | # \$1 ← M[b]        |
| LOAD \$2, c        | # \$2 ← M[c]        |
| ADD \$3, \$1, \$2  | # \$3 ← \$1 + \$2   |
| STORE \$3, a       | # M[a] ← \$3        |

## **Utilization of Instruction Addresses**

| Number of Addresses | Symbolic Representation | Interpretation    |
|---------------------|-------------------------|-------------------|
| 3                   | OP A, B, C              | A ← B OP C        |
| 2                   | OP A, B                 | A ← A OP B        |
| 1                   | OP A                    | AC ← AC OP A      |
| 0                   | OP                      | T ← (T - 1) OP T  |

- AC = accumulator
- T = top of stack
- (T - 1) = second element of stack
- A, B, C = memory or register locations

## **Types of Operand**

- Addresses
  - Operand is in the address
- Numbers (actual operand)
  - Integer or fixed point
  - Floating point
  - Decimal
- Characters (actual operand)
  - ASCII etc.
- Logical Data (actual operand)
  - Bits or flags

## **Pentium Data Types**

- 8 bit (byte), 16 bit (word), 32 bit (double word), 64 bit (quad word)
- Addressing in Pentium is by 8 bit units
- A 32 bit double word is read at addresses divisible by 4:

| Address | +0 | +1 | +2 | +3 |
|---------|----|----|----|----|
| 0100    | 1A | 22 | F1 | 77 |
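
The byte layout above can be turned into a 32-bit value as follows. This is a minimal sketch: it assumes the address 0100 is hexadecimal, relies on the fact that the x86 family stores the least significant byte at the lowest address (little-endian), and enforces the divisible-by-4 rule stated above. The helper `read_dword` and the dictionary used as memory are hypothetical.

```python
# Byte-addressable memory holding the four bytes from the table
memory = {0x0100: 0x1A, 0x0101: 0x22, 0x0102: 0xF1, 0x0103: 0x77}


def read_dword(memory, address):
    """Read an aligned 32-bit double word, least significant byte first."""
    if address % 4 != 0:
        raise ValueError("address is not divisible by 4")
    raw = bytes(memory[address + offset] for offset in range(4))
    return int.from_bytes(raw, byteorder="little")


value = read_dword(memory, 0x0100)
print(hex(value))    # 0x77f1221a
```

The byte at offset +0 ends up in the least significant position of the double word, so the bytes appear in reverse order when the value is written as a number.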

## **Pentium Numeric Data Formats**



## **PowerPC Data Types**

- 8 (byte), 16 (halfword), 32 (word) and 64 (doubleword) length data types
- Fixed point processor recognises:
  - Unsigned byte, unsigned halfword, signed halfword, unsigned word, signed word, unsigned doubleword, byte string (<128 bytes)
- Floating point
  - IEEE 754
  - Single or double precision

## **Types of Operation**

- Data Transfer
- Arithmetic
- Logical
- Conversion
- I/O
- System Control
- Transfer of Control

## **Data Transfer**

- Need to specify
  - Source
  - Destination
  - Amount of data
- May be different instructions for different movements
- Or one instruction and different addresses

## **Arithmetic**

- Basic arithmetic operations are...
  - Add
  - Subtract
  - Multiply
  - Divide
  - Increment (a++)
  - Decrement (a--)
  - Negate (-a)
  - Absolute
- Arithmetic operations are provided for...
  - Signed integer
  - Floating point
  - Packed decimal numbers

## **Logical**

- Bitwise operations
- AND, OR, NOT
  - Example 1: bit masking using AND operation
    - (R1) = 10100101
    - (R2) = 00001111
    - (R1) AND (R2) = 00000101
  - Example 2: taking ones complement using XOR operation
    - (R1) = 10100101
    - (R2) = 11111111
    - (R1) XOR (R2) = 01011010
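
Both examples can be checked directly with bitwise operators. This is a minimal sketch; the variable names are illustrative.

```python
r1 = 0b10100101

# Example 1: AND with 00001111 keeps the low four bits and clears the rest
masked = r1 & 0b00001111

# Example 2: XOR with all ones inverts every bit (ones complement)
complemented = r1 ^ 0b11111111

print(f"{masked:08b}")         # 00000101
print(f"{complemented:08b}")   # 01011010
```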

## **Basic Logical Operations**

| P | Q | NOT P | P AND Q | P OR Q | P XOR Q | P=Q |
|---|---|-------|---------|--------|---------|-----|
| 0 | 0 | 1     | 0       | 0      | 0       | 1   |
| 0 | 1 | 1     | 0       | 1      | 1       | 0   |
| 1 | 0 | 0     | 0       | 1      | 1       | 0   |
| 1 | 1 | 0     | 1       | 1      | 0       | 1   |

## **Shift and Rotate Operations**




## **Examples of Shift and Rotate Operations**

| Input    | Operation                       | Result   |
|----------|---------------------------------|----------|
| 10100110 | Logical right shift (3 bits)    | 00010100 |
| 10100110 | Logical left shift (3 bits)     | 00110000 |
| 10100110 | Arithmetic right shift (3 bits) | 11110100 |
| 10100110 | Arithmetic left shift (3 bits)  | 10110000 |
| 10100110 | Right rotate (3 bits)           | 11010100 |
| 10100110 | Left rotate (3 bits)            | 00110101 |
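
The six rows of the table can be reproduced with masks and Python's integer shifts. This is a minimal sketch for 8-bit operands; the function names are illustrative. The arithmetic left shift follows the convention implied by the table (the sign bit is kept and only the remaining 7 bits are shifted), which is an assumption about the intended definition.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1          # 11111111
SIGN = 1 << (WIDTH - 1)          # 10000000


def logical_right(x, n):
    return (x & MASK) >> n       # zeros enter from the left


def logical_left(x, n):
    return (x << n) & MASK       # zeros enter from the right


def arithmetic_right(x, n):
    # Copies of the sign bit enter from the left
    fill = (MASK << (WIDTH - n)) & MASK if x & SIGN else 0
    return ((x & MASK) >> n) | fill


def arithmetic_left(x, n):
    # Sign bit stays in place; the other 7 bits are shifted left
    return (x & SIGN) | ((x << n) & (MASK >> 1))


def rotate_right(x, n):
    n %= WIDTH
    return ((x >> n) | (x << (WIDTH - n))) & MASK


def rotate_left(x, n):
    n %= WIDTH
    return ((x << n) | (x >> (WIDTH - n))) & MASK


x = 0b10100110
for operation in (logical_right, logical_left, arithmetic_right,
                  arithmetic_left, rotate_right, rotate_left):
    print(f"{operation.__name__:17s} {operation(x, 3):08b}")
```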

## **An example - sending two characters in a word**

- Suppose we wish to transmit characters of data to an I/O device, 1 character at a time.
  - If each memory word is 16 bits in length and contains two characters, we must unpack the characters before they can be sent.
- To send the left-hand character:
  - Load the word into a register
  - AND with the value 1111111100000000
    - This masks out the character on the right
  - Shift to the right eight times
    - This shifts the remaining character to the right half of the register
  - Perform I/O
    - The I/O module reads the lower-order 8 bits from the data bus.
- To send the right-hand character:
  - Load the word again into the register
  - AND with 0000000011111111
  - Perform I/O
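
The unpacking steps above can be sketched as follows. The packed word holding the characters "H" and "i" and the list standing in for the I/O device are illustrative assumptions.

```python
# A 16-bit word with "H" in the left (high) byte and "i" in the right (low) byte
word = (ord("H") << 8) | ord("i")

sent = []                                   # stands in for the I/O device

# Left-hand character: mask out the right character, then shift right 8 times
register = word & 0b1111111100000000
register >>= 8
sent.append(chr(register & 0xFF))           # I/O module reads the low-order 8 bits

# Right-hand character: reload the word and mask out the left character
register = word & 0b0000000011111111
sent.append(chr(register & 0xFF))

print(sent)                                 # ['H', 'i']
```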

## **Conversion**

- Conversion instructions are those that change the format or operate on the format of data.
- For example:
  - Binary to Decimal conversion

## **Input/Output**

- May be specific instructions - IN, OUT
- May be done using data movement instructions (memory mapped)
- May be done by a separate controller (DMA)

## **Systems Control**

- Privileged instructions
- CPU needs to be in specific state
- For operating systems use


## **Transfer of Control**

- Branch - For example: brz 10 (branch to 10 if result is zero)
- Skip - e.g. increment and skip if zero
- Subroutine call - c.f. interrupt call



## **Branch Instruction**

## **Nested Procedure Calls**



## **Use of Stack**



Copyright 2000 N. AYDIN. All rights reserved.

## **Types of Operation**

| Type                | Operation Name        | Description                                                                                            |
|---------------------|-----------------------|--------------------------------------------------------------------------------------------------------|
| Data Transfer       | Move (transfer)       | Transfer word or block from source to destination                                                      |
|                     | Store                 | Transfer word from processor to memory                                                                 |
|                     | Load (fetch)          | Transfer word from memory to processor                                                                 |
|                     | Exchange              | Swap contents of source and destination                                                                |
|                     | Clear (reset)         | Transfer word of 0s to destination                                                                     |
|                     | Set                   | Transfer word of 1s to destination                                                                     |
|                     | Push                  | Transfer word from source to top of stack                                                              |
|                     | Pop                   | Transfer word from top of stack to destination                                                         |
| Arithmetic          | Add                   | Compute sum of two operands                                                                            |
|                     | Subtract              | Compute difference of two operands                                                                     |
|                     | Multiply              | Compute product of two operands                                                                        |
|                     | Divide                | Compute quotient of two operands                                                                       |
|                     | Absolute              | Replace operand by its absolute value                                                                  |
|                     | Negate                | Change sign of operand                                                                                 |
|                     | Increment             | Add 1 to operand                                                                                       |
|                     | Decrement             | Subtract 1 from operand                                                                                |
| Logical             | AND                   | Perform logical AND                                                                                    |
|                     | OR                    | Perform logical OR                                                                                     |
|                     | NOT (complement)      | Perform logical NOT                                                                                    |
|                     | Exclusive-OR          | Perform logical XOR                                                                                    |
|                     | Test                  | Test specified condition; set flag(s) based on outcome                                                 |
|                     | Compare               | Make logical or arithmetic comparison of two or more operands; set flag(s) based on outcome            |
|                     | Set Control Variables | Class of instructions to set controls for protection purposes, interrupt handling, timer control, etc. |
|                     | Shift                 | Left (right) shift operand, introducing constants at end                                               |
|                     | Rotate                | Left (right) shift operand, with wraparound end                                                        |
| Transfer of Control | Jump (branch)         | Unconditional transfer; load PC with specified address                                                 |
|                     | Jump Conditional      | Test specified condition; either load PC with specified address or do nothing, based on condition      |
|                     | Jump to Subroutine    | Place current program control information in known location; jump to specified address                 |
|                     | Return                | Replace contents of PC and other register from known location                                          |
|                     | Execute               | Fetch operand from specified location and execute as instruction; do not modify PC                     |
|                     | Skip                  | Increment PC to skip next instruction                                                                  |
|                     | Skip Conditional      | Test specified condition; either skip or do nothing based on condition                                 |
|                     | Halt                  | Stop program execution                                                                                 |
|                     | Wait (hold)           | Stop program execution; test specified condition repeatedly; resume execution when condition is satisfied |
|                     | No operation          | No operation is performed, but program execution is continued                                          |
| Input/Output        | Input (read)          | Transfer data from specified I/O port or device to destination (e.g., main memory or processor register) |
|                     | Output (write)        | Transfer data from specified source to I/O port or device                                              |
|                     | Start I/O             | Transfer instructions to I/O processor to initiate I/O operation                                       |
|                     | Test I/O              | Transfer status information from I/O system to specified destination                                   |
| Conversion          | Translate             | Translate values in a section of memory based on a table of correspondences                            |
|                     | Convert               | Convert the contents of a word from one form to another (e.g., packed decimal to binary)               |

## **CPU Actions for Various Types of Operations**

| Type                | CPU Actions                                                                                                                                                                                                  |
|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Data Transfer       | Transfer data from one location to another<br>If memory is involved:<br>Determine memory address<br>Perform virtual-to-actual-memory address transformation<br>Check cache<br>Initiate memory read/write |
| Arithmetic          | May involve data transfer, before and/or after<br>Perform function in ALU<br>Set condition codes and flags                                                                                                 |
| Logical             | Same as arithmetic                                                                                                                                                                                           |
| Conversion          | Similar to arithmetic and logical. May involve special logic to perform conversion                                                                                                                          |
| Transfer of Control | Update program counter. For subroutine call/return, manage parameter passing and linkage                                                                                                                    |
| I/O                 | Issue command to I/O module<br>If memory-mapped I/O, determine memory-mapped address                                                                                                                        |

## **Pentium Operation Types**

| Category                    | Instruction | Description |
|-----------------------------|-------------|-------------|
| Data Movement               | MOV         | Move operand, between registers or between register and memory. |
|                             | PUSH        | Push operand onto stack. |
|                             | PUSHA       | Push all registers on stack. |
|                             | MOVSX       | Move byte, word, dword, sign extended. Moves a byte to a word or a word to a doubleword with twos-complement sign extension. |
|                             | LEA         | Load effective address. Loads the offset of the source operand, rather than its value, to the destination operand. |
|                             | XLAT        | Table lookup translation. Replaces a byte in AL with a byte from a user-coded translation table. When XLAT is executed, AL should have an unsigned index to the table. XLAT changes the contents of AL from the table index to the table entry. |
|                             | IN, OUT     | Input, output operand from I/O space. |
| Arithmetic                  | ADD         | Add operands. |
|                             | SUB         | Subtract operands. |
|                             | MUL         | Unsigned integer multiplication, with byte, word, or double word operands, and word, doubleword, or quadword result. |
|                             | IDIV        | Signed divide. |
| Logical                     | AND         | AND operands. |
|                             | BTS         | Bit test and set. Operates on a bit field operand. The instruction copies the current value of a bit to flag CF and sets the original bit to 1. |
|                             | BSF         | Bit scan forward. Scans a word or doubleword for a 1-bit and stores the number of the first 1-bit into a register. |
|                             | SHL/SHR     | Shift logical left or right. |
|                             | SAL/SAR     | Shift arithmetic left or right. |
|                             | ROL/ROR     | Rotate left or right. |
|                             | SETcc       | Sets a byte to zero or one depending on any of the 16 conditions defined by status flags. |
| Control Transfer            | JMP         | Unconditional jump. |
|                             | CALL        | Transfer control to another location. Before transfer, the address of the instruction following the CALL is placed on the stack. |
|                             | JE/JZ       | Jump if equal/zero. |
|                             | LOOPE/LOOPZ | Loops if equal/zero. This is a conditional jump using a value stored in register ECX. The instruction first decrements ECX before testing ECX for the branch condition. |
|                             | INT/INTO    | Interrupt/Interrupt if overflow. Transfer control to an interrupt service routine. |
| String Operations           | MOVS        | Move byte, word, dword string. The instruction operates on one element of a string, indexed by registers ESI and EDI. After each string operation, the registers are automatically incremented or decremented to point to the next element of the string. |
|                             | LODS        | Load byte, word, dword of string. |
| High-Level Language Support | ENTER       | Creates a stack frame that can be used to implement the rules of a block-structured high-level language. |
|                             | LEAVE       | Reverses the action of the previous ENTER. |
|                             | BOUND       | Check array bounds. Verifies that the value in operand 1 is within lower and upper limits. The limits are in two adjacent memory locations referenced by operand 2. An interrupt occurs if the value is out of bounds. This instruction is used to check an array index. |
| Flag Control                | STC         | Set Carry flag. |
|                             | LAHF        | Load A register from flags. Copies SF, ZF, AF, PF, and CF bits into A register. |
| Segment Register            | LDS         | Load pointer into D segment register. |
| System Control              | HLT         | Halt. |
|                             | LOCK        | Asserts a hold on shared memory so that the Pentium has exclusive use of it during the instruction that immediately follows the LOCK. |
|                             | ESC         | Processor extension escape. An escape code that indicates the succeeding instructions are to be executed by a numeric coprocessor that supports high-precision integer and floating-point calculations. |
|                             | WAIT        | Wait until BUSY# negated. Suspends Pentium program execution until the processor detects that the BUSY pin is inactive, indicating that the numeric coprocessor has finished execution. |
| Protection                  | SGDT        | Store global descriptor table. |
|                             | LSL         | Load segment limit. Loads a user-specified register with a segment limit. |
|                             | VERR/VERW   | Verify segment for reading/writing. |
| Cache Management            | INVD        | Flushes the internal cache memory. |
|                             | WBINVD      | Flushes the internal cache memory after writing dirty lines to memory. |
|                             | INVLPG      | Invalidates a translation lookaside buffer (TLB) entry. |

## **Pentium Condition Codes**

| Status Bit | Name            | Description                                                                                                                                                     |
|------------|-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| C          | Carry           | Indicates carrying or borrowing into the left-most bit position following an arithmetic operation. Also modified by some of the shift and rotate operations.   |
| P          | Parity          | Parity of the result of an arithmetic or logic operation. 1 indicates even parity; 0 indicates odd parity.                                                      |
| A          | Auxiliary Carry | Represents carrying or borrowing between half-bytes of an 8-bit arithmetic or logic operation using the AL register.                                            |
| Z          | Zero            | Indicates that the result of an arithmetic or logic operation is 0.                                                                                             |
| S          | Sign            | Indicates the sign of the result of an arithmetic or logic operation.                                                                                           |
| O          | Overflow        | Indicates an arithmetic overflow after an addition or subtraction.                                                                                              |
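
To make the six status bits concrete, the sketch below computes them for an 8-bit addition. It is an illustration under stated assumptions, not a description of the hardware: the function name `add8_flags` is hypothetical, operands are taken as values in the range 0..255, and overflow is detected by checking whether both operands differ in sign from the result.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) as described in the table."""
    total = a + b
    result = total & 0xFF
    flags = {
        "C": int(total > 0xFF),                             # carry out of bit 7
        "P": int(bin(result).count("1") % 2 == 0),          # 1 = even parity
        "A": int((a & 0x0F) + (b & 0x0F) > 0x0F),           # carry between half-bytes
        "Z": int(result == 0),                              # result is zero
        "S": result >> 7,                                   # sign bit of the result
        "O": int(((a ^ result) & (b ^ result) & 0x80) != 0),  # signed overflow
    }
    return result, flags


print(add8_flags(0x7F, 0x01))   # 127 + 1 overflows the signed range
print(add8_flags(0xFF, 0x01))   # 255 + 1 wraps to 0 with a carry
```

In the first call the result is 0x80 with S=1 and O=1 (a positive sum that looks negative). In the second the result is 0 with C=1 and Z=1, and O=0 because -1 + 1 = 0 is a valid signed result.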

#### Pentium Conditions for Conditional Jump and SETcc Instructions

| Symbol     | Condition Tested                               | Comment |
|------------|------------------------------------------------|---------|
| A, NBE     | C=0 AND Z=0                                    | Above; Not below or equal (greater than, unsigned) |
| AE, NB, NC | C=0                                            | Above or equal; Not below (greater than or equal, unsigned); Not carry |
| B, NAE, C  | C=1                                            | Below; Not above or equal (less than, unsigned); Carry set |
| BE, NA     | C=1 OR Z=1                                     | Below or equal; Not above (less than or equal, unsigned) |
| E, Z       | Z=1                                            | Equal; Zero (signed or unsigned) |
| G, NLE     | [(S=1 AND O=1) OR (S=0 AND O=0)] AND [Z=0]     | Greater than; Not less than or equal (signed) |
| GE, NL     | (S=1 AND O=1) OR (S=0 AND O=0)                 | Greater than or equal; Not less than (signed) |
| L, NGE     | (S=1 AND O=0) OR (S=0 AND O=1)                 | Less than; Not greater than or equal (signed) |
| LE, NG     | (S=1 AND O=0) OR (S=0 AND O=1) OR (Z=1)        | Less than or equal; Not greater than (signed) |
| NE, NZ     | Z=0                                            | Not equal; Not zero (signed or unsigned) |
| NO         | O=0                                            | No overflow |
| NS         | S=0                                            | Not sign (not negative) |
| NP, PO     | P=0                                            | Not parity; Parity odd |
| O          | O=1                                            | Overflow |
| P          | P=1                                            | Parity; Parity even |
| S          | S=1                                            | Sign (negative) |
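
The condition table can be exercised with a small sketch. It assumes 8-bit operands and a compare implemented as a subtraction; the helper `flags_after_sub` and the `CONDITIONS` dictionary are hypothetical names for illustration, not part of any Pentium tool, and only the C, Z, S and O flags are modelled.

```python
def flags_after_sub(a, b, bits=8):
    """Flags after computing a - b on `bits`-bit operands (as a compare would)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "C": int(a < b),                 # a borrow was needed (unsigned a < b)
        "Z": int(result == 0),
        "S": int(bool(result & sign)),
        # Signed overflow: operands differ in sign and the result's sign differs from a's
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
    }

# A few rows of the table, written as predicates over the flags
CONDITIONS = {
    "A": lambda f: f["C"] == 0 and f["Z"] == 0,
    "B": lambda f: f["C"] == 1,
    "E": lambda f: f["Z"] == 1,
    "G": lambda f: f["S"] == f["O"] and f["Z"] == 0,
    "L": lambda f: f["S"] != f["O"],
}

# 0xFF is 255 when read as unsigned but -1 when read as signed
flags = flags_after_sub(0xFF, 0x01)
print(flags)
print({name: test(flags) for name, test in CONDITIONS.items()})
```

The same bit patterns satisfy A (above, unsigned) and L (less than, signed) at once, which is why the instruction set provides separate signed and unsigned conditions.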

## **MMX Instruction Set**

| Category      | Instruction          | Description |
|---------------|----------------------|-------------|
| Arithmetic    | PADD [B, W, D]       | Parallel add of packed eight bytes, four 16-bit words, or two 32-bit doublewords, with wraparound. |
|               | PADDS [B, W]         | Add with saturation. |
|               | PADDUS [B, W]        | Add unsigned with saturation. |
|               | PSUB [B, W, D]       | Subtract with wraparound. |
|               | PSUBS [B, W]         | Subtract with saturation. |
|               | PSUBUS [B, W]        | Subtract unsigned with saturation. |
|               | PMULHW               | Parallel multiply of four signed 16-bit words, with high-order 16 bits of 32-bit result chosen. |
|               | PMULLW               | Parallel multiply of four signed 16-bit words, with low-order 16 bits of 32-bit result chosen. |
|               | PMADDWD              | Parallel multiply of four signed 16-bit words; add together adjacent pairs of 32-bit results. |
| Comparison    | PCMPEQ [B, W, D]     | Parallel compare for equality; result is mask of 1s if true or 0s if false. |
|               | PCMPGT [B, W, D]     | Parallel compare for greater than; result is mask of 1s if true or 0s if false. |
| Conversion    | PACKUSWB             | Pack words into bytes with unsigned saturation. |
|               | PACKSS [WB, DW]      | Pack words into bytes, or doublewords into words, with signed saturation. |
|               | PUNPCKH [BW, WD, DQ] | Parallel unpack (interleaved merge) high-order bytes, words, or doublewords from MMX register. |
|               | PUNPCKL [BW, WD, DQ] | Parallel unpack (interleaved merge) low-order bytes, words, or doublewords from MMX register. |
| Logical       | PAND                 | 64-bit bitwise logical AND |
|               | PANDN                | 64-bit bitwise logical AND NOT |
|               | POR                  | 64-bit bitwise logical OR |
|               | PXOR                 | 64-bit bitwise logical XOR |
| Shift         | PSLL [W, D, Q]       | Parallel logical left shift of packed words, doublewords, or quadword by amount specified in MMX register or immediate value. |
|               | PSRL [W, D, Q]       | Parallel logical right shift of packed words, doublewords, or quadword. |
|               | PSRA [W, D]          | Parallel arithmetic right shift of packed words or doublewords. |
| Data Transfer | MOV [D, Q]           | Move doubleword or quadword to/from MMX register. |
| State Mgt     | EMMS                 | Empty MMX state (empty FP registers tag bits). |
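
The difference between wraparound and saturating arithmetic can be sketched in plain Python. This is only a model of the lane-by-lane behaviour on eight unsigned bytes; `padd_bytes` and its `mode` argument are hypothetical names, and real MMX code would use the instructions in the table on a 64-bit register.

```python
def padd_bytes(x, y, mode="wrap"):
    """Add eight packed unsigned bytes lane by lane."""
    if len(x) != 8 or len(y) != 8:
        raise ValueError("expected eight bytes per operand")
    out = []
    for a, b in zip(x, y):
        total = a + b
        if mode == "wrap":        # wraparound: keep only the low 8 bits
            out.append(total & 0xFF)
        elif mode == "usat":      # unsigned saturation: clamp at 255
            out.append(min(total, 0xFF))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return out

x = [250, 10, 128, 255, 0, 1, 2, 3]
y = [10, 20, 128, 1, 0, 1, 2, 3]
print(padd_bytes(x, y, "wrap"))
print(padd_bytes(x, y, "usat"))
```

Saturation matters for pixel data: a bright pixel that overflows should stay white rather than wrap around to a dark value.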

Copyright 2000 N. AYDIN. All rights reserved.

## **PowerPC Operation Types**

| Instruction | Description |
|-------------|-------------|
|             | **Branch-Oriented** |
| b           | Unconditional branch |
| bl          | Branch to target address and place effective address of instruction following the branch into the Link Register. |
| bc          | Branch conditional on Count Register and/or on bit in Condition Register. |
| sc          | System call to invoke an operating system service |
| trap        | Compare two operands and invoke system trap handler if specified conditions are met. |
|             | **Load/Store** |
| lwzu        | Load word and zero extend to left; update source register. |
| ld          | Load doubleword. |
| lmw         | Load multiple word; load consecutive words into contiguous registers from the target register through general-purpose register 31. |
| lswx        | Load a string of bytes into registers beginning with target register; 4 bytes per register; wrap around from register 31 to register 0. |
|             | **Integer Arithmetic** |
| add         | Add contents of two registers and place in third register. |
| subf        | Subtract contents of two registers and place in third register. |
| mullw       | Multiply low-order 32-bit contents of two registers and place 64-bit product in third register. |
| divd        | Divide 64-bit contents of two registers and place quotient in third register. |
|             | **Logical and Shift** |
| cmp         | Compare two operands and set four condition bits in the specified condition register field. |
| crand       | Condition register AND: two bits of the Condition Register are ANDed and the result placed in one of the two bit positions. |
| and         | AND contents of two registers and place in third register. |
| cntlzd      | Count number of consecutive 0 bits starting at bit zero in source register and place count in destination register. |
| rldic       | Rotate left doubleword register, AND with mask, and store in destination register. |
| sld         | Shift left bits in source register and store in destination register. |

## **PowerPC Operation Types**

| Instruction | Description |
|-------------|-------------|
|             | **Floating-Point** |
| lfs         | Load 32-bit floating-point number from memory, convert to 64-bit format, and store in floating-point register. |
| fadd        | Add contents of two registers and place in third register. |
| fmadd       | Multiply contents of two registers, add the contents of a third, and place result in fourth register. |
| fcmpu       | Compare two floating-point operands and set condition bits. |
|             | **Cache Management** |
| dcbf        | Data cache block flush; perform lookup in cache on specified target address and perform flushing operation. |
| icbi        | Instruction cache block invalidate |

## **Byte Ordering**

- How should bytes within a multi-byte word be ordered in memory?
- Some conventions
  - Suns, Macs are "Big Endian" machines
    - Least significant byte has highest address
  - Alphas, PCs are "Little Endian" machines
    - Least significant byte has lowest address

## **Byte Ordering Example**

- Big Endian
  - Least significant byte has highest address
- Little Endian
  - Least significant byte has lowest address
- Example
  - Variable x has 4-byte representation 0x01234567
  - Address given by &x is 0x100

| Address       | 0x100 | 0x101 | 0x102 | 0x103 |
|---------------|-------|-------|-------|-------|
| Big Endian    | 01    | 23    | 45    | 67    |
| Little Endian | 67    | 45    | 23    | 01    |
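
The two layouts of 0x01234567 can be reproduced with Python's built-in `int.to_bytes`. This is a minimal sketch: the base address 0x100 is taken from the example, while the variable names are illustrative only.

```python
import sys

x = 0x01234567
base = 0x100                                   # address given by &x in the example

big = x.to_bytes(4, byteorder="big")           # most significant byte first
little = x.to_bytes(4, byteorder="little")     # least significant byte first

for name, data in (("Big Endian", big), ("Little Endian", little)):
    cells = ", ".join(f"{base + i:#x}: {byte:02x}" for i, byte in enumerate(data))
    print(f"{name:13} {cells}")

# Byte order of the machine running this script
print("host byte order:", sys.byteorder)
```

Reading the bytes back with the wrong order yields a different number, which is why data exchanged between machines needs an agreed byte order.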

## **Representing Integers**



## **Representing Pointers**



Different compilers & machines assign different locations to objects

## **Representing Floats**



- Not same as integer representation, but consistent across machines
- Can see some relation to integer representation, but not obvious
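
A short sketch makes the relation visible. It assumes IEEE 754 single precision (what `struct.pack` produces for the `f` format) and uses 15213 as an arbitrary sample value that is not taken from the slides.

```python
import struct

value = 15213                                   # sample value (an assumption)
int_bytes = value.to_bytes(4, "big")            # 32-bit integer representation
float_bytes = struct.pack(">f", float(value))   # IEEE 754 single precision

print(f"int   {int_bytes.hex()}   {int.from_bytes(int_bytes, 'big'):032b}")
print(f"float {float_bytes.hex()}   {int.from_bytes(float_bytes, 'big'):032b}")
```

The integer's bits after its leading 1 reappear at the start of the float's fraction field, shifted and preceded by the sign and exponent bits. That is the relation to the integer representation, and it is not obvious from the hex digits alone.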

## **Representing Strings**




## **Example of C Data Structure**

Members of `struct { ... } s;` in declaration order:

| Member         | Value                         | Size       |
|----------------|-------------------------------|------------|
| `int a;`       | 0x1112_1314                   | word       |
| `int pad;`     |                               | word       |
| `double b;`    | 0x2122_2324_2526_2728         | doubleword |
| `char* c;`     | 0x3132_3334                   | word       |
| `char d[7];`   | 'A','B','C','D','E','F','G'   | byte array |
| `short e;`     | 0x5152                        | halfword   |
| `int f;`       | 0x6162_6364                   | word       |

Big-endian address mapping (byte addresses in hex, increasing left to right):

| Byte Address | +0  | +1  | +2  | +3 | +4  | +5  | +6  | +7  |
|--------------|-----|-----|-----|----|-----|-----|-----|-----|
| 00           | 11  | 12  | 13  | 14 |     |     |     |     |
| 08           | 21  | 22  | 23  | 24 | 25  | 26  | 27  | 28  |
| 10           | 31  | 32  | 33  | 34 | 'A' | 'B' | 'C' | 'D' |
| 18           | 'E' | 'F' | 'G' |    | 51  | 52  |     |     |
| 20           | 61  | 62  | 63  | 64 |     |     |     |     |

Little-endian address mapping (byte addresses in hex, increasing right to left):

| +7  | +6  | +5  | +4  | +3 | +2  | +1  | +0  | Byte Address |
|-----|-----|-----|-----|----|-----|-----|-----|--------------|
|     |     |     |     | 11 | 12  | 13  | 14  | 00           |
| 21  | 22  | 23  | 24  | 25 | 26  | 27  | 28  | 08           |
| 'D' | 'C' | 'B' | 'A' | 31 | 32  | 33  | 34  | 10           |
|     |     | 51  | 52  |    | 'G' | 'F' | 'E' | 18           |
|     |     |     |     | 61 | 62  | 63  | 64  | 20           |
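
The two mappings can be checked with Python's `struct` module. This is a sketch under stated assumptions: the format string `LAYOUT` is hypothetical, the padding bytes (`x`) are written out explicitly to mirror the figure, and `Q` and `I` stand in for the `double` and the pointer so that the exact bit patterns can be given as integers.

```python
import struct

# i=int a, i=pad, Q=bits of double b, I=pointer c, 7s=d[7], x=pad, h=short e, 2x=pad, i=int f
LAYOUT = "iiQI7sxh2xi"
values = (0x11121314, 0, 0x2122232425262728, 0x31323334,
          b"ABCDEFG", 0x5152, 0x61626364)

big = struct.pack(">" + LAYOUT, *values)
little = struct.pack("<" + LAYOUT, *values)

def dump(label, data):
    """Print eight bytes per row, lowest address first."""
    print(label)
    for offset in range(0, len(data), 8):
        row = " ".join(f"{byte:02x}" for byte in data[offset:offset + 8])
        print(f"  {offset:02x}: {row}")

dump("big-endian", big)
dump("little-endian", little)
```

The dump prints characters as ASCII codes (for example 41 for 'A') and always lists addresses in increasing order. Only the multi-byte scalars are byte-swapped between the two layouts; the character array `d` keeps the same order in both.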

Common file formats and their endian order

- Adobe Photoshop -- Big Endian
- BMP (Windows and OS/2 Bitmaps) -- Little Endian
- DXF (AutoCad) -- Variable
- GIF -- Little Endian
- IMG (GEM Raster) -- Big Endian
- JPEG -- Big Endian
- FLI (Autodesk Animator) -- Little Endian
- MacPaint -- Big Endian
- PCX (PC Paintbrush) -- Little Endian
- PostScript -- Not Applicable (text!)
- POV (Persistence of Vision ray-tracer) -- Not Applicable (text!)
- QTM (Quicktime Movies) -- Little Endian (on a Mac!)
- Microsoft RIFF (.WAV & .AVI) -- Both
- Microsoft RTF (Rich Text Format) -- Little Endian
- SGI (Silicon Graphics) -- Big Endian
- Sun Raster -- Big Endian
- TGA (Targa) -- Little Endian
- TIFF -- Both, Endian identifier encoded into file
- WPG (WordPerfect Graphics Metafile) -- Big Endian (on a PC!)
- XWD (X Window Dump) -- Both, Endian identifier encoded into file
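
As an illustration of an endian identifier encoded into a file, a TIFF file starts with the two bytes `II` (little endian) or `MM` (big endian), followed by the 16-bit number 42 stored in that order. The sketch below only inspects these four header bytes; the function names are hypothetical and nothing else of the format is parsed.

```python
import struct

def tiff_byte_order(header):
    """Return the struct prefix '<' or '>' selected by a TIFF header."""
    marker = header[:2]
    if marker == b"II":
        return "<"            # little endian
    if marker == b"MM":
        return ">"            # big endian
    raise ValueError("not a TIFF header")

def read_magic(header):
    """Decode the 16-bit magic number using the byte order the file declares."""
    (magic,) = struct.unpack(tiff_byte_order(header) + "H", header[2:4])
    return magic

print(read_magic(b"II\x2a\x00"))
print(read_magic(b"MM\x00\x2a"))
```

Both headers decode to the same number once the declared order is honoured, so a reader on either kind of machine can process either kind of file.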

## **Addressing Modes**

- Immediate
- Direct
- Indirect
- Register
- Register Indirect
- Displacement (Indexed)
- Stack

## Addressing Modes and Formats

## **Immediate Addressing**

- Operand is part of instruction
- Operand = address field
- e.g. ADD 5
  - Add 5 to contents of accumulator
  - Here 5 is operand
- No memory reference to fetch data
- Fast
- Limited range

Instruction:

| Opcode | Operand |
|--------|---------|

## **Direct Addressing**

- Address field contains address of operand
- Effective address (EA) = address field (A)
- e.g. ADD A
  - Add contents of cell A to accumulator
  - Look in memory at address A for operand
- Single memory reference to access data
- No additional calculations to work out effective address
- Limited address space
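
Immediate and direct addressing can be contrasted in a few lines. This is a minimal sketch of a hypothetical accumulator machine: the 12-bit address field and the helper `execute_add` are assumptions made for illustration.

```python
ADDRESS_BITS = 12                        # assumed width of the address field
memory = [0] * (1 << ADDRESS_BITS)       # direct addressing reaches only 2**12 cells
memory[40] = 7                           # the operand lives in memory cell 40

def execute_add(acc, address_field, mode):
    """Return (new accumulator, memory references needed to fetch the operand)."""
    if not 0 <= address_field < (1 << ADDRESS_BITS):
        raise ValueError("address field does not fit in the instruction")
    if mode == "immediate":
        return acc + address_field, 0            # operand = address field
    if mode == "direct":
        return acc + memory[address_field], 1    # EA = A, one memory reference
    raise ValueError(f"unknown mode: {mode}")

print(execute_add(10, 40, "immediate"))   # adds the constant 40
print(execute_add(10, 40, "direct"))      # adds the contents of cell 40
```

The same field width limits both modes: it bounds the magnitude of an immediate operand and the size of the directly addressable memory.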

## **Direct Addressing Diagram**



## **Indirect Addressing Diagram**



## **Indirect Addressing**

- Memory cell pointed to by address field contains the address of (pointer to) the operand
- EA = (A)
- Large address space
  - 2<sup>n</sup> where n = word length
- May be nested, multilevel, cascaded
- Multiple memory accesses to find operand
  - Hence slower
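
The cost of indirection can be made concrete by counting memory references. In this sketch memory is a plain dictionary and `fetch_indirect` is a hypothetical helper; `levels` models the nested (multilevel) case.

```python
def fetch_indirect(memory, a, levels=1):
    """Follow `levels` pointers starting at address field `a`.

    Returns (operand, number of memory references).
    """
    address = a
    references = 0
    for _ in range(levels):
        address = memory[address]    # read a pointer
        references += 1
    operand = memory[address]        # finally read the operand itself
    references += 1
    return operand, references

memory = {10: 200, 200: 300, 300: 99}
print(fetch_indirect(memory, 10))             # single level: EA = (10) = 200
print(fetch_indirect(memory, 10, levels=2))   # two levels: EA = ((10)) = 300
```

Each extra level adds one memory reference before the operand is reached.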

## **Register Addressing**

- Operand is held in register named in address field
- EA = R
- Limited number of registers
- Very small address field needed
  - Shorter instructions
  - Faster instruction fetch
- No memory access
- Very fast execution
- Very limited address space
- Multiple registers helps performance
  - Requires good assembly programming or compiler writing

## **Register Addressing Diagram**



## **Register Indirect Addressing**

- Operand is held in memory cell pointed to by contents of register R named in address field
- EA = (R)
- Large address space (2<sup>n</sup>)
- One fewer memory access than indirect addressing
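
A sketch of the memory-reference counts, using a hypothetical register file and memory held in dictionaries. The three helper names are illustrative only.

```python
registers = {"R1": 200}
memory = {10: 200, 200: 42}

def register_operand(name):
    return registers[name], 0            # EA = R: operand is in the register

def register_indirect_operand(name):
    return memory[registers[name]], 1    # EA = (R): one memory reference

def indirect_operand(a):
    return memory[memory[a]], 2          # EA = (A): pointer read plus operand read

print(register_operand("R1"))
print(register_indirect_operand("R1"))
print(indirect_operand(10))
```

Register indirect and memory indirect reach the same operand here, but the pointer comes from a register instead of memory, which saves one reference.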

## **Register Indirect Addressing Diagram**



## **Displacement Addressing**

- EA = A + (R)
- Address field holds two values
  - A = base value
  - R = register that holds displacement
  - or vice versa

## **Displacement Addressing Diagram**



## **Relative Addressing**

- A version of displacement addressing
- R = Program counter (PC)
- EA = A + (PC)
- i.e. get the operand from the cell A locations away from the current location pointed to by PC
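
A sketch of the effective-address calculation for relative addressing. It assumes an 8-bit address field interpreted as a two's complement number, so that both forward and backward references are possible; the field width and the helper name are assumptions, not part of the slides.

```python
def relative_target(pc, a, bits=8):
    """EA = A + (PC), with the `bits`-bit field A read as two's complement."""
    sign = 1 << (bits - 1)
    displacement = (a & (sign - 1)) - (a & sign)
    return pc + displacement

print(hex(relative_target(0x100, 0x05)))   # 5 cells forward
print(hex(relative_target(0x100, 0xFB)))   # 0xFB encodes -5: 5 cells backward
```

Because the field holds only a small offset, the instruction stays short while still reaching nearby code or data.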

## **Base-Register Addressing**

- A holds displacement
- R holds pointer to base address
- R may be explicit or implicit
- e.g. segment registers in 80x86

## **Indexed Addressing**

- A = base
- R = displacement
- EA = A + (R)
- Good for accessing arrays
  - EA = A + (R)
  - R++
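
Walking an array with indexed addressing can be sketched as follows. The base address 16, the array contents and the variable names are illustrative assumptions.

```python
memory = [0] * 32
base = 16                                   # A: base address of the array
memory[base:base + 4] = [3, 1, 4, 1]

r = 0                                       # R: index register
total = 0
while r < 4:
    ea = base + r                           # EA = A + (R)
    total += memory[ea]
    r += 1                                  # R++ after each access

print(total)
```

The instruction itself never changes: only the index register is incremented, so one instruction in a loop can visit every element.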


## **Combinations**

- Postindex
  - EA = (A) + (R)
- Preindex
  - EA = (A + (R))
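
The two combinations differ only in the order of indexing and indirection. A small sketch with hypothetical memory contents:

```python
memory = {10: 100, 13: 500, 103: "postindexed operand", 500: "preindexed operand"}
A, R = 10, 3            # address field and contents of the index register

post_ea = memory[A] + R     # postindex: indirection first, then add the index
pre_ea = memory[A + R]      # preindex: add the index first, then indirection

print(post_ea, memory[post_ea])
print(pre_ea, memory[pre_ea])
```

With the same A and (R), the two modes generally land on different operands.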

## **Stack Addressing**

- Operand is (implicitly) on top of stack
- e.g. ADD
  - Pop top two items from stack and add
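
A sketch of a zero-address ADD on a stack machine, using a Python list whose last element is the top of the stack. Pushing the sum back onto the stack is the usual convention and is assumed here.

```python
def stack_add(stack):
    """Pop the top two items, add them, and push the result."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

stack = [2, 3, 4]       # 4 is on top
stack_add(stack)
print(stack)
```

The instruction names no operands at all: their location is implied by the stack pointer.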

#### Summary of basic addressing modes

| Mode              | Algorithm         | Principal Advantage | Principal Disadvantage     |
|-------------------|-------------------|---------------------|----------------------------|
| Immediate         | Operand = A       | No memory reference | Limited operand magnitude  |
| Direct            | EA = A            | Simple              | Limited address space      |
| Indirect          | EA = (A)          | Large address space | Multiple memory references |
| Register          | EA = R            | No memory reference | Limited address space      |
| Register indirect | EA = (R)          | Large address space | Extra memory reference     |
| Displacement      | EA = A + (R)      | Flexibility         | Complexity                 |
| Stack             | EA = top of stack | No memory reference | Limited applicability      |

## **Instruction Formats**

- Layout of bits in an instruction
- Includes opcode
- Includes (implicit or explicit) operand(s)
- Usually more than one instruction format in an instruction set

## **Instruction Length**

- Affected by and affects:
  - Memory size
  - Memory organization
  - Bus structure
  - CPU complexity
  - CPU speed
- Trade off between powerful instruction repertoire and saving space

## **Allocation of Bits**

- Number of addressing modes
- Number of operands
- Register versus memory
- Number of register sets
- Address range

- Address granularity
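
The trade-offs in the list can be seen by packing fields into a fixed-length word. The split below (4-bit opcode, 2-bit mode, 2-bit register, 8-bit address in a 16-bit instruction) is a hypothetical allocation chosen for illustration, not a format defined in these slides.

```python
# (field name, width in bits), most significant field first; widths sum to 16
FIELDS = (("opcode", 4), ("mode", 2), ("reg", 2), ("address", 8))

def encode(**values):
    """Pack the named fields into one instruction word."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def decode(word):
    """Split an instruction word back into its fields."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields

word = encode(opcode=0b1000, mode=0b11, reg=0b00, address=0xA0)
print(f"{word:016b}")
print(decode(word))
```

Every bit given to one field is taken from another: with the word length fixed, adding addressing modes or registers shrinks the address range, and vice versa.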