





#### Exam Wed 3.3. at 16.00 in auditorium A111

- 2.5 hours, three or four questions
- You can write all answers on the same paper, using pencil or pen
- There is no need for a calculator, but a simple one is allowed
  - If math is needed, you can just write the formula; without a calculator you do not need to compute the numeric result

Computer Organization II, Spring 2010, Tiina Niklander

25.2.2010



#### For the exam

- Go through the exercises
- Read the book and lecture slides
  - If the slides say nothing about a subsection, the exam very probably has no question on it
- The review questions in the slides are good hints!
- See the collection of questions on the 2006 course page, where Teemu Kerola has collected several years of exam questions
  - Direct link to the collection: http://www.cs.helsinki.fi/u/kerola/tikra/kokeet/




# Lecture 1: Part 1 Overview (Ch 1-2 + 1-8) and Chapter 20 Digital logic

- Overview
  - No questions focus only on this, but the content may be needed to understand later chapters
- Chapter 7 I/O (7.1 - 7.5)
  - MUST KNOW: memory-mapped I/O, interrupt-driven I/O, DMA
  - (covered in an earlier course, but still valid)
- Digital logic
  - Boolean algebra, gates and flip-flops
  - No optimization, no Karnaugh maps
  - MUST KNOW:
    - from Boolean tables to gates
    - basic functionality of flip-flops and basic circuits

7th ed, 2006: Appendix B: Digital logic












## Lecture 2: Bus, Chapter 3

- Sections 3.1 - 3.3 were part of lecture 1
  - Needed to understand the other sections
  - MUST KNOW: Instruction cycle, interrupts
- Sections 3.4 and 3.5: Bus and PCI
  - MUST KNOW: terms like speed, width, timing, signaling, arbitration
  - MUST KNOW: PCI read, PCI write sequences






## **Bus characteristics**

- Timing (*ajoitus*, tahdistus)
  - Synchronous (tahdistettu)
    - Regular clock cycle (kellopulssi): a sequence of 0s and 1s
  - Asynchronous
    - Separate signals when needed
  - Shared traffic rules: everyone knows what is going to happen next
- Efficiency (tehokkuus)
  - Bandwidth (kaistanleveys)
    - How many bits per second








#### Packet-switched PCI Express (PCIe, PCI-E)

- PCI bus is too slow for some devices
- Replaces PCI bus (and possibly other I/O buses)
  - Already available on new computers
- Hub on motherboard acting as a crossbar switch (*kytkin*)
- Based on point-to-point connections (*kaksipisteyhteys*)
  - Full-duplex, one lane has two lines (one send, one receive)
  - One device can use one or more (2, 4, 8, 16, 32) lanes
- Data stream (serial transfer)
  - Small packets (header + payload), bits in sequence
- No reservation, no control signals.
  - Each device may send at any time, when it wishes
  - Packet header contains the control information (like target)
- Data rate on one lane 250MB/s (future 3rd gen: 1GB/s)
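
As a rough illustration of the lane figures above, the following sketch computes the one-directional data rate of a link. It assumes that lanes aggregate linearly and ignores packet header overhead; the function name `link_rate_mb_s` is hypothetical.

```python
# Lane counts listed on the slide: one lane, or 2, 4, 8, 16, 32 lanes.
ALLOWED_LANES = (1, 2, 4, 8, 16, 32)


def link_rate_mb_s(lanes, per_lane_mb_s=250):
    """Return the one-directional link rate in MB/s.

    Assumption: the rate scales linearly with the number of lanes and
    packet headers are not subtracted from the payload rate.
    """
    if lanes not in ALLOWED_LANES:
        raise ValueError(f"unsupported lane count: {lanes}")
    return lanes * per_lane_mb_s


if __name__ == "__main__":
    for lanes in ALLOWED_LANES:
        print(f"x{lanes}: {link_rate_mb_s(lanes)} MB/s")
```

With the slide's 250 MB/s per lane, a 16-lane link would carry 4000 MB/s in each direction under these assumptions.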






#### Lecture 3: Cache and memory, Chapters 4 & 5

#### Cache

- MUST KNOW: all content, such as cache organization, cache usage, access, and write policies
- Mapping: direct mapping, fully associative, set-associative

#### Memory

- The most interesting part of the memory chapter, 5.2 error correction, is NOT part of the course
- Not that important a chapter
- Chapter 6 (external memory): skip




## **Principle of locality**

- In any given time period, memory references occur only to a small subset of the whole address space
- Temporal locality (ajallinen)
  - it is likely that a data item referenced a <u>short time ago</u> will be referenced <u>again</u> soon
- Spatial locality (alueellinen)
  - it is likely that data items close to the one referenced a short time ago will be referenced soon




#### **Cache Design**

- Cache Size
- Mapping Function
  - Direct
  - Associative
  - Set Associative
- Replacement Algorithm
  - Least recently used (LRU)
  - First in first out (FIFO)
  - Least frequently used (LFU)
  - Random
- Write Policy
  - Write through
  - Write back
  - Write once
- Line Size
- Number of caches
  - Single or two level
  - Unified or split

- Cache Size & Line Size
  - Many blocks help for temporal locality
  - Large blocks help for spatial locality
  - Larger cache is slower
  - Multi-level cache
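
The effect of line size on spatial locality can be tried out with a minimal direct-mapped cache model. This is only a sketch: the function names and the byte-addressed layout are assumptions, and no replacement or write policy is modelled.

```python
def split_address(addr, line_size, num_lines):
    """Split a byte address into (tag, line index, offset) for direct mapping."""
    offset = addr % line_size      # position inside the cache line
    block = addr // line_size      # memory block number
    index = block % num_lines      # the only line this block may occupy
    tag = block // num_lines       # identifies which block is in the line
    return tag, index, offset


def count_hits(addresses, line_size, num_lines):
    """Count cache hits for a sequence of byte addresses (cache starts empty)."""
    tags = [None] * num_lines
    hits = 0
    for addr in addresses:
        tag, index, _ = split_address(addr, line_size, num_lines)
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag      # miss: load the block into its line
    return hits


if __name__ == "__main__":
    sequential = range(64)
    # Same total size (64 bytes), different line sizes.
    print(count_hits(sequential, line_size=16, num_lines=4))
    print(count_hits(sequential, line_size=4, num_lines=16))
```

For a purely sequential scan, the larger line size gives more hits with the same total capacity, which is the spatial-locality argument in the list above.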


#### Typical sizes:

L1: 8 KB - 64 KB

L2: 256 KB - 8 MB

L3: 2 MB - 48 MB

(Sta09 Table 4.3)












# Lecture 4: Memory management, Chapters 8.3 – 8.6

- Memory management
  - MUST KNOW: virtual memory organization, page table, address translation, TLB, hierarchical page table like Pentium and ARM, combining paging, TLB and cache

7th ed, 2006: 8.4 PowerPC (instead of ARM)
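
Address translation with a TLB in front of a page table can be sketched as follows. This is a simplified single-level model, not the hierarchical Pentium or ARM scheme; the 4 KB page size, the dictionary-based tables and the function name `translate` are all assumptions.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(vaddr, page_table, tlb):
    """Translate a virtual address to a physical one.

    page_table and tlb both map page number -> frame number.
    Returns (physical address, where the mapping was found).
    """
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page in tlb:                      # TLB hit: no page table access
        frame, source = tlb[page], "tlb"
    elif page in page_table:             # TLB miss: consult the page table
        frame, source = page_table[page], "page table"
        tlb[page] = frame                # remember the mapping for next time
    else:
        raise LookupError(f"page fault on page {page}")
    return frame * PAGE_SIZE + offset, source


if __name__ == "__main__":
    page_table = {0: 5, 1: 9}
    tlb = {}
    print(translate(4100, page_table, tlb))  # first access goes via the page table
    print(translate(4100, page_table, tlb))  # second access hits the TLB
```

The physical address produced here would then be the one presented to the cache, assuming a physically addressed cache.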














## Lecture 5: Computer arithmetic, Chapter 9

- Integer representation
  - MUST KNOW: sign-magnitude and two's complement, how to convert between different bit lengths
- Integer arithmetic
  - MUST KNOW: add, subtract, multiply, divide, Booth algorithm
- Floating-point representation
  - MUST KNOW: IEEE Standard
- Floating-point arithmetic
  - MUST KNOW: overflow and underflow, general principles for calculations with floating-point numbers (not a detailed algorithm)
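
As a study aid for the Booth algorithm mentioned above, here is a minimal sketch for `n`-bit two's complement operands. The register names `a`, `q`, `q_1` and `m` are illustrative, and the sketch assumes the multiplicand is not the most negative `n`-bit value, which this simple version does not handle.

```python
def booth_multiply(m, q, n):
    """Multiply two n-bit two's complement integers with Booth's algorithm.

    m is the multiplicand, q the multiplier. Returns the signed 2n-bit product.
    Assumption: m is not the most negative n-bit value (-2**(n-1)).
    """
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    m &= mask
    a, q, q_1 = 0, q & mask, 0
    for _ in range(n):
        pair = (q & 1, q_1)
        if pair == (1, 0):
            a = (a - m) & mask          # start of a run of 1s: subtract
        elif pair == (0, 1):
            a = (a + m) & mask          # end of a run of 1s: add
        # Arithmetic shift right of the combined register A, Q, Q-1.
        q_1 = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & sign)       # keep the sign bit of A
    product = (a << n) | q
    if product & (1 << (2 * n - 1)):    # interpret as signed 2n-bit value
        product -= 1 << (2 * n)
    return product


if __name__ == "__main__":
    print(booth_multiply(7, 3, 4))
    print(booth_multiply(-7, 3, 4))
```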


## **Two's complement**

- Example: $-57 = 1100\ 0111$
- Negation
  - 1: invert all bits: 0011 1000
  - 2: add 1: 0011 1001 = 57
  - 3: special cases
    - Ignore carry bit (ylivuotobitti)
    - Sign really changed?
      - Cannot negate the smallest negative number, results in an exception
      - $-128 = 1000\ 0000$: inverted 0111 1111, plus 1 gives 1000 0000 again
- Simple hardware
- Easy to expand (sign extension), e.g. as a 16-bit sequence
  - $57 = 0011\ 1001 = 0000\ 0000\ 0011\ 1001$
  - $-57 = \underline{1}100\ 0111 = \underline{1111\ 1111}\ \underline{1}100\ 0111$
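
The negation and sign-extension steps can be checked with a short sketch that works on bit strings. The function names are hypothetical, and raising `OverflowError` stands in for the exception mentioned for the smallest negative number.

```python
def negate(bits):
    """Negate a two's complement bit string: invert all bits, then add 1."""
    width = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    # Adding 1 modulo 2**width drops the carry out of the top bit.
    result = format((int(inverted, 2) + 1) % (1 << width), f"0{width}b")
    if result == bits and bits[0] == "1":
        # Sign did not change: the smallest negative value has no positive twin.
        raise OverflowError("cannot negate the smallest negative number")
    return result


def sign_extend(bits, width):
    """Extend a two's complement bit string to `width` bits by copying the sign bit."""
    return bits[0] * (width - len(bits)) + bits


if __name__ == "__main__":
    print(negate("11000111"))           # bit pattern of -57
    print(sign_extend("11000111", 16))
```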









# Floating-point arithmetic

| Floating Point Numbers     | Arithmetic Operations                                                                         |
|----------------------------|-----------------------------------------------------------------------------------------------|
| $X = X_s \times B^{X_E}$   | $X + Y = \left(X_s \times B^{X_E - Y_E} + Y_s\right) \times B^{Y_E}$ (for $X_E \le Y_E$)       |
| $Y = Y_s \times B^{Y_E}$   | $X - Y = \left(X_s \times B^{X_E - Y_E} - Y_s\right) \times B^{Y_E}$ (for $X_E \le Y_E$)       |
|                            | $X \times Y = (X_s \times Y_s) \times B^{X_E + Y_E}$                                          |
|                            | $\frac{X}{Y} = \left(\frac{X_s}{Y_s}\right) \times B^{X_E - Y_E}$                             |

(Sta06 Table 9.5)

$$X = 0.3 \times 10^2 = 30$$

$$Y = 0.2 \times 10^3 = 200$$

$$X + Y = (0.3 \times 10^{2-3} + 0.2) \times 10^3 = 0.23 \times 10^3 = 230$$

$$X - Y = (0.3 \times 10^{2-3} - 0.2) \times 10^3 = (-0.17) \times 10^3 = -170$$

$$X \times Y = (0.3 \times 0.2) \times 10^{2+3} = 0.06 \times 10^5 = 6000$$

$$X \div Y = (0.3 \div 0.2) \times 10^{2-3} = 1.5 \times 10^{-1} = 0.15$$
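
The four formulas can be replayed in a few lines of code. This is a sketch under stated assumptions: numbers are (significand, exponent) pairs in base 10, exact fractions are used so that no rounding occurs, and results are not normalized. All function names are hypothetical.

```python
from fractions import Fraction

B = 10  # base used in the worked example


def fp_add(x, y):
    """X + Y: align the significand of the smaller exponent, then add."""
    (xs, xe), (ys, ye) = x, y
    if xe > ye:                          # make X the operand with the smaller exponent
        (xs, xe), (ys, ye) = (ys, ye), (xs, xe)
    return (xs * Fraction(B) ** (xe - ye) + ys, ye)


def fp_sub(x, y):
    """X - Y, computed as X + (-Y)."""
    ys, ye = y
    return fp_add(x, (-ys, ye))


def fp_mul(x, y):
    """X * Y: multiply significands, add exponents."""
    return (x[0] * y[0], x[1] + y[1])


def fp_div(x, y):
    """X / Y: divide significands, subtract exponents."""
    return (x[0] / y[0], x[1] - y[1])


def value(x):
    """Plain numeric value of a (significand, exponent) pair."""
    return x[0] * Fraction(B) ** x[1]


if __name__ == "__main__":
    X = (Fraction(3, 10), 2)   # 0.3 * 10^2
    Y = (Fraction(2, 10), 3)   # 0.2 * 10^3
    for op in (fp_add, fp_sub, fp_mul, fp_div):
        print(op.__name__, float(value(op(X, Y))))
```

A real floating-point unit would additionally normalize the result and check for overflow and underflow of the exponent.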




## Lecture 6: Instruction sets, Chapters 10 & 11

- MUST KNOW: everything about the instruction structure, representation, data types, addressing, instruction formats, also Pentium and ARM
- Specific instruction functionalities were covered in an earlier course. You need to know enough to be able to handle example 'programs' as in the exercises.
  - So there is no need to memorize specific instruction types and their mnemonic representations!

7th ed, 2006: PowerPC instead of ARM. You may need to read the IA-64 Predication from 15.3 for the conditional execution of instructions.




#### **Addressing modes**

(Sta06 Table 11.1)

| Mode              | Algorithm         | Principal Advantage | Principal Disadvantage     |
|-------------------|-------------------|---------------------|----------------------------|
| Immediate         | Operand = A       | No memory reference | Limited operand magnitude  |
| Direct            | EA = A            | Simple              | Limited address space      |
| Indirect          | EA = (A)          | Large address space | Multiple memory references |
| Register          | Operand = (R)     | No memory reference | Limited address space      |
| Register indirect | EA = (R)          | Large address space | Extra memory reference     |
| Displacement      | EA = A + (R)      | Flexibility         | Complexity                 |
| Stack             | EA = top of stack | No memory reference | Limited applicability      |

- EA = Effective Address
- (A) = content of memory location A
- (R) = content of register R
- Stack: one register holds the address of the top-most stack item
- Stack: a register (or two) may hold the top stack item (or two)
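
The "Algorithm" column of the table can be read as code. The sketch below fetches an operand for each mode from a toy memory and register file; the memory contents, the register names and the use of an `SP` register for the stack mode are all assumptions made for illustration.

```python
def fetch_operand(mode, memory, registers, a=None, r=None):
    """Return the operand for an addressing mode.

    a is the address field A, r the name of register R.
    memory maps address -> value, registers maps name -> value.
    """
    if mode == "immediate":
        return a                              # Operand = A
    if mode == "direct":
        return memory[a]                      # EA = A
    if mode == "indirect":
        return memory[memory[a]]              # EA = (A)
    if mode == "register":
        return registers[r]                   # Operand = (R)
    if mode == "register indirect":
        return memory[registers[r]]           # EA = (R)
    if mode == "displacement":
        return memory[a + registers[r]]       # EA = A + (R)
    if mode == "stack":
        return memory[registers["SP"]]        # EA = top of stack (assumed SP register)
    raise ValueError(f"unknown addressing mode: {mode}")


if __name__ == "__main__":
    memory = {100: 200, 150: 7, 200: 42}
    registers = {"R1": 200, "R2": 50, "SP": 150}
    print(fetch_operand("indirect", memory, registers, a=100))
    print(fetch_operand("displacement", memory, registers, a=100, r="R2"))
```

Counting the `memory[...]` lookups in each branch reproduces the table's remarks about extra memory references.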












# Lecture 7&8: CPU structure and function, Chapter 12

- MUST KNOW: Everything, but not the tiny details of processors.
- Most important issues:
  - Instruction cycle details
  - Hazards, dependencies
  - Branching and pipelines
  - Register organization (different register types)
  - Typical program status word (PSW)

7th ed, 2006: 12.6 PowerPC (instead of ARM)














## **Data dependency**

- Read after Write (RAW) (a.k.a. true or flow dependency)
  - Occurs if the succeeding read takes place before the preceding write operation is complete
- Write after Read (WAR) (a.k.a. antidependency)
  - Occurs if the succeeding write operation completes before the preceding read operation takes place
- Write after Write (WAW) (a.k.a. output dependency)
  - Occurs when the two write operations take place in the reverse order of the intended sequence
- WAR and WAW are possible only in architectures where instructions can finish in a different order than the program order
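
The three dependency types can be detected by comparing the registers two instructions read and write. In this sketch an instruction is modelled as a hypothetical `(destination, sources)` pair; it only reports which dependencies exist between an earlier and a later instruction, not whether a given pipeline would actually stall.

```python
def dependencies(first, second):
    """Return the set of dependencies of `second` on the earlier `first`.

    Each instruction is a pair (destination register, set of source registers).
    """
    dest1, src1 = first
    dest2, src2 = second
    found = set()
    if dest1 in src2:
        found.add("RAW")   # second reads what first writes
    if dest2 in src1:
        found.add("WAR")   # second overwrites what first reads
    if dest1 == dest2:
        found.add("WAW")   # both write the same register
    return found


if __name__ == "__main__":
    i1 = ("R1", {"R2", "R3"})   # R1 <- R2 + R3
    i2 = ("R4", {"R1"})         # R4 <- R1 + 1
    i3 = ("R2", {"R5"})         # R2 <- R5 + 1
    print(dependencies(i1, i2))
    print(dependencies(i1, i3))
```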




## **Dealing with branches**

- Delayed branch
- Multiple instruction streams
  - Speculative execution
- Prefetch branch target
- Loop buffer
- Branch prediction
  - Static: always taken vs. never taken
  - Dynamic: based on Branch History Table
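
A dynamic predictor built on a branch history table might look like the sketch below. It assumes one common scheme, a 2-bit saturating counter per branch address, which the slide itself does not specify; the class and function names and the initial counter value are hypothetical.

```python
class BranchHistoryTable:
    """Maps a branch address to a 2-bit saturating counter (0..3)."""

    def __init__(self):
        self.counters = {}

    def predict(self, addr):
        # Counter values 2 and 3 mean "predict taken"; unknown branches start at 1.
        return self.counters.get(addr, 1) >= 2

    def update(self, addr, taken):
        counter = self.counters.get(addr, 1)
        self.counters[addr] = min(counter + 1, 3) if taken else max(counter - 1, 0)


def count_correct(outcomes, addr=0x40):
    """Count correct predictions for one branch over a sequence of outcomes."""
    table = BranchHistoryTable()
    correct = 0
    for taken in outcomes:
        if table.predict(addr) == taken:
            correct += 1
        table.update(addr, taken)
    return correct


if __name__ == "__main__":
    loop_branch = [True] * 9 + [False]   # taken nine times, then the loop exits
    print(count_correct(loop_branch), "of", len(loop_branch))
```

For comparison, a static always-taken prediction would be right for every taken outcome in the same sequence, while never-taken would be right only at the loop exit.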






## Lecture 8: RISC, Chapter 13

- MUST KNOW: Everything except 13.6 MIPS; from 13.7 SPARC only the register set is needed
- RISC vs CISC
- Load/Store architecture
- RISC pipelining
- Register windows, register optimization








## Lecture 9: Superscalar, Chapter 14

- MUST KNOW: Everything except the tiny details of the processors
- In-order / out-of-order issue / complete
- Instruction selection window
- Register renaming

7th ed, 2006: 14.4 PowerPC (instead of ARM)






# Register renaming (rekistereiden uudelleennimeäminen)

- One cause for some of the dependencies is the usage of names
  - The same name could be used for several independent elements
  - Thus, instructions have unneeded write (output) dependencies and antidependencies
  - Causing unnecessary waits
- Solution: Register renaming
  - Hardware must have more registers (than visible to the programmer and compiler)
  - Hardware allocates new real registers during execution in order to avoid name-based dependencies (nimiriippuvuus)
- Need
  - More internal registers (register files, register set), e.g. Pentium II has 40 working registers
  - Hardware that is capable of allocating and managing registers and performing the needed mapping


$R3 \leftarrow R3 + R5$

$R4 \leftarrow R3 + 1$

$R3 \leftarrow R5 + 1$

$R7 \leftarrow R3 + R4$
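
The renaming idea can be sketched in software: every write to a register gets a fresh physical name, and every read uses the newest name. The `_n` suffix notation and the `(destination, sources)` instruction model are assumptions for illustration; real hardware draws from a limited pool of physical registers, which this sketch ignores.

```python
def rename(instructions):
    """Rename registers so that only true (RAW) dependencies remain.

    instructions: list of (destination, [source registers]) pairs;
    immediate constants are simply left out of the source list.
    """
    version = {}                              # architectural register -> write count

    def current(reg):
        return f"{reg}_{version.get(reg, 0)}"

    renamed = []
    for dest, sources in instructions:
        new_sources = [current(s) for s in sources]   # read the newest versions
        version[dest] = version.get(dest, 0) + 1      # allocate a fresh register
        renamed.append((current(dest), new_sources))
    return renamed


if __name__ == "__main__":
    program = [
        ("R3", ["R3", "R5"]),   # R3 <- R3 + R5
        ("R4", ["R3"]),         # R4 <- R3 + 1
        ("R3", ["R5"]),         # R3 <- R5 + 1
        ("R7", ["R3", "R4"]),   # R7 <- R3 + R4
    ]
    for dest, sources in rename(program):
        print(dest, "<-", sources)
```

After renaming, the third instruction writes a different physical register than the first one wrote and the second one reads, so it no longer has to wait for them.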







# Lecture 10: Control-Unit, Chapters 15 & 16

- Chapter 15: everything but the tiny details of processors
- Chapter 16: 16.1 - 16.3
- Micro-operation sequences in different phases of the execution cycle
- Control signals

7th ed, 2006: Chapters 16 & 17.1 -17.3










#### **Next microinstruction?**

- Selection normally based on flags
- Explicit: both addresses in the instruction
- Implicit: sequentially to the next one, but a 'jump target' in the instruction
- Variable format: separate jump instructions use the bits for an address, signal instructions use the same bits for signals
- Address generation during execution:
  - Address combined directly from op-code and flags
- Subroutines and residual control: possibility to store one return address




## Lecture 11: Parallel processing and multicore, Chapters 17 & 18

- Sections 17.1 - 17.6 are in the exam
- Section 18.3, multicore organization, might be in the exam
- Most important: cache coherence and MESI
- Other issues: SMP, NUMA and Clusters

7th ed, 2006: Chapter 18 Parallel Processing; multicore organization is not in the book
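
For revising MESI, the state changes of a single cache line can be written as a small function. This is a simplified sketch: the event names and the `shared_elsewhere` flag are hypothetical, and bus transactions and write-backs to memory are not modelled.

```python
def next_state(state, event, shared_elsewhere=False):
    """Return the next MESI state of one cache line.

    state: "M", "E", "S" or "I".
    event: "local read", "local write" (this processor), or
           "snoop read", "snoop write" (another processor, seen on the bus).
    shared_elsewhere: on a read miss, whether another cache holds the line.
    """
    if state not in ("M", "E", "S", "I"):
        raise ValueError(f"unknown state: {state}")
    if event == "local read":
        if state == "I":                      # read miss
            return "S" if shared_elsewhere else "E"
        return state                          # read hit keeps the state
    if event == "local write":
        return "M"                            # line becomes modified, other copies invalid
    if event == "snoop read":
        return "I" if state == "I" else "S"   # another cache now shares the line
    if event == "snoop write":
        return "I"                            # another cache takes ownership
    raise ValueError(f"unknown event: {event}")


if __name__ == "__main__":
    state = "I"
    for event in ("local read", "local write", "snoop read", "snoop write"):
        state = next_state(state, event)
        print(event, "->", state)
```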




# **Example exam questions**

- Available from the 2006 course page:
  - http://www.cs.helsinki.fi/u/kerola/tikra/kokeet/
- The page contains earlier exams, but many of them are only in Finnish because there were very few international students at that time
- Kk is a course exam, ek is a separate exam
- If the name has **e** or **en** at the end, the questions are in English




