









# Multiple Instruction, Multiple Data Stream (MIMD)

- MIMD systems differ in how the processors communicate
- Symmetric Multiprocessor (SMP)
  - Tightly coupled communication via shared memory
  - Processors share a single memory or memory pool, accessed over a shared bus
  - Memory access time of a given memory location is approximately the same for each processor
- Non-uniform memory access (NUMA)
  - Tightly coupled communication via shared memory
  - Access times to different regions of memory may differ
- Clusters
  - Loosely coupled, no shared memory
  - Communication via fixed path or network connections
  - Collection of independent uniprocessors or SMPs

Computer Organization II, Autumn 2010, Teemu Kerola

1.12.2010



# SMP - Symmetric Multiprocessor

- Two or more similar processors of comparable capacity
- All processors can perform the same functions (hence symmetric)
- Connected by a bus or other internal connection
- Share same memory and I/O
- I/O access to same devices through same or different channels
- Memory access time is approximately the same for each processor
- System controlled by an integrated operating system
  - Provides interaction between processors
  - Interaction at job, task, file and data element levels


















# Cache and Data Consistency

- Multiple processors with their own caches
  - Multiple copies of same data in different caches
  - Concurrent modification of the same data
- Could result in an inconsistent view of memory
  - Inconsistency: the values held in different caches differ
- Write-back policy
  - Write first to the local cache and only later to memory
- Write-through policy
  - The value is written to memory as soon as it is changed
  - Other caches must monitor memory traffic
- Solution: maintain cache coherence
  - Keep recently used variables in appropriate cache(s), while maintaining the consistency of shared variables!
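
The inconsistency described above can be reproduced with a small simulation. This is a minimal sketch, not a model of any real hardware: the class `WriteBackCache`, the address `0x10`, and the `flush` method are hypothetical names chosen for illustration. It assumes two private write-back caches over one shared memory, with no coherence protocol at all.

```python
class WriteBackCache:
    """A private cache with a write-back policy and NO coherence protocol.

    Hypothetical model for illustration only.
    """

    def __init__(self, memory):
        self.memory = memory   # shared main memory: dict of address -> value
        self.lines = {}        # address -> locally cached value
        self.dirty = set()     # addresses modified locally, not yet written back

    def read(self, addr):
        if addr not in self.lines:             # miss: fetch from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]                # hit: memory is not consulted

    def write(self, addr, value):
        self.lines[addr] = value               # update only the local copy
        self.dirty.add(addr)

    def flush(self):
        for addr in self.dirty:                # write dirty lines back to memory
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


memory = {0x10: 1}
cache_a = WriteBackCache(memory)
cache_b = WriteBackCache(memory)

cache_a.read(0x10)            # both processors cache the shared variable
cache_b.read(0x10)
cache_a.write(0x10, 2)        # processor A modifies its local copy

stale = cache_b.read(0x10)    # B still reads the old value from its own cache
in_memory = memory[0x10]      # memory is also out of date until A writes back
cache_a.flush()
after_flush = (memory[0x10], cache_b.read(0x10))
```

Even after A writes the line back, B keeps returning its stale copy, because nothing tells B's cache that the line has changed. That missing notification is what a coherence protocol supplies.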




# **Software Solutions for Cache Coherence**

- Compiler and operating system deal with the problem
- Overhead transferred to compile time
- Design complexity transferred from hardware to software
- However, software tends to make conservative decisions
  - Leads to inefficient cache utilization, e.g. shared variables are never cached
- Analyze code to determine safe periods for caching shared variables




# **Hardware Solutions for Cache Coherence**

- Dynamic recognition of potential problems at run time
- More efficient use of cache, transparent to programmer
- Directory protocols
  - Collect and maintain information about which caches hold copies of each line
  - Directory stored in main memory
  - Requests are checked against directory
  - Creates a central bottleneck
  - Effective in large scale systems with complex interconnections
- Snoopy protocols
  - Distribute cache coherence responsibility to all cache controllers
  - Each cache must recognize when a line it holds is shared with other caches
  - Updates are announced to the other caches
  - Well suited to bus-based multiprocessors
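
The directory approach can be sketched as a table that maps each line to the set of caches holding a copy. The following is a simplified illustration under stated assumptions: the class `Directory`, its `read` and `write` methods, and the `requests` counter are hypothetical, and data transfer is omitted so that only the bookkeeping is visible.

```python
class Directory:
    """Central directory: for each line, which caches currently hold a copy.

    Hypothetical sketch; a real directory also tracks line state and
    coordinates the actual data transfers.
    """

    def __init__(self):
        self.sharers = {}    # address -> set of cache ids holding the line
        self.requests = 0    # every request passes through the directory

    def read(self, cache_id, addr):
        self.requests += 1
        self.sharers.setdefault(addr, set()).add(cache_id)

    def write(self, cache_id, addr):
        """Grant exclusive access; return the caches that must invalidate."""
        self.requests += 1
        holders = self.sharers.get(addr, set())
        to_invalidate = holders - {cache_id}
        self.sharers[addr] = {cache_id}   # the writer is now the only holder
        return to_invalidate


directory = Directory()
for cache_id in (0, 1, 2):
    directory.read(cache_id, 0x40)        # three caches load the same line

invalidated = directory.write(1, 0x40)    # cache 1 wants to modify the line
```

Because every read and write is counted in `requests`, the sketch also shows why the directory becomes a central bottleneck: all coherence traffic is funnelled through one structure.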




# **Snoopy Cache Protocols**

- Write-Invalidate
  - Multiple readers, one writer
  - Write request invalidates that line in all other caches
  - Writing processor gains exclusive (cheap) access until the line is required by another processor
  - Used in Pentium II and PowerPC systems
  - State of every line marked as modified, exclusive, shared or invalid (MESI)
- Write-Update
  - Multiple readers and writers
  - Updated word is distributed to all other processors
- Some systems use an adaptive mixture of both solutions
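
The difference between the two policies can be sketched with a toy bus model. Everything here is an assumption made for illustration: the class `SnoopyBus`, the helper `run`, and the address `0x20` are hypothetical, and memory is updated on every write (write-through) purely to keep the sketch short.

```python
class SnoopyBus:
    """Toy bus with per-CPU caches and a selectable snoopy write policy.

    Hypothetical sketch: memory is written through on every write so that
    a later miss can simply refetch from memory.
    """

    def __init__(self, n_caches, policy):
        assert policy in ("invalidate", "update")
        self.policy = policy
        self.caches = [dict() for _ in range(n_caches)]  # addr -> value
        self.memory = {}

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:                  # miss: fetch from memory
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, cpu, addr, value):
        self.caches[cpu][addr] = value
        self.memory[addr] = value
        # Every other cache snoops the write on the bus and reacts.
        for other, cache in enumerate(self.caches):
            if other == cpu or addr not in cache:
                continue
            if self.policy == "invalidate":
                del cache[addr]                # drop the now-stale copy
            else:
                cache[addr] = value            # take the new word


def run(policy):
    bus = SnoopyBus(2, policy)
    bus.memory[0x20] = 5
    bus.read(0, 0x20)
    bus.read(1, 0x20)
    bus.write(0, 0x20, 9)
    still_cached = 0x20 in bus.caches[1]       # does CPU 1 keep its copy?
    return still_cached, bus.read(1, 0x20)


invalidate_result = run("invalidate")
update_result = run("update")
```

Both policies return the new value to the second processor. Under write-invalidate it loses its copy and must refetch the line, whereas under write-update it keeps a valid copy at the cost of a broadcast on every write.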




# **MESI Protocol**

- Four states (two bits per tag)
  - Modified: modified cache line (only in this cache)
  - Exclusive: only in this cache, but the same as memory
  - Shared: same as memory, may be in other caches
  - Invalid: line does not contain valid data

|                               | M (Modified)       | E (Exclusive)      | S (Shared)                    | I (Invalid)          |
|-------------------------------|--------------------|--------------------|-------------------------------|----------------------|
| This cache line valid?        | Yes                | Yes                | Yes                           | No                   |
| The memory copy is            | out of date        | valid              | valid                         | -                    |
| Copies exist in other caches? | No                 | No                 | Maybe                         | Maybe                |
| A write to this line          | does not go to bus | does not go to bus | goes to bus and updates cache | goes directly to bus |
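
The state changes summarized in the table can be traced with a small simulation of one memory line shared by several caches. This is a simplified sketch under stated assumptions: the class `MesiLine` and the `bus_transactions` counter are hypothetical, data values and write-backs are not modeled, and a write from the Invalid state is treated as a single bus transaction that invalidates all other copies.

```python
M, E, S, I = "M", "E", "S", "I"


class MesiLine:
    """MESI states of ONE memory line across several caches on a snooping bus.

    Hypothetical sketch: only states and bus transactions are tracked.
    """

    def __init__(self, n_caches):
        self.state = [I] * n_caches
        self.bus_transactions = 0

    def read(self, cpu):
        if self.state[cpu] != I:
            return                       # read hit: no bus traffic, state unchanged
        self.bus_transactions += 1       # read miss goes to the bus
        others = [c for c in range(len(self.state))
                  if c != cpu and self.state[c] != I]
        for c in others:
            self.state[c] = S            # M or E holders drop to Shared
        self.state[cpu] = S if others else E

    def write(self, cpu):
        if self.state[cpu] in (S, I):
            self.bus_transactions += 1   # the write must be announced on the bus
            for c in range(len(self.state)):
                if c != cpu:
                    self.state[c] = I    # all other copies are invalidated
        self.state[cpu] = M              # writes in M or E stay off the bus


line = MesiLine(2)
trace = []
for op, cpu in [("read", 0), ("write", 0), ("read", 1), ("write", 1)]:
    getattr(line, op)(cpu)
    trace.append(tuple(line.state))
```

In this trace the first write by cache 0 happens in state E and generates no bus traffic, while the later write by cache 1 happens in state S and must go to the bus. This matches the last row of the table.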































