References: EE380 Parallel Processing

The book emphasizes shared-memory multiprocessors (SMPs) and cache coherence issues; read the book for that.

However, you should also be aware of SIMD (SWAR) and MIMD; Cluster, Farm, Warehouse Scale Computer, Grid, and Cloud; Latency, Bandwidth, and Bisection Bandwidth; network topologies including Direct Connections (the book calls these "fully connected"), Toroidal Hyper-Meshes (e.g., Rings, Hypercubes), Trees, Fat Trees, and Flat Neighborhood Networks (FNNs); Hubs, Switches, and Routers. The Spring 2020 slides I used for the material on high-end "supercomputer" architecture are posted as a PDF. These slides give a nice overview of cluster supercomputing (including the terminology) and also very briefly discuss GPUs. The talk presenting these slides is here, and includes a short virtual tour of Prof. Dietz's lab and supercomputing facilities in the Marksbury building.
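To make a few of the topology terms above concrete, here is a minimal Python sketch that compares Direct Connections, a Ring, and a Hypercube for the same number of nodes. It is not taken from the book or the slides; the function names (`direct`, `ring`, `hypercube`) and the dictionary keys are made up for illustration. It uses the standard textbook formulas and assumes an even node count of at least 4 (a power of 2 for the hypercube). "Bisection width" here counts the links cut when the machine is split into two equal halves; multiplying it by the per-link bandwidth gives the bisection bandwidth.

```python
def _check(n):
    # Assumption for this sketch: an even node count of at least 4,
    # so the network splits cleanly into two equal halves.
    if n < 4 or n % 2 != 0:
        raise ValueError("node count must be even and at least 4")


def direct(n):
    """Direct connections ("fully connected"): a link between every pair of nodes."""
    _check(n)
    return {
        "links_per_node": n - 1,
        "diameter": 1,                      # every node is one hop away
        "bisection_width": (n // 2) ** 2,   # each node in one half links to each node in the other
    }


def ring(n):
    """Ring: each node links to its two neighbors."""
    _check(n)
    return {
        "links_per_node": 2,
        "diameter": n // 2,                 # worst case is halfway around
        "bisection_width": 2,               # cutting a ring in half severs two links
    }


def hypercube(n):
    """Hypercube: n = 2**d nodes, each linked to the d nodes whose address differs in one bit."""
    _check(n)
    d = n.bit_length() - 1
    if 1 << d != n:
        raise ValueError("hypercube needs a power-of-2 node count")
    return {
        "links_per_node": d,
        "diameter": d,                      # at most d address bits to correct
        "bisection_width": n // 2,          # one link per node crosses the cut dimension
    }


if __name__ == "__main__":
    for name, topo in (("direct", direct), ("ring", ring), ("hypercube", hypercube)):
        print(name, topo(16))
```

The trade-off the numbers show: direct connections give the best diameter and bisection width, but the links per node grow with the machine size; a ring is cheap per node but has a constant bisection width; a hypercube sits in between.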

You will find a lot of information about high-end parallel processing at aggregate.org. Professor Dietz and the University of Kentucky are leaders in this field, so Dietz has written quite a few documents that explain all aspects of this technology. One good, but very old, overview is the Linux Documentation Project's Parallel Processing HOWTO; a particularly good overview of network topologies appears in this paper describing FNNs.

A quick summary of what things look like in Spring 2019:

One last note: Tesla's Full Self Driving Chip is a great example of supercomputing moving into mass-market devices.


EE380 Computer Organization and Design.