References: EE380 Parallel Processing
The book places emphasis on shared memory multiprocessors (SMP
stuff) and cache coherence issues; read the book for that.
However, you also should be aware of SIMD (SWAR) and MIMD;
Cluster, Farm, Warehouse Scale Computer, Grid, and Cloud;
Latency, Bandwidth, and Bisection Bandwidth; network topologies
including Direct Connections (the book calls these "fully
connected"), Toroidal Hyper-Meshes (e.g., Rings, Hypercubes),
Trees, Fat Trees, and Flat Neighborhood Networks (FNNs); Hubs,
Switches, and Routers. The Spring 2020 slides I used for the
material on high-end "supercomputer" architecture are posted as
a PDF. These slides give a
nice overview of cluster supercomputing (including the
terminology) and also very briefly discuss GPUs. The talk
presenting these slides is here, and
includes a short virtual tour of Prof. Dietz's lab and
supercomputing facilities in the Marksbury building.
You will find a lot of information about high-end parallel
processing at aggregate.org. Professor Dietz and the University
of Kentucky are leaders in this field, so Dietz has written quite
a few documents that explain all aspects of this technology.
One good, but very old, overview is the Linux Documentation Project's Parallel Processing
HOWTO; a particularly good overview of network topologies
appears in this paper describing FNNs.
A quick summary of what things look like in Spring 2019:
- Nearly all desktop/laptop processors are pipelined, superscalar,
SWAR implementations with 2-32 cores on each processor chip;
Intel's Xeon Phi processors, with up to about 60 cores per chip and
512-bit SWAR, have been discontinued, but AMD is back in the
game with a 32-core chip that looks very strong
- Nearly all supercomputers are clusters and, since Fall 2017,
virtually all 500 of the Top500 supercomputers are Linux clusters;
also note that Asia/China now dominate the list, with 55.2% of
the systems in November 2018
- GPUs are appearing everywhere (although the HW/SW technology for
them is still evolving); NVIDIA GPUs have come to dominate the
high-performance computing market
- The slow transition to integrating GPUs on the processor chip
continues, as does the transition from IA32/AMD64 to ARM64;
there are now ARM64 machines on the latest Top500 list
- Clouds are a very popular way to handle applications that need
lots of memory/storage, but not so much processing power; there
is a particularly strong push for software as a service with
cloud subscriptions rather than software purchases
- IoT (Internet of Things), the idea that everything should be
connected, continues to develop, with various societal issues
ranging from simple privacy and ownership rights issues to
potentially life-threatening things like "car hacking"
- Quantum computing has become a very intense research focus,
but it still isn't clear that it will ever be practical
One last note:
Tesla's Full Self-Driving Chip is a great example of supercomputing moving into mass-market devices.
Textbook: Computer Organization and Design.