SIMD References: EE686 Advanced Computer Architecture Design

Activity Counter Implementation Of Enable Logic (PDF)

This paper describes a clever method for tracking nested SIMD enable/disable state without use of a bit stack.

@inproceedings{ keryell93activity,
    author = "Ronan Keryell and Nicolas Paris",
    title = "Activity Counter: New Optimization for the Dynamic Scheduling of {SIMD} Control Flow",
    booktitle = "Proceedings of the 1993 International Conference on Parallel Processing",
    volume = "II - Software",
    publisher = "CRC Press",
    address = "Boca Raton, FL",
    pages = "II--184--II--187",
    year = "1993",
    url = "citeseer.ist.psu.edu/keryell93activity.html" }
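
The counter idea can be illustrated with a minimal sketch. This is not the paper's exact formulation; the class and method names (`ActivityCounters`, `where`, `elsewhere`, `end_where`) are hypothetical, and the sketch assumes each processing element (PE) holds one small counter that is 0 when the PE is enabled and otherwise records how many nesting levels deep it was disabled.

```python
class ActivityCounters:
    """Per-PE activity counters: a PE is enabled iff its counter is 0.

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self, num_pes):
        self.count = [0] * num_pes

    def enabled(self):
        return [c == 0 for c in self.count]

    def where(self, cond):
        # Enter a nested conditional region.
        for i, c in enumerate(self.count):
            if c > 0:
                self.count[i] = c + 1   # already disabled: one level deeper
            elif not cond[i]:
                self.count[i] = 1       # disabled at this nesting level

    def elsewhere(self):
        # Swap enabled/disabled, but only for PEs that were enabled
        # when the current region was entered (counter 0 or 1).
        for i, c in enumerate(self.count):
            if c == 1:
                self.count[i] = 0
            elif c == 0:
                self.count[i] = 1

    def end_where(self):
        # Leave the region: every disabled PE moves one level outward.
        for i, c in enumerate(self.count):
            if c > 0:
                self.count[i] = c - 1


pes = ActivityCounters(4)
pes.where([True, True, False, False])   # outer condition
pes.where([True, False, True, False])   # inner condition
inner_then = pes.enabled()
pes.elsewhere()
inner_else = pes.enabled()
pes.end_where()
outer_then = pes.enabled()
pes.end_where()
```

Each PE needs only enough counter bits to hold the maximum nesting depth, rather than one saved enable bit per nesting level as a bit stack would.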

DAP -- a distributed array processor (PDF)

This paper describes the ICL DAP, another early SIMD machine.

@inproceedings{803971,
 author = {S. F. Reddaway},
 title = {{DAP} -- a distributed array processor},
 booktitle = {ISCA '73: Proceedings of the 1st annual symposium on Computer architecture},
 year = {1973},
 pages = {61--65},
 doi = {http://doi.acm.org/10.1145/800123.803971},
 publisher = {ACM Press},
 address = {New York, NY, USA},
 }

Architecture of a massively parallel processor (PDF)

This paper describes Ken Batcher's SIMD MPP design at Goodyear Aerospace.

@inproceedings{285977,
 author = {Kenneth E. Batcher},
 title = {Architecture of a massively parallel processor},
 booktitle = {ISCA '98: 25 years of the international symposia on Computer architecture (selected papers)},
 year = {1998},
 isbn = {1-58113-058-9},
 pages = {174--179},
 location = {Barcelona, Spain},
 doi = {http://doi.acm.org/10.1145/285930.285977},
 publisher = {ACM Press},
 address = {New York, NY, USA},
 }

Thinking Machines CM-2 (PDF)

A (relatively late) version of the "Connection Machine Model CM-2 Technical Summary, Version 6.0, November 1990." This version includes a description of the (CM-200) floating-point hardware added to the design.

Multimedia Extensions For Microprocessors: SIMD Within A Register (HTML)

The SWAR slides I used in class... originally from a talk given in February 1997 at Purdue University.

Compiling for SIMD within a Register (PDF)

One of the best generic descriptions of the concepts of SWAR. The above link is direct from Springer-Verlag.

@inproceedings{663771,
 author = {Randall J. Fisher and Henry G. Dietz},
 title = {Compiling for SIMD Within a Register},
 booktitle = {LCPC '98: Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing},
 year = {1999},
 isbn = {3-540-66426-2},
 pages = {290--304},
 publisher = {Springer-Verlag},
 address = {London, UK},
 }
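
As a rough illustration of the SWAR concept (not code from the paper), the sketch below adds four 8-bit fields packed into one 32-bit word using ordinary integer operations, while keeping carries from crossing field boundaries. The function name `swar_add8` and the choice of four 8-bit fields are assumptions made for this example.

```python
HIGH = 0x80808080  # top bit of each 8-bit field
LOW = 0x7F7F7F7F   # low 7 bits of each 8-bit field


def swar_add8(x, y):
    """Add four packed 8-bit fields modulo 256, field by field.

    Hypothetical helper: the low 7 bits of each field are added with one
    ordinary add (no carry can leave a field), then the top bit of each
    field is patched in with XOR, which is addition without carry-out.
    """
    low_sum = (x & LOW) + (y & LOW)
    return low_sum ^ ((x ^ y) & HIGH)


# Fields: 0x01+0x01, 0xFF+0x01 (wraps), 0x7F+0x7F, 0x10+0x20
result = swar_add8(0x01FF7F10, 0x01017F20)
print(hex(result))  # 0x200fe30
```

A plain 32-bit add of the same operands would let the 0xFF + 0x01 carry spill into the neighboring field; masking the top bit of each field is what prevents that.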

GPGPU (HTML)

This site contains a variety of news, paper links, etc., about use of GPUs (Graphics Processing Units) for general-purpose computing -- commonly known as GPGPU. Note that general-purpose is a misnomer; it is really about programming GPUs for tasks that are not entirely graphical.

A Performance-Oriented Data Parallel Virtual Machine for GPUs (PDF)

The first paper on ATI's CTM (Close To the Metal) software interface to GPUs (Graphics Processing Units) for general-purpose computing. Referenced directly from ATI's site, which is now part of AMD's site. There are also slides and a full manual at the ATI/AMD site.

Nanocontrollers (PDF)

The first paper on nanocontrollers -- bit-serial SIMD-style hardware for use in control of massively parallel arrays of sensors, actuators, and other devices.

The Cray-1 (PDF)

@article{359336,
 author = {Richard M. Russell},
 title = {The CRAY-1 computer system},
 journal = {Commun. ACM},
 volume = {21},
 number = {1},
 year = {1978},
 issn = {0001-0782},
 pages = {63--72},
 doi = {http://doi.acm.org/10.1145/359327.359336},
 publisher = {ACM Press},
 address = {New York, NY, USA},
 }

