This paper describes a clever method for tracking nested SIMD enable/disable state without using a bit stack.
@inproceedings{ keryell93activity,
author = "Ronan Keryell and Nicolas Paris",
title = "Activity Counter: New Optimization for the Dynamic Scheduling of {SIMD} Control Flow",
booktitle = "Proceedings of the 1993 International Conference on Parallel Processing",
volume = "II - Software",
publisher = "CRC Press",
address = "Boca Raton, FL",
pages = "II-184--II-187",
year = "1993",
url = "citeseer.ist.psu.edu/keryell93activity.html" }
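As a rough illustration of the activity-counter idea in the entry above, the sketch below keeps one small integer per processing element (PE) in place of a stack of enable bits. It follows the general technique only, not necessarily the paper's exact formulation; the class and method names (`ActivityCounters`, `push_where`, `else_where`, `pop_where`) are hypothetical.

```python
class ActivityCounters:
    """One counter per PE; a PE is enabled exactly when its counter is 0.

    A nonzero counter records how many nested conditional levels have
    been entered since the PE was disabled.
    """

    def __init__(self, num_pes):
        self.count = [0] * num_pes

    def enabled(self):
        return [c == 0 for c in self.count]

    def push_where(self, cond):
        """Enter a nested conditional; cond[pe] is that PE's test result."""
        for pe, c in enumerate(self.count):
            if c > 0:
                self.count[pe] = c + 1   # already disabled: one level deeper
            elif not cond[pe]:
                self.count[pe] = 1       # disabled at this level
            # otherwise the PE stays enabled (counter remains 0)

    def else_where(self):
        """Switch to the else branch of the innermost conditional."""
        for pe, c in enumerate(self.count):
            if c == 0:
                self.count[pe] = 1       # ran the then-branch, now disabled
            elif c == 1:
                self.count[pe] = 0       # was disabled at this level only
            # counters above 1 were disabled by an outer level: unchanged

    def pop_where(self):
        """Leave the innermost conditional."""
        for pe, c in enumerate(self.count):
            if c > 0:
                self.count[pe] = c - 1
```

With this scheme a counter of about log2(depth) bits per PE covers the same nesting depth that would need one bit per level in a bit stack.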
This paper describes the ICL DAP, another early SIMD machine.
@inproceedings{803971,
author = {S. F. Reddaway},
title = {{DAP}---a distributed array processor},
booktitle = {ISCA '73: Proceedings of the 1st annual symposium on Computer architecture},
year = {1973},
pages = {61--65},
doi = {http://doi.acm.org/10.1145/800123.803971},
publisher = {ACM Press},
address = {New York, NY, USA},
}
This paper describes Ken Batcher's SIMD MPP design at Goodyear Aerospace.
@inproceedings{285977,
author = {Kenneth E. Batcher},
title = {Architecture of a massively parallel processor},
booktitle = {ISCA '98: 25 years of the international symposia on Computer architecture (selected papers)},
year = {1998},
isbn = {1-58113-058-9},
pages = {174--179},
location = {Barcelona, Spain},
doi = {http://doi.acm.org/10.1145/285930.285977},
publisher = {ACM Press},
address = {New York, NY, USA},
}
A (relatively late) version of the "Connection Machine Model CM-2 Technical Summary, Version 6.0, November 1990." This version includes a description of the (CM-200) floating-point hardware added to the design.
The SWAR slides I used in class... originally from a talk given in February 1997 at Purdue University.
One of the best generic descriptions of the concepts of SWAR. The above link is direct from Springer-Verlag.
@inproceedings{663771,
author = {Randall J. Fisher and Henry G. Dietz},
title = {Compiling for SIMD Within a Register},
booktitle = {LCPC '98: Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing},
year = {1999},
isbn = {3-540-66426-2},
pages = {290--304},
publisher = {Springer-Verlag},
address = {London, UK},
}
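To give a flavor of what SWAR (SIMD Within A Register) means in practice, here is a minimal sketch of one standard trick: adding four 8-bit fields packed into a 32-bit word without letting carries cross field boundaries. The function name `swar_add8` and the field layout are assumptions chosen for illustration, not code taken from the paper.

```python
MASK32 = 0xFFFFFFFF
HIGH = 0x80808080  # top bit of each 8-bit field
LOW = 0x7F7F7F7F   # low seven bits of each 8-bit field


def swar_add8(x, y):
    """Add four packed 8-bit fields of x and y, each modulo 256."""
    # Add the low 7 bits of every field; the largest per-field result is
    # 0x7F + 0x7F = 0xFE, so no carry can spill into the neighboring field.
    low_sum = (x & LOW) + (y & LOW)
    # Top bit of each field = carry into that bit XOR both operand top bits.
    return (low_sum ^ ((x ^ y) & HIGH)) & MASK32
```

For example, `swar_add8(0x01FF7F10, 0x01017F20)` gives `0x0200FE30`: the `0xFF + 0x01` field wraps to `0x00` without disturbing the field to its left.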
This site contains a variety of news, paper links, etc., about the use of GPUs (Graphics Processing Units) for General-Purpose computing -- commonly known as GPGPU. Note that "general-purpose" is a misnomer; it is really about programming GPUs for tasks that are not entirely graphical.
The first paper on ATI's CTM (Close To the Metal) software interface to GPUs (Graphics Processing Units) for general-purpose computing. Referenced directly from ATI's site, which is now part of AMD's site. There are also slides and a full manual at the ATI/AMD site.
The first paper on nanocontrollers -- bit-serial SIMD-style hardware for use in control of massively parallel arrays of sensors, actuators, and other devices.
@article{359336,
author = {Richard M. Russell},
title = {The CRAY-1 computer system},
journal = {Commun. ACM},
volume = {21},
number = {1},
year = {1978},
issn = {0001-0782},
pages = {63--72},
doi = {http://doi.acm.org/10.1145/359327.359336},
publisher = {ACM Press},
address = {New York, NY, USA},
}