SIMD (Single Instruction stream, Multiple Data stream) parallel processing has long been used to speed up image processing and other multimedia operations. However, SIMD was usually implemented using large numbers of custom processing elements. With the importance of multimedia growing rapidly, it is a natural step to extend processor instruction sets with some form of SIMD multimedia support.
Generically, SWAR (SIMD Within A Register) is implemented by partitioning the k-bit registers, data paths, and function units of a conventional processor into n k/n-bit fields that can be processed using SIMD-parallel instructions. Ordinary instructions can be used, but special "SIMD partitioned" instructions will often yield better performance. AMD, Cyrix, and Intel have MMX (MultiMedia eXtensions); Digital Alpha has MAX (MultimediA eXtensions); Hewlett-Packard PA-RISC has MAX (Multimedia Acceleration eXtensions); Sun SPARC V9 has VIS (Visual Instruction Set).
This talk will briefly overview all the above SWAR models... and how SWAR can be made into a viable target for data-parallel high-level language compilers.
An introduction to SWAR is available online at
http://dynamo.ecn.purdue.edu/~hankd/SWAR/