Note as of April 27, 2017: This project has been stalled for some time due to lack of funding. Full source code is still freely available from https://github.com/aggregate/MOG/ for a version that basically works, but is about 6-12 months of effort away from being ready for production use.
Back in the early 1990s, we developed a variety of techniques that allow arbitrary MIMD programs to execute on SIMD hardware with reasonable efficiency. At the time, we viewed these methods primarily as novel ways to obtain some additional functionality from a SIMD supercomputer -- literally allowing us to program our 16,384 processing element MasPar MP-1 using a shared memory MIMD model while achieving a significant fraction of peak native (distributed memory SIMD) performance. However, we now realize that this approach is much more important than we first thought. The first sign of this new importance was our development a decade later of "Kentucky Architecture" nanocontrollers, which leverage MSC (Meta State Conversion) to greatly simplify the hardware per fully-programmable processor. In our research exhibit at SC08, we introduced a MIMD-on-SIMD technology which we believe may be even more immediately significant: MOG (MIMD On GPU).
GPUs, the Graphics Processing Units on high-end video cards, have been talked about for years as offering outstanding price/performance iff you can make your application fit the highly restrictive, but very scalable, computation model that GPUs use. Unfortunately, very little fits that model and there is virtually no existing code base -- in fact, there isn't even standardization of the programming model across NVIDIA and ATI hardware (not to mention Intel's upcoming entry). MOG is an answer. Why?
As of November 2010, there is a VERY ROUGH alpha release of MOG freely available as source code (and required .deb packages) at http://aggregate.org/MOG/20101122.tgz. This version, 20101122, does not have any MPI support built in, but it uses a new instruction set that allows code from compilers targeting MIPSEL to be converted, so it supports C, Fortran, etc. using the standard GCC compilers. There isn't yet any decent documentation, but there is a README, and a 2009 paper, MIMD Interpretation On A GPU, explains the concepts quite well.
When we first developed the concept of MIMD execution on SIMD hardware, people didn't believe we could actually do it -- or they believed that we could only do it because of quirks in the MasPar MP-1 hardware. No, it really does work in the general case. Well, we finally came up with an intuitive proof, and it doesn't even use a computer. Click on the following for a ~50MB AVI video showing the MIMD-on-SIMD maze in action:
Tilting the maze in a particular direction is equivalent to executing a "move in that direction" instruction. A ball hitting a wall, rather than moving, is analogous to a SIMD processing element being disabled.
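The maze analogy can be restated as a tiny interpreter loop. The following is only a sketch under stated assumptions, not MOG's actual implementation: the function name `run_mimd_on_simd`, the four-opcode "tilt" instruction set, and the per-PE ball position are all hypothetical, chosen to mirror the maze. Each processing element (PE) holds its own program and program counter; the control unit broadcasts one opcode at a time, and only the PEs whose next instruction matches that opcode are enabled for the step.

```python
# Hypothetical sketch: every name here is illustrative, not taken from MOG.
OPCODES = ("N", "E", "S", "W")  # the four "tilt" directions
DELTA = {"N": (0, -1), "E": (1, 0), "S": (0, 1), "W": (-1, 0)}


def run_mimd_on_simd(programs):
    """Interpret one independent program per PE using one broadcast stream.

    programs: list of strings over OPCODES, one string per processing element.
    Returns (final ball positions, number of opcodes broadcast).
    """
    n = len(programs)
    pc = [0] * n          # per-PE program counter
    pos = [(0, 0)] * n    # per-PE data: the position of that PE's ball
    broadcasts = 0

    while any(pc[i] < len(programs[i]) for i in range(n)):
        for op in OPCODES:  # the control unit tries each tilt in turn
            # Enable mask: PEs whose next instruction is the broadcast opcode.
            enabled = [
                pc[i] < len(programs[i]) and programs[i][pc[i]] == op
                for i in range(n)
            ]
            if not any(enabled):
                continue  # no PE wants this opcode, so it is not broadcast
            broadcasts += 1
            dx, dy = DELTA[op]
            for i in range(n):  # conceptually simultaneous on SIMD hardware
                if enabled[i]:  # disabled PEs are the balls resting on a wall
                    x, y = pos[i]
                    pos[i] = (x + dx, y + dy)
                    pc[i] += 1
    return pos, broadcasts
```

For example, `run_mimd_on_simd(["NE", "EN", "SSW"])` runs three different programs totalling seven instructions, yet the loop broadcasts only six opcodes, because the two PEs that want "E" at the same moment share a single broadcast.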
MOG is too new to have appeared in publications, but there are a few of our older publications which are closely related:
Related work by others: