20101122 alpha test release This is a ***VERY*** preliminary alpha test release of the MIPS-based MOG. We do not suggest that it is useful for anything but getting a feel for how the system works, and THIS VERSION SHOULD NOT BE REDISTRIBUTED, but it will soon be followed by a useful version. The original MOG versions were all based on a stack instruction set, which proved too difficult for other compilers to target. There were about 70K lines of source code in that system -- and about two dozen different versions trying a wide variety of optimization methods. Some of the best of the methods tried are reported in our LCPC 2009 paper: http://www.springerlink.com/content/x3k8k3uw7j42j553/ For now, that's the reference to cite. The best of those methods is essentially what the new system implements; a series of single-instruction interpreters using an ordering that is automatically optimized for each application. However, unlike the version reported in the LCPC paper, this release uses static analysis rather than a runtime trace to collect instruction frequency and pairwise adjacency. The optimizer is greatly simplified too. Like the earlier versions, the MOG interpreter itself is quite simple, complicated mostly by a 3-dimensional global memory layout that ensures all memory references are free of bank conflicts while avoiding the integer multiply that the obvious 2D layout would have needed for the address computations. The new MOG instruction set is accumulator based with general registers. This is a very effective model, allowing the accumulator to reside in an actual GPU register and dramatically simplifying instruction decoding. The "registers" are actually stored in local "shared" memory, again with a layout that ensures freedom from banking conflicts, and the compiler's effective use of registers thus produces good explicit management of the very scarce GPU local memory. The system currently does not support double-precision floats, because we strongly believe it is a mistake to use them (due to a memory layout that breaks the banking properties). For several years, we have had libraries that facilitate use of pairs of native floats instead. There are hooks for system calls, but only exit() is really there. INSTALLING THE SYSTEM Well, there's an AMD64 binary, mog, that traces execution of the dumb fact.c code. You may be able to run that.... The real test requires: 1. Installing the .deb packages so that you have a working GCC cross compilation environment for MIPSEL. They "just work" under Ubuntu, and that's why we're using such an old version. 2. Compiling mogas. Just cc mogas.c -o mogas. 3. Installing a mog directory under ~/NVIDIA_GPU_Computing_SDK/C/src/ and building using the usual CUDA development environment. Yup, you need all the usual CUDA stuff. An OpenCL version is coming. RUNNING THE SYSTEM Given all that, the program fact.c can be compiled into fact.as by: ./mogcc fact.c >fact.as This actually runs GCC for MIPSEL and then translates the MIPS code into our MOG instruction set using a series of sed/awk scripts. It's ugly, but GCC MIPSEL seems to have a lot of imaginary instructions in its output, so a better translator is postponed until we know what the full scope of instructions emitted by GCC MIPSEL is. There is a fair amount of stuff spewed to stderr while doing this. The next step is to run the assembler to generate the data files that will be compiled into the simulator. For that, simply: ./mogas mog.h The mog.h name is critical, because it gets compiled into the interpreter. To build the executable MOG, just type make. If the TRACE flag is set, you'll get a tracer; otherwise, you get execution that executes many instructions per kernel invocation. Simply run: ~/NVIDIA_GPU_Computing_SDK/C/bin/mog CREDIT (BLAME?) Except for the .deb files, this code was written by H. Dietz. It is free software, but should NOT be redistributed at this time. Thus, consider it as being copyrighted with that restriction for now, although we will soon change the status to public domain or an attribution-style copyright. We have no long-term intention of preventing anyone from using this system in any context, including incorporating it under commercial products. As for most of our freely available source code, we make no claims of usability, etc., for this system. USE IT AT YOUR OWN RISK. Later versions may have some level of support, but this 20101122 release should be seen as a "toy" to let people see what we're doing... so they know we're serious about MOG and that it really does work. The author's contact info is: Hank Dietz, Professor & James F. Hardymon Chair in Networking .-. University of Kentucky, Electrical & Computer Engineering Dept. .-. /v\ 453 F. Paul Anderson Tower /v\// \\ Lexington, KY 40506-0046 Parallel // \\ )\ Phone: (859) 257 4701 FAX: (859) 257 3092 Processing /( )\-^^ URL: http://aggregate.org/hankd/ using Linux ^^-^^ This software is posted at: http://aggregate.org/MOG/20101122/