Public Demonstrations of PAPERS

Thus far, there have been six public demonstrations (research exhibits) of PAPERS systems at conferences/workshops.

This document provides a brief overview of our exhibits.


Informal Demonstration at ICPP 1994

Our first public demonstration of PAPERS was at the International Conference on Parallel Processing in August 1994, but things were very informally arranged. Originally, we had hoped to demo a cluster as part of our presentation of the paper on the new barrier mechanism, but instead we were given permission to demo our PAPERS cluster off to one side during the wine and cheese party.

That cluster, shown above, consisted of four IBM ValuePoint 486DX 33MHz machines running Linux, connected by both the second TTL_PAPERS prototype and PAPERS1. We simply placed everything on a standard AV cart and wheeled it into the wine and cheese party. The party was noisy, so demonstrations were difficult, but this was the first full public demonstration of PAPERS.


Purdue Booth at IEEE/ACM Supercomputing 1994

Our first formal public demonstration of PAPERS was at the IEEE/ACM Supercomputing conference in November 1994.

Although none of us had previously arranged a formal research exhibit at a conference, we decided to apply for research exhibit space at Supercomputing to demonstrate several PAPERS clusters. Much to our surprise, we were awarded Research Booth R4, a full 20' by 20' space located next to the sign that introduced the research area of the exhibit floor. Prime real estate and a lot of it. After much perseveration (and several manufacturers breaking their promises of loaner machines and/or donations), we finally settled on showing the following four clusters.

Four DEC Alphas running OSF.
These machines were two 166MHz and two 233MHz units, all provided to us by DEC as loaners for the show. These machines were truly state of the art; the 233MHz machines were only announced a week before Supercomputing. Unfortunately, we only had them for two weeks before the show, and it wasn't easy to get OSF to let us directly access the parallel port. Just in time, we were able to get direct port access by compiling some of our code into the OSF kernel. TTL_PAPERS was used to connect the machines for the demonstrations, which were benchmarks of the PAPERS C library support routines (interactively selected from a TCL/TK menu). The 233MHz boxes are standing upright with the 166MHz ones horizontally on top of them; the TTL_PAPERS box is barely noticeable sitting to the right of the monitor.

Eight Intel 386 running Linux
Unlike the DEC machines, these 386DX 33MHz systems were not exactly state of the art; in fact, they were essentially discarded when Intel upgraded one of our undergraduate labs to 486DX2 66MHz machines. However, each of the 386 boxes has floating point hardware which we have measured at about 4MFLOPS, so it isn't quite as bad as it first sounds. The same is true of the $30 homemade wooden "rack mount" -- which received almost as much praise as our research work did. This cluster was connected using an 8-processor version of PAPERS that is similar to the November 1994 TTL_PAPERS, but essentially doubles the communication bandwidth. The monitor on the right was displaying continuous benchmark results for the complete PAPERS C library, showing both the time and variation in time for each operation.
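The idea of reporting both the time and the variation in time for an operation can be pictured with a small sketch. This is not the PAPERS C library benchmark; it is a minimal Python stand-in that assumes threads in one process in place of the eight machines and Python's threading.Barrier in place of the PAPERS hardware barrier. The name benchmark_barrier and its parameters are hypothetical.

```python
import statistics
import threading
import time


def benchmark_barrier(num_workers=8, rounds=200):
    """Time repeated barrier waits; return (mean, standard deviation) in seconds."""
    barrier = threading.Barrier(num_workers)
    # One private sample list per worker, so no locking is needed.
    samples = [[] for _ in range(num_workers)]

    def worker(rank):
        for _ in range(rounds):
            start = time.perf_counter()
            barrier.wait()  # stand-in for a hardware barrier synchronization
            samples[rank].append(time.perf_counter() - start)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    all_samples = [s for per_worker in samples for s in per_worker]
    return statistics.mean(all_samples), statistics.stdev(all_samples)


if __name__ == "__main__":
    mean, spread = benchmark_barrier()
    print(f"barrier: {mean * 1e6:.1f} us mean, {spread * 1e6:.1f} us std dev")
```

Note that each sample includes the time spent waiting for the slowest worker to arrive, so the spread reflects scheduling skew as much as the cost of the barrier itself.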

Two Four-Machine IBM Clusters
The last two clusters demonstrated in our booth consisted of four machines each: a cluster of four IBM PowerPC 601 80MHz machines running AIX to the left, and a cluster of four IBM ValuePoint 486DX 33MHz machines running Linux to the right (the plant-stand on the near right held a display of the evolution of PAPERS through the first six prototypes). The PowerPC machines didn't perform very well and the parallel port access was very slow because we had to use an OS call, but these machines were pre-release prototypes given to us as loaners.... We connected them using one of the TTL_PAPERS units, and ran a very simple demonstration showing barrier synchronization time and variation in time. On the other hand, the ValuePoint machines are very "normal" 486 systems, and we took advantage of this by actually having two different versions of PAPERS simultaneously connecting the cluster, with a wide range of demonstrations available for each unit (partly because this was the cluster we demonstrated at ICPP). Despite this, most of the time we simply had the ValuePoint cluster run an SPMD program that plays multivoice music by assigning each new note to be played by a randomly selected machine -- using PAPERS to ensure that timing of the notes is preserved.
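The structure of the music demo described above can be sketched roughly as follows. This is an illustration only, not the program we ran: it assumes threads in place of machines, Python's threading.Barrier in place of the PAPERS barrier, and an identically seeded random generator on every "machine" so that all of them agree on who plays each note without exchanging messages (the text above does not say how the real program made that choice). The names SCORE, run_machine, and play are hypothetical.

```python
import random
import threading

NUM_MACHINES = 4
# Hypothetical score: just a sequence of note names.
SCORE = ["C4", "E4", "G4", "C5", "G4", "E4", "C4"]


def run_machine(rank, barrier, seed, played, lock):
    # SPMD: every machine runs this same function. Identical seeds mean
    # every rank draws the same sequence of randomly selected owners.
    rng = random.Random(seed)
    for index, note in enumerate(SCORE):
        owner = rng.randrange(NUM_MACHINES)
        # No machine starts note `index` until all machines have
        # finished the previous one, which preserves note order.
        barrier.wait()
        if rank == owner:
            with lock:
                played.append((index, rank, note))  # stand-in for sounding the note


def play(seed=1994):
    barrier = threading.Barrier(NUM_MACHINES)
    played = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=run_machine, args=(r, barrier, seed, played, lock))
        for r in range(NUM_MACHINES)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return played


if __name__ == "__main__":
    for index, rank, note in play():
        print(f"note {index}: {note} played by machine {rank}")
```

Because the owner of one note cannot reach the next barrier until it has finished its note, the notes come out in score order even though a different machine may play each one.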


Informal Demonstration at LCPC 1995

Since we had a paper on the PAPERS library at the 8th International Workshop on Languages and Compilers for Parallel Computing, at Ohio State University, we decided it might be a nice idea to bring a little hardware. So, just after the last presentation ended on August 12, 1995, we set up our hardware and gave a brief demonstration....

We didn't want to make a big deal out of this, so we did everything using a pair of 486DX2/66 laptops (running Linux) and the November 1994 version of the four-processor TTL_PAPERS. No, that library wasn't designed to work with just two machines connected to a four-processor unit, but we made the appropriate change to the check-in procedure so that we could run a simple barrier speed benchmark. It is strange to see the unit working with two cables dangling.
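One way to picture running two machines on a four-processor unit is as a bitmask model in which unconnected ports are treated as if they had already arrived at every barrier. The sketch below is purely illustrative and is not the actual change we made to the check-in procedure; the function name barrier_released, the mask layout, and UNIT_WIDTH are all hypothetical.

```python
UNIT_WIDTH = 4  # ports on the hypothetical four-processor unit


def barrier_released(arrived, present, width=UNIT_WIDTH):
    """Return True when every connected port has arrived at the barrier.

    `arrived` has bit i set if processor i is waiting at the barrier.
    `present` has bit i set if a machine is actually cabled to port i.
    """
    full = (1 << width) - 1
    # Ports with no machine attached count as permanently arrived.
    absent = ~present & full
    return (arrived | absent) == full


if __name__ == "__main__":
    two_laptops = 0b0011  # machines on ports 0 and 1, two cables dangling
    print(barrier_released(0b0001, two_laptops))  # only one laptop has arrived
    print(barrier_released(0b0011, two_laptops))  # both laptops have arrived
```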

We also demonstrated a few things with the TTL_LIB_950614 library's four-processor VAPERS simulator on one of the laptops... while the TTL_PAPERS hardware demo continued to run across the laptops.

Not many people saw our little demo; anyway, it's quality, not quantity, that counts. ;-)


Informal Demonstration at ICPP 1995

For various reasons (including the ever-shrinking travel budget and the fact that I got back from LCPC 1995 just a day before), I was not expecting to attend ICPP this year... but I believe I've only missed one year since 1984, and 1995 wasn't it. The inspiration for me to drive up this time was a last-minute request that I fill in for a missing panelist on "SPMD: on a collision course with portability?"

Of course, I couldn't resist the temptation to bring along the same two laptops and TTL_PAPERS unit that I had just demoed at LCPC 1995. As in 1994, I was given permission to demo the cluster off to the side during one of the evening parties; this time, it was the cocktail party. Not a very impressive demo; still, after two years at ICPP, people are getting used to the idea that PAPERS isn't a joke. Well, I did hear a few chuckles when people saw that the VAPERS simulator display duplicated not only the PAPERS unit's flashing lights, but also the wood grain....


Purdue Booth at IEEE/ACM Supercomputing 1995

Supercomputing 1994 had been a very positive experience for us and a lot of good exposure for Purdue and our work, so we wanted to do much the same for 1995. However, San Diego isn't a road trip like Washington D.C. was, so we had to cut back on the quantity of stuff being brought to the show. Thus, instead of a 20' by 20' booth with four clusters, we simply had a 10' by 20' booth (R22) with two clusters on one end and software displays on the other end.

The hardware side.
The hardware side of the booth displayed two clusters. One cluster was the rather familiar group of four IBM ValuePoint 486DX33 machines running Linux. The other four-machine cluster was built specifically to be easy to carry around for demonstrations. Named the "TTL_PAPERS Microcluster," it consists of a group of four Compaq Aero subnotebook computers and a miniature oak rack mount.

TTL_PAPERS Microcluster.
Since they only have 486SX25 processors (without floating point hardware), the microcluster machines are not fast. However, the entire cluster only weighs about 30 pounds and fits within a 1' cube. The miniature oak rack mount houses a four-machine TTL_PAPERS unit, power supplies, and space to pack both an extension cord and the cables for the TTL_PAPERS unit. For travel, there is an oak top plate that secures the laptops and provides a shoulder strap, while a plate with three wheels attaches to the bottom with velcro... in summary, it can go anywhere and can even be run using battery power.

Heterogeneous cluster.
Because Supercomputing '95 also marked the release of our scalable eight-machine TTL_PAPERS 951201 design, all day December 6, 1995, we demonstrated an eight-machine cluster using the obviously heterogeneous combination of the four ValuePoints and the four Aeros. A variation of the multi-voice music demo clearly demonstrated the tight coupling of machines within this cluster.

PAPERS history.
In addition to the TTL_PAPERS units that were operating in our booth, there was a TTL_PAPERS 950801 unit and a wooden plant rack holding various earlier PAPERS prototypes. While we are on the topic, the quality of the woodwork for the PAPERS cabinets was yet again heading the list of comments from visitors to our booth... maybe there is a message there for commercial computer vendors...?

The software side.
On the other side of our booth, just past the circle of chairs gathered around the ValuePoint cluster, were two tables for the software demonstrations for our booth. The KIWI project took one table; the TTL_PAPERS (and TTL_VAPERS simulator) library took the other. We also had a handout on the new giveioperm() system call for secure direct port access under Linux.

PAPERS booth people.
Well, after the description of what we did in our booth, it's kinda nice to have a photo of the folks who made the booth happen. From left to right, the PAPERS booth people are R. Hoare, R. Fisher, T. Mattox, S. Kim, and H. Dietz.

Incidentally, a lot of things are available on-line from the Supercomputing 1995 conference. The complete proceedings, abstracts for the exhibits, etc., are available from http://sc95.sdsc.edu/SC95.


Purdue ECE Exhibit at IEEE/ACM Supercomputing 1996

Supercomputing 1994 and 1995 both were good experiences for us, and we had much more to show for November 18-21, 1996. Because the 1996 conference was in Pittsburgh, which is well within "road trip" range, we decided to bring lots of equipment. We were given a 20' x 20' display area (booth R24), and filled it with as much as we could easily carry in our 15-foot rental truck... but we move our clusters in their racks, so quite a lot of stuff fit.... Our research booth held 37 separate computers and 27 monitors; this was apparently more separate machines and video displays than any other research or commercial exhibitor.

Exhibit Overview.
Because our booth faced a corner of the exhibit hall, we really didn't have a "front side" to our booth. Thus, we set up our area so that people could move from display to display in a circle within the booth. From the left edge of the above photo, we had the 16 PC VGA video wall, SMP demos, 4 PC VGA video wall, history display, Pentium cluster, and CASLE demos.

16 PC VGA Video Wall.
The prime demonstration within our booth was the large video wall constructed as a 4 x 4 array using the VGA displays of 16 PCs. Each machine was connected only to its own VGA display, and the new field-scalable 960801 PAPERS units were the only connection between the machines of the cluster. Our demos ranged from modified "screen savers" that treat the wall as a single display (a Qix-like one is shown above), synchronously drawing each point or line, to an interactive video game in which up to four players battle, each using a mouse to control the leader of their swarm. We also had a variety of pieces of multi-voice music playing across the PC speakers, with each new note given to a processor selected at random. Further, a new, somewhat crude, mini-OS running on top of Linux was used to control the execution of the cluster.

4 PC VGA Video Wall.
The bad news about the 16-machine cluster is that it was built using 386-based machines... including eight 386DX25 PS2 systems that were incredibly slow for floating point (no hardware) and had very slow MCA parallel ports (4us per register access). The four 486DX33 machines of the 4 PC VGA video wall are no speed demons, but they were able to run things like our combined N-body/thermal decay trace simulation in addition to the things we built for the 16 display wall. In the photo, they are running our four-player swarm video game.

Mandelbrot Demos.
Another interesting thing we did for the first time at Supercomputing 1996 was a side-by-side comparison of the new AFAPI (Aggregate Function Application Program Interface) running on both a PAPERS cluster and an SMP Linux box, as shown above. The application we used was a fully dynamically load-balanced shared-memory version of Mandelbrot fractal computation, using AFAPI Replicated Shared Memory to implement a shared display map and an array used to asynchronously claim each scan line. Although AFAPI works well on both systems, the four Pentium 90s in the 960801 cluster were much faster than the two Pentium 100s within the SMP... even about 20% faster per processor. Why? Well, the SMP has significant memory system interference between processors....
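The scan-line claiming scheme described above can be sketched in a few lines. This is not AFAPI code: it is a minimal Python illustration that assumes threads in place of processors, ordinary Python lists in place of AFAPI Replicated Shared Memory, and a lock around the claim array in place of whatever the real implementation used to make claims safe. The names render, claim_next, mandelbrot_row, and the image dimensions are hypothetical.

```python
import threading

WIDTH, HEIGHT, MAX_ITER = 48, 24, 50


def mandelbrot_row(y):
    """Compute escape-time iteration counts for one scan line."""
    row = []
    for x in range(WIDTH):
        c = complex(-2.0 + 3.0 * x / WIDTH, -1.0 + 2.0 * y / HEIGHT)
        z = 0j
        count = 0
        while abs(z) <= 2.0 and count < MAX_ITER:
            z = z * z + c
            count += 1
        row.append(count)
    return row


def render(num_workers=4):
    claimed = [None] * HEIGHT  # which worker claimed each scan line
    image = [None] * HEIGHT    # shared display map
    lock = threading.Lock()

    def claim_next(rank):
        # Atomically claim the first unclaimed scan line, if any remain.
        with lock:
            for y in range(HEIGHT):
                if claimed[y] is None:
                    claimed[y] = rank
                    return y
        return None

    def worker(rank):
        # Dynamic load balancing: a fast worker simply claims more lines.
        while True:
            y = claim_next(rank)
            if y is None:
                return
            image[y] = mandelbrot_row(y)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return image, claimed


if __name__ == "__main__":
    image, claimed = render()
    for row in image:
        print("".join("#" if n == MAX_ITER else "." for n in row))
```

Because no scan line is assigned in advance, the expensive lines near the middle of the set do not leave one processor working while the others sit idle.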

History Display.
Supercomputing '96 had the subtheme of marking the 50th anniversary of the field, so history displays were strongly encouraged. Thus, we expanded the PAPERS history to include not only all previous PAPERS models, but also a write-up summarizing some of Purdue ECE's major contributions to the field of parallel processing. This write-up is available on-line both as HTML and as separate PostScript files for the front and back sides of the one-page handout.

The CASLE Project.
We also demonstrated CASLE on both Linux PCs and DEC Alphas. CASLE is the Compiler/Architecture Simulation for Learning and Experimenting, a teaching tool that allows undergraduate students to develop an understanding of the total system impact of interactions between compiler optimizations and (modestly parallel) architectural features. More information about CASLE, including a live system that you can use with any WWW browser, is available at http://purcell.ecn.purdue.edu/~casle/.

Booth people.
Well, after the description of what we did in our booth, it's kinda nice to have a photo of the folks who made the booth happen. From left to right, the PAPERS booth people are T. Mattox, R. Hoare, R. Fisher, and S. Kim, with the two faculty, H. Dietz and G. Adams, sitting in front. Of course, in addition to us, we would like to thank the over 500 people who visited our research exhibit....

The Polaris Project.
This year, Purdue's presence at Supercomputing was not limited to our exhibit. Prof. Rudy Eigenmann also organized a second 20'x20' research exhibit, for the Polaris source-to-source parallelizing Fortran compiler. More information about the Polaris project is available at http://dynamo.ecn.purdue.edu/~eigenman/polaris/.

Incidentally, a lot of things are available on-line from the Supercomputing 1996 conference. The complete proceedings, abstracts for the exhibits, etc., are available from http://scxy.tc.cornell.edu/sc96/.


The next public demonstration of PAPERS at a conference has not happened yet... but when it does, it will be listed here.


The Aggregate. The only thing set in stone is our name.