You've probably heard that a disturbingly large number of parallel supercomputer manufacturers are in financial trouble, and some people might even say that parallel processing is dead. It isn't. What is true is that you cannot just go off for a few years building complex custom hardware for scientific applications and still be competitive with commodity computer technology.
In contrast, clusters or NOWs (Networks Of Workstations) seem to be a very cost-effective way to get lots of parallel compute power. However, a NOW using a conventional network really only gives you CPU power and block-transfer bandwidth. Traditional supercomputers give their processors a vastly superior ability to closely coordinate their operation, perhaps even offering SIMD and VLIW execution modes.
As discussed in the Aggregate Function Network: Architecture & Theory page, our primary interest was in further improving the global coordination capability of parallel supercomputers. The problem was that, even with Purdue's tradition of building custom parallel hardware, we realized that we couldn't build a worthwhile full custom machine. And so it was that PAPERS came to look like a cluster or NOW... but it really isn't.
There are a variety of other projects working to reduce latency by building custom networks for clusters/NOWs supporting traditional message-passing, but that isn't what PAPERS is. We have no problem with using conventional network hardware for block transmission; it was, after all, designed for that. Thus, all that PAPERS adds to a cluster/NOW is support for tightly-coupled parallel processing... which a conventional-network cluster or NOW otherwise lacks even more desperately than most parallel supercomputers do.
In terms of programming, a PAPERS cluster behaves like a tightly-coupled parallel supercomputer. It just happens to be built by taking a conventional cluster or NOW and adding a PAPERS unit.
WAPERS, the Wired-AND Adapter for Parallel Execution and Rapid Synchronization, is a moderately-scalable fully passive network. In other words, this network hardware uses no active components, and consists entirely of a wiring pattern that takes advantage of SPP open-collector outputs to implement wired-AND logic. Two complete alternative designs for WAPERS are described in detail (.pdf, .ps, .ps.gz, .html).
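To make the wired-AND idea concrete, here is a minimal C sketch that simulates it in software. It does not touch a parallel port and is not taken from the WAPERS designs or from AFAPI: the function names wired_and and barrier_complete, the 8-line bus width, and the use of bit 0 as a barrier line are all assumptions chosen for illustration. The only property it models is that an open-collector output can pull a shared line low but never drive it high, so each line reads high only when every node has released it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simulated wired-AND bus (hypothetical helper, not part of AFAPI).
 * outputs[i] holds the 8 line states node i presents: a 0 bit means
 * "pull this line low", a 1 bit means "release this line".
 * The value every node would read back is the AND of all outputs.
 */
uint8_t wired_and(const uint8_t *outputs, size_t n_nodes)
{
    uint8_t bus = 0xFF; /* pull-ups hold every line high until some node pulls it low */
    for (size_t i = 0; i < n_nodes; i++)
        bus &= outputs[i];
    return bus;
}

/*
 * Illustrative barrier test (assumed convention): each node holds
 * line 0 low until it reaches the barrier, then releases it. The
 * barrier is complete once line 0 reads high, which can only happen
 * after every node has released it.
 */
int barrier_complete(const uint8_t *outputs, size_t n_nodes)
{
    assert(n_nodes > 0);
    return (wired_and(outputs, n_nodes) & 0x01) != 0;
}
```

In this model, with four nodes where one still holds line 0 low, barrier_complete reports 0; once all four have released the line, it reports 1. No node needs to know which of the others was late, which is why plain wiring is enough to compute the result.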
WAPERS supports the full user-level AFAPI, and WAPERS AFAPI is
included in the unified AFAPI distribution.
The primary disadvantages of WAPERS are that it offers lower
performance than TTL_PAPERS, does not scale to very large clusters,
and can fry your port hardware if things are not configured
correctly. However, this is the simplest way to connect a
cluster containing up to about 8 machines.
The cable shown in the photo is what we call CAPERS, the Cable Adapter for Parallel Execution and Rapid Synchronization. Although CAPERS differs from a standard "LapLink" cable in that it makes use of several additional ground wires, a "LapLink" cable can be used instead.
CAPERS is designed to passively connect the parallel ports of two PCs
or workstations. Using just this cable, it is not possible to
implement the OS interrupt handling support found in TTL_PAPERS;
however, it is possible to efficiently implement the complete
user-level AFAPI for the special case of just two machines.
Because WAPERS is no more complex to build and
allows larger clusters, CAPERS is now essentially obsolete.
The unit shown in the photo isn't the most sophisticated PAPERS version, but it is by far the most popular. Four processors are enough for reasonable experiments, and yet the unit is simple enough to be built in a day or two (well, at least we can do it that fast ;-). Although the unit uses just 8 TTL chips, it provides the complete TTL_PAPERS functionality.
Although we have built two other types of modularly-scalable eight-processor TTL_PAPERS units, and those designs are fully operational, we no longer recommend them for new construction. The TTL_PAPERS 960801 four-processor module, shown in its test mounting, is functionally equivalent, but is somewhat easier to build, field scalable, and a lot easier to debug.
There were a variety of delays in getting the full documentation
written, formatted, and posted, but the 960801 design has been solid
since late 1996, and we use it extensively at Purdue. The 960801 hardware documentation is
available only as HTML with links to figures rather than in-line
figures, so be sure to print copies of the figures you need as well
as the body of the HTML document.