EE699 Assignment 1a: Sort-Of Interesting

In this project, you will write a C program using LAM-MPI to run on a cluster supercomputer. The input to your program will consist of pairs of integers: key value and datum. Your task is simply to sort the pairs by key value into increasing order and output the sorted result. The goal is simply to achieve the best possible speedup. Your sort need not be stable, i.e., it need not maintain elements with duplicate keys in their original order.

Context

The input to your program will consist of a series of 32-bit binary integers (int values) in a completely raw binary format. If the data fit in memory, then, given the following C struct:

typedef struct { int key, datum; } item;
item a[MAXSORT];

Your program could read the entire key-datum pair array from the standard input by:

read(0, &a, (MAXSORT * sizeof(a[0])));

Of course, you should ensure that only one processor is reading the standard input. Your program's output is sent to the standard output in a very similar way:

write(1, &a, (MAXSORT * sizeof(a[0])));

To further simplify things, you will be given a few values for important constants:

MAXSORT: This will be the maximum number of key-datum pairs which your program will be given... which is precisely how many pairs it actually will be given. You may assume that MAXSORT is always a power of 2.
NPROC: This will be the number of processors that your program can use (i.e., the number of uniprocessor nodes in the cluster).
MAXMEM: This will be the approximate total number of bytes of memory available in each node. Note that MAXSORT is selected for 599 students to be such that all key-datum pairs fit in memory of a single system, whereas 699 students will find that MAXSORT may be too large. Note that the value of MAXMEM is not necessarily the same as the system limit enforced by Linux, but your program plus data memory use should be kept within this bound.

The Helper

The program sorthelp.c is provided to help you with this project. This helper code is actually four different programs, which you compile by:

gcc sorthelp.c -o binary -O2
ln -f binary check
ln -f binary generate
ln -f binary view

The above compilation trick gives you four different names for the same executable file... which looks at the name it was executed with in order to determine its function. This same trick is used for lots of unix programs; for example, gzip, gunzip, and zcat are actually the same program. Here's an overview of the functionality of the project helper:

binary maxsort: Convert an ASCII, human-readable, list of maxsort pairs of integers into the binary form used for this assignment. The pairs are read from the standard input and the binary result is placed on the standard output.
check maxsort: Read maxsort pairs of binary-coded values from the standard input and complain if they are not in sorted order. The complaint is in the form of an error message printed to the standard error output and the program terminating with a non-zero exit code.
generate maxsort seed: Generate a list of maxsort binary-coded pairs to the standard output. The pairs have well-randomized 32-bit key values, but the datum value simply increments from 0. Note that maxsort is actually a 64-bit value, so the 32-bit datum fields will have duplicate values if maxsort is large enough; duplicate datum fields do not imply duplicate key-datum pairs. The seed value is optional and is used to ensure that the exact same random sequence is generated for multiple runs of generate; not specifying a seed value causes generate to use the number of seconds since the start of 1970 as the seed.
view maxsort: The opposite of binary, view converts a binary-coded list of maxsort pairs of integers into an ASCII, human-readable, form with one pair of values per line. The pairs are read from the standard input and the ASCII result is placed on the standard output.

Normally, all you should need is generate and check. However, the other programs allow you to do rather neat things in terms of using various unix tools. For example, the standard text-based unix sort utility can be used via a pipe like:

./generate 1000 |./view 1000 |sort -n |./binary 1000 |./check 1000

In other words, a very low-performance, sequential, simulation of your program for the case of MAXSORT being 1,000 is simply:

./view 1000 |sort -n |./binary 1000

Documentation and Other Administrivia

Test your program using LAM on a single-processor system, e.g., in the 577 FPAT lab. Good test parameters (for cases that are small enough for you to examine by hand) are:

#define	MAXSORT	1024
#define	NPROC	4
#ifdef	EE599
#define	MAXMEM	32768
#else
#define	MAXMEM	4096
#endif

Notice that the memory space per logical processor is plenty for EE599, but only about half what you'd like for EE699.

Once you are confident that your code is functional, request access to KLAT2. At that time, you'll be given a specific test case to benchmark on KLAT2. Aside from performance tweaks, your code shouldn't need any changes to run on KLAT2. You do that using this set of instructions with the following parameters:

#define	MAXSORT	(24*1024*1024)
#define	NPROC	64
#ifdef	EE599
#define	MAXMEM	(256*1024*1024)
#else
#define	MAXMEM	(128*1024*1024)
#endif

So, the undergraduate and graduate students have the same problem to solve, but graduate students have less memory... Alzhiemer's kicks-in pretty early, eh?

What you will hand-in for this project is a brief "implementor's notes" file (psort.html) and source and make files to construct and run an executable named psort using LAM MPI. In your write-up, be sure to document the speed you get sorting the test case given to you to run on KLAT2.

EE699 Assignment 1a: Sort-Of Interesting

Context

The Helper

Documentation and Other Administrivia

Submission Form