Assignment 1: Going Undercover

As described in class, in this project you will write and run a perhaps not-so-simple GA to solve the problem of designing a Fractional Flat Neighborhood Network (FFNN).

What Is A FFNN And How Do I Design One?

A FFNN is a special class of computer network design in which nodes are connected to switches in a pattern that allows as many as possible specified pairs of nodes to meet in at least one switch. In other words, the goal is to find a network design that covers as many as possible of the specified pairs. There are three constraints:

Each switch can have only a modest, fixed, maximum number of Ports that can be used to connect to nodes or other switches, typically, no more than 48. In a FFNN, switches only can be connected to nodes. Thus, this determines the maximum number of nodes that can meet in a particular switch.
Each node can have only a small, fixed, maximum number of Network Interface Cards (NICs), typically, no more than 6. Thus, a node cannot connect to more than this number of switches.
Only the pairs specified count, as opposed to a "universal" FNN in which all pairs are specified.

Constraint 1 is essentially the same constraint used in the standard graph covering problem: the maximum subset size. The second and third constraints are not part of the standard graph covering problem. If we were building a "universal" FNN and did not have the second constraint, the design problem would be precisely the (v,k,t)-covering design problem in which v is the node count, k is the number of ports per switch, and t is 2. In fact, you should browse some of the discussion at the La Jolla Covering Repository to get some insights as to how the SFNN design problem can be approached without using a GA. These are not bad starting points for how to make your GA "smarter."

The GA structure is entirely up to you. In fact, you need not code your GA from scratch; it is perfectly allowable for you to use an existing GA library or tool to create your GA... although I wouldn't recommend that.

Input To Your Program

The input to this program looks a lot like the input to your previous program... in fact, it can be identical. The only problem is that randomly-placed weights are much harder to cover than weights corresponding to common communication patterns in real parallel programs, so don't be surprized if your GA has trouble finding complete coverings for borderline cases. Then again, that's why you're doing an FFNN and not an SFNN: for an SFNN, you have to cover all specified pairs, not just as many as you can. The assignment input can be summarized as differing from the previous project in that:

There will be no more than a total of 1024 nodes
The first command-line argument to your GA must be the maximum number of NICs per node; this number will be 6 or less
The second command-line argument to your GA must be the maximum number of ports per switch; this number will be 64 or less

The mkg.c program from the previous project can make reasonable input... but a different program is being created to generate better designs. For the moment, Tim Mattox has provided a version of his tool called mkpairs.c; it can be used to create the input specifications for this project. For example, to create the design specification for KASY0's network:

Compile mkpairs by cc mkpairs.c -o mkpairs -O2 -lm
Run ./mkpairs 02010770 128 >kasy0spec

The run will give you some debugging output (to stderr) like:

Pattern Generated: 1D torus +/- 1      offsets
Pattern Generated: 2D torus all of row and col ( 16,  8)
Pattern Generated: 3D torus all of row and col (  8,  4,  4)
Pattern Generated: bit-reversal
Pattern Generated: hypercube
PairCt =   1536, Min = 23, Avg =  24.00, Max = 25, flags = 02010770

To design a network suitable for KASY0, you would use the resulting file, kasy0spec, as input and give command-line arguments telling your program that there are 3 NICs/node and 23 ports/switch.

Don't expect your design to match the one we used... that's highly unlikely. Your design doesn't even have to be as good for you to get full credit. You have have to have done something reasonable in the coding of the problem to be solved by a GA.

Your Program's Output

Every weighting table entry is worth some value between 0 and 1. The score of a particular network design is simply the sum of all the weighting pairs it covers. You want to find a wiring pattern for which this sum is maximized. If you find a wiring pattern that has the theoretically maximal value, you should stop immediately -- not continue the search for another generation.

As a minimum, your program should output the score for the best network design it found and the complete wiring pattern in the same format we have described all along: each line starts with the switch number followed by a colon and the list of node numbers that are connected to that switch. For example, KASY0's actual SFNN design is:

0: 0 15 16 31 32 47 48 63 64 79 80 95 96 111 112 120 121 122 123 124 125 126 127
1: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 24 40 56 72 88 104 120
2: 7 23 39 55 71 87 103 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
3: 2 10 18 26 33 34 35 37 42 43 44 45 46 50 58 66 74 82 90 98 106 114 122
4: 5 13 21 29 37 45 53 61 69 77 81 83 84 85 86 89 92 93 94 101 109 117 125
5: 1 9 17 25 33 41 49 57 64 65 67 68 70 73 75 76 78 81 89 97 105 113 121
6: 4 12 16 18 19 20 22 23 27 28 30 36 44 52 60 68 76 84 92 100 108 116 124
7: 3 11 19 27 35 43 51 59 67 75 83 91 96 98 99 102 103 104 107 109 110 115 123
8: 6 14 22 30 38 46 48 50 53 54 55 56 59 60 62 70 78 86 94 102 110 118 126
9: 32 36 38 39 40 41 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
10: 2 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 69 71 72 73 74 79
11: 64 65 66 67 68 69 70 71 72 74 75 76 77 78 79 80 82 87 88 90 91 93 95
12: 5 17 20 21 24 26 29 31 80 81 82 83 84 85 86 87 88 89 90 91 92 94 95
13: 7 8 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 100 101 105 111
14: 66 73 77 85 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 117 119
15: 49 51 52 54 57 58 61 62 63 65 93 97 99 106 107 108 113 114 115 116 118 119 127
16: 0 1 3 4 6 8 9 10 11 12 13 14 25 28 34 42

Although KASY0's design is given with nodes connected to each switch listed in increasing node number order, your listing need not order node numbers within each switch.

Hints

Here's a good approach:

Think about the tricks mentioned in class and the non-GA methods discussed in the paper New Constructions for Covering designs. From these ideas, derive a basic geneome representation.
Write code to handle your geneome representation. You at least need functions to evaluate the merit (score) of a geneome and to meaningfully display the corresponding pheneome (the network wiring pattern output described above).
Build yourself a dumb exhaustive search using that geneome representation and evaluation function. You will not be able to use this with large test cases, but it will be very useful in making sure at least small test cases work well with the GA you'll be building next....
Build the GA. Don't be too clever about your crossover and mutation, just do the basics.
Test your GA against the exhaustive search for some trivially small cases. Keep in mind that the GA shouldn't necessarily get the best answer every time, nor is it likely to be faster than the exhaustive search on small tests. In short, make sure your GA works like you think it does.
Try running the KASY0 design problem and/or other larger problems.
If the GA doesn't do well on the big test cases, it's time to start thinking about different population makeups (e.g., changing population size, fraction crossover or mutation, etc.), generic GA tricks like using island populations, and smarter crossover and mutation operators. Make an appropriately modified version of the GA and repeat from step 5 until you're happy with the results.

http://aggregate.org/EC/ (Advanced) Evolutionary Computing

Assignment 1: Going Undercover

What Is A FFNN And How Do I Design One?

Input To Your Program

Your Program's Output

Hints

Due Dates, Submission Procedure, & Such