Assignment 0: Basic Block Optimizer

A key difference between toy and production-quality compilers is the use of optimization techniques which require some level of flow-analysis within the code generators of production compilers. In this project, you will extend the BB compiler (BB.tgz) that we've been working through with routines implementing a basic block optimizer that:

Performs a variety of simple optimizations which do not require flow analysis.
Performs constant folding with value propagation.
Recognizes and eliminates redundant (common) subexpressions.

Your next project will add the backward flow stuff -- dead code elimination, automatic parallelization, and code scheduling. It will build on this, but one step at a time.... ;-)

Many simplifications have been made to keep the project down to a manageable size. One simplification is that your program will output code triples, not assembly language instructions. Further, the parser being used accepts groups of assignment statements involving only simple variables a through z and integers, with groups of assignments enclosed in {} -- each enclosed group of assignments is to be considered a separate basic block. Perhaps the biggest simplification is that efficiency in your analysis routines is not required -- it's fine to do linear searches for things in this project. (As you will see, basic blocks are typically quite small, hence linear searches are not really all that inefficient.)

To begin, you were supplied with a parser that calls a few functions which are expected to generate triples. We'll be using variants of this throughout this course, although I might replace the front end, it will have almost no impact on the back-end things you're doing. The few functions generating tuples are also given to you -- but the versions supplied do not perform any analysis nor optimization: that is your task. Towards that, you are also given the needed algorithms here -- in a somewhat imperfect form.

Development Procedure

This project has been scaled-down tremendously, but it still isn't trivial. Since you are given a functional program which generates "dumb" triples, you should try to refine it in small increments, testing it as you go. Using the algorithms presented below and the non-optimizing versions provided, modify/create C functions to:

Generate triples and incrementally perform the block optimizations discussed above
Upon completion of a block, output the optimized set of triples.

Please do not submit a program that "does" lots of stuff but doesn't even compile -- you'll get a much better grade for a project that does less, but really does it.

Stuff To Know About

You are not writing this project from scratch. You are provided with algorithms, data structures, support book-keeping functions, and an example. The output of the parser is a sequence of calls to create tuple code. the form of the calls is described in the following table. The t variables are tuple * values, also used as value numbers: the value generated by a tuple is numbered by the address of that tuple within the compiler. The address of a variable entry in the symbol table is represented by v, which is of type var *. Symbolic labels used to link blocks together are represented by int values denoted as l. An int constant value is represented by c.

Opcode Tuple Notation Effect Generate by Calling

ADD add(t1, t2) t1 + t2 binop(ADD, t1, t2)

SUB sub(t1, t2) t1 - t2 binop(SUB, t1, t2)

AND and(t1, t2) t1 & t2 binop(AND, t1, t2)

OR OR(t1, t2) t1 | t2 binop(OR, t1, t2)

XOR xor(t1, t2) t1 ^ t2 binop(XOR, t1, t2)

GT gt(t1, t2) t1 > t2 binop(GT, t1, t2)

GE ge(t1, t2) t1 >= t2 binop(GE, t1, t2)

EQ eq(t1, t2) t1 == t2 binop(EQ, t1, t2)

SSL ssl(t1, t2) t1 << t2 binop(SSL, t1, t2)

SSR ssr(t1, t2) t1 >> t2 binop(SSR, t1, t2)

CONST const(c) c cop(c)

LD ld(v) v ldop(v)

LDX ldx(v, t) v[t] ldxop(v, t)

ST st(v, t) v = t stop(v, t)

STX stx(v, t1, t2) v[t1] = t2 stxop(v, t1, t2)

LAB label(l) l: labop(l)

SEL sel(t, i1, i2) goto (t ? l1 : l2) selop(t, l1, l2)

KILL kill(v) end of v's declared scope killop(v)

Opcode	Tuple Notation	Effect	Generate by Calling
`ADD`	`add(t1, t2)`	`t1 + t2`	`binop(ADD, t1, t2)`
`SUB`	`sub(t1, t2)`	`t1 - t2`	`binop(SUB, t1, t2)`
`AND`	`and(t1, t2)`	`t1 & t2`	`binop(AND, t1, t2)`
`OR`	`OR(t1, t2)`	`t1 \| t2`	`binop(OR, t1, t2)`
`XOR`	`xor(t1, t2)`	`t1 ^ t2`	`binop(XOR, t1, t2)`
`GT`	`gt(t1, t2)`	`t1 > t2`	`binop(GT, t1, t2)`
`GE`	`ge(t1, t2)`	`t1 >= t2`	`binop(GE, t1, t2)`
`EQ`	`eq(t1, t2)`	`t1 == t2`	`binop(EQ, t1, t2)`
`SSL`	`ssl(t1, t2)`	`t1 << t2`	`binop(SSL, t1, t2)`
`SSR`	`ssr(t1, t2)`	`t1 >> t2`	`binop(SSR, t1, t2)`
`CONST`	`const(c)`	`c`	`cop(c)`
`LD`	`ld(v)`	`v`	`ldop(v)`
`LDX`	`ldx(v, t)`	`v[t]`	`ldxop(v, t)`
`ST`	`st(v, t)`	`v = t`	`stop(v, t)`
`STX`	`stx(v, t1, t2)`	`v[t1] = t2`	`stxop(v, t1, t2)`
`LAB`	`label(l)`	`l:`	`labop(l)`
`SEL`	`sel(t, i1, i2)`	`goto (t ? l1 : l2)`	`selop(t, l1, l2)`
`KILL`	`kill(v)`	end of `v`'s declared scope	`killop(v)`

Notice that label(l) and kill(v) are not really executable operations, but rather provide information derived directly from the source program.

Local Optimizations

The following local optimizations are to be performed. Notice that some operations are commutative, in which case the operand order should not hamper the local optimization.

Initial Operation Optimized Result

add(t, const(0)) t

add(t1, t2) where t1==t2 ssl(t1, 1)

sub(t, const(0)) t

sub(t1, t2) where t1==t2 const(0)

and(t, const(0)) const(0)

and(t, const(-1)) t

and(t1, t2) where t1==t2 t1

or(t, const(0)) t

or(t, const(-1)) const(-1)

or(t1, t2) where t1==t2 t1

xor(t, const(0)) t

xor(t1, t2) where t1==t2 const(0)

gt(t1, t2) where t1==t2 const(0)

ge(t1, t2) where t1==t2 const(1)

eq(t1, t2) where t1==t2 const(1)

ssl(t, const(0)) t

ssl(const(0), t) const(0)

ssr(t, const(0)) t

ssr(const(0), t) const(0)

ldx(v, const(0)) ld(v)

stx(v, const(0), t) st(v, t)

sel(const(k), l1, l2) where k!=0 sel(-, l1, l1)

sel(const(0), l1, l2) where k!=0 sel(-, l2, l2)

Initial Operation	Optimized Result
`add(t, const(0))`	`t`
`add(t1, t2)` where `t1==t2`	`ssl(t1, 1)`
`sub(t, const(0))`	`t`
`sub(t1, t2)` where `t1==t2`	`const(0)`
`and(t, const(0))`	`const(0)`
`and(t, const(-1))`	`t`
`and(t1, t2)` where `t1==t2`	`t1`
`or(t, const(0))`	`t`
`or(t, const(-1))`	`const(-1)`
`or(t1, t2)` where `t1==t2`	`t1`
`xor(t, const(0))`	`t`
`xor(t1, t2)` where `t1==t2`	`const(0)`
`gt(t1, t2)` where `t1==t2`	`const(0)`
`ge(t1, t2)` where `t1==t2`	`const(1)`
`eq(t1, t2)` where `t1==t2`	`const(1)`
`ssl(t, const(0))`	`t`
`ssl(const(0), t)`	`const(0)`
`ssr(t, const(0))`	`t`
`ssr(const(0), t)`	`const(0)`
`ldx(v, const(0))`	`ld(v)`
`stx(v, const(0), t)`	`st(v, t)`
`sel(const(k), l1, l2)` where `k!=0`	`sel(-, l1, l1)`
`sel(const(0), l1, l2)` where `k!=0`	`sel(-, l2, l2)`

Constant Folding

Any binary operation in which both arguments are constant-valued should be folded at compile time into a single constant. The operation table defines how to do this: simply execute the C code in the table which describes the effect. For example, add(const(601), const(94)) would become const(601 + 94), or const(695).

Algorithms

As we have already mentioned, there are many different kinds of basic block optimizations, only a few of which you will implement in this project. We are also ignoring the simple local improvements discussed earlier -- e.g., the various peephole and other simple optimizations. The following algorithms are presented as rough guidelines for your functions. These are deliberately incomplete and imperfect... you must decide where and how to do what is missing or ambiguous.

`binop` -- process a binary operation

tuple *
binop(opcode o, tuple *t0, tuple *t1)
{
  /* simplify later matching by normalizing operand order */
  if (o is commutative) {
    normalize order of t0 and t1;
  }

  /* local special-case optimizations */
  if (local optimization applies) {
    perform local optimization;
    return(ptr. to result tuple);
  }

  /* constant folding */
  if (both args are constants) {
    fold the expression into a single constant value;
    return( cop(the resulting value) );
  }

  /* common subexpression elimination */
  if (equivalent tuple is available) {
    return(ptr. to available tuple);
  }

  /* some things just can't be optimized away...  yet */
  make a new tuple for o(t0, t1);
  mark tuple as non-constant-valued;
  assign the tuple a new value number;
  return(ptr. to this tuple);
}

`cop` -- process a constant

tuple *
cop(int c)
{
  /* constant pooling (common subexpression elimination) */
  if (equivalent tuple is available) {
    return(ptr. to available tuple);
  }

  /* some things just can't be optimized away...  yet */
  make a new tuple for const(c);
  mark tuple as non-constant-valued;
  assign the tuple a new value number;
  return(ptr. to this tuple);
}

`ldop` -- process a simple load

tuple *
ldop(var *v)
{
  /* value propagation */
  if (current value of var[0] is an available tuple) {
    return(ptr. to available tuple);
  }

  /* some things just can't be optimized away...  yet */
  update symbol table entry for v[0];
  make a new tuple for ld(v);
  mark tuple as non-constant-valued;
  assign the tuple a new value number;
  return(ptr. to this tuple);
}

`ldxop` -- process an indexed load

tuple *
ldxop(var *v, tuple *t)
{
  /* value propagation */
  if (current value of var[t] is an available tuple) {
    return(ptr. to available tuple);
  }

  /* some things just can't be optimized away...  yet */
  update symbol table entry for v[t];
  make a new tuple for ldx(v, t);
  mark tuple as non-constant-valued;
  assign the tuple a new value number;
  return(ptr. to this tuple);
}

`stop` -- process a simple store

tuple *
stop(var *v, tuple *t)
{
  /* symbol table stuff... */
  delete all v[t0] entries where t0 is not constant;
  update symbol table entry for v[0];

  /* some things just can't be optimized away...  yet */
  update symbol table entry for v:
  if (there exist entries for v[t] where t is not constant) {
    delete entry for v[t];
  }
  make a new tuple for st(v, t);
  mark tuple as non-constant-valued;
  assign the tuple a new value number;
  return(tuple ptr. t);
}

`stxop` -- process an indexed store

tuple *
stxop(var *v, tuple *t1, tuple *t2)
{
  /* symbol table stuff... */
  if (t1 is a constant) {
    delete all v[t] entries where t is not constant;
  } else {
    delete all v[t] entries;
  }
  update symbol table entry for v[t1];

  /* some things just can't be optimized away...  yet */
  make a new tuple for stx(v, t1, t2);
  mark tuple as non-constant-valued;
  assign the tuple a new value number;
  return(tuple ptr. t2);
}

`labop` -- process a label definition

tuple *
labop(int l)
{
  /* some things just can't be optimized away...  yet */
  make a new tuple for label(l);
  update label table entry for l;
  return(ptr. to this tuple);
}

`selop` -- process a select (conditional jump)

tuple *
selop(tuple *t, int l1, int l2)
{
  /* local special-case optimizations */
  if (sel tests a constant-valued tuple) {
    make t = NULL; /* empty marker */
    replicate the appropriate label;
  }

  /* some things just can't be optimized away... yet */
  make a new tuple for sel(t, l1, l2);
  return(ptr. to this tuple);
}

`killop` -- process an end-of-scope

tuple *
killop(var *v)
{
  /* not really an operation for forward flow.... */
  update symbol table entry for v to say v is dead;
  make a new tuple for kill(v);
  return(ptr. to this tuple);
}

Tuple Structure

The basic tuple structure is:

#define tuple struct _tuple
tuple {
  tuple *next, *prev; /* sequential order links */
  opcode oarg;        /* tuple opcode */
  tuple *targ[2];     /* tuple arguments */
  konst carg;         /* constant argument */
  label larg[2];      /* label arguments */
  var *varg;          /* variable name argument */
};

where the use of fields is obvious from the tuple descriptions given earlier.

All the tuple code generated will be linked into a circular linked list whose permanent header node is code. Upon end of input, the tuples linked into the code list are automatically printed.

The Symbol Table

In the above, there are several references to the handling of symbol table entries. Symbol table entries are described by var * values which directly access them. A partial definition of the var structure is:

#define var struct _var
var {
  char *text;  /* pointer to string */
  int type;    /* token type of variable (e.g., KIF) */
  int dim;     /* size of dimension (1 means scalar) */
  ...
};

It will be necessary for you to add fields to this structure so that your tracking of value numbers (i.e., tuple pointers) for particular subscripts can be accommodated. Notice that dim will be 1 for a scalar -- i.e., int a[1]; and int a; make identical var entries.

Scoping is handled automatically and, in anticipation of the global optimizer, var entries are never deleted. You may assume that the var * values passed to your routines are always valid.

Details For The Implementor's Notes

You should be submitting an implementor's notes document with your project. The implementor's notes do not need to repeat anything in the assignment handout, nor do they need to be very verbose -- I doubt you'll need more than one page for this assignment. All it should cover is any implementation choices you made that aren't obvious from looking at the code. Think in terms of what things you'd want to know to read, understand, and maintain your code if you were looking at this project 6 months from now.

Also be sure to document any problems or bugs your code has. There's a joke in the UNIX world that "a documented bug is called a feature" -- that's not as wrong as it sounds. So, any problem your project has, I'll only take half as many points off if that issue is documented. In fact, if all you turn in is a document saying "nothing works because I didn't get around to doing the project" I'll give you 55%: 10% for having submitted an implementor's notes document plus half of the remaining possible 90%. Only your last submission before the deadline will be graded, so perhaps you should start by immediately making and submitting your implementor's notes document?

For later projects, I want you to follow the formatting guidelines described here for CPE480 in producing a PDF document, however, I'll be flexible for this first assignment. Even a plain text file would be acceptable.

Due Dates

The due date for this assignment is 11:59PM March 15, 2021. You may submit as many times as you wish, but only the last submission that you make before the final deadline will be counted toward your course grade.

Submission Procedure

For each project, you will be submitting a tarball (i.e., a file with the name ending in .tar or .tgz) that contains all things relevant to your work on the project. Minimally, each project tarball includes the source code for the project and a semi-formal "implementors notes" document as a PDF named notes.pdf. It also may include test cases, sample output, a make file, etc., but should not include any files that are built by your Makefile (e.g., no binary executables). Make it very clear in your implementor's notes exactly what each file is.

Submit your tarball below. The file can be either an ordinary .tar file created using tar cvf file.tar yourprojectfiles or a compressed .tgz file file created using tar zcvf file.tgz yourprojectfiles. Be careful about using * as a shorthand in listing yourprojectfiles on the command line, because if the output tar file is listed in the expansion, the result can be an infinite file (which is not ok).

Optimizing Compilers

Assignment 0: Basic Block Optimizer

Development Procedure

Stuff To Know About

Local Optimizations

Constant Folding

Algorithms

binop -- process a binary operation

cop -- process a constant

ldop -- process a simple load

ldxop -- process an indexed load

stop -- process a simple store

stxop -- process an indexed store

labop -- process a label definition

selop -- process a select (conditional jump)

killop -- process an end-of-scope