As described in class, in this project you will modify the "dumber" compiler distributed as source code so that it is slightly less "dumb" by incorporating a variety of simple, local optimizations. This dumber compiler is available as a TAR file or as the individual pieces (better for WWW browsing):

- Makefile is the make file.
- tup.h is the header file.
- tup1.c is the main program.
- tup2.c is the parser.
- tup3.c is the lexical analyzer.
- tup4.c is the symbol table.
- tup5.c is the code generation code; this is the only part you'll really need to modify.
Many simplifications have been made to keep the project down to a manageable size. One simplification is that your program will output code triples, not assembly language instructions. Further, the parser accepts only a small subset of C involving int scalars and one-dimensional arrays. Perhaps the biggest simplification is that efficiency in your analysis routines is not required -- it is fine to use linear searches throughout this project. As you will see, basic blocks are typically quite small, so linear searches are not really all that inefficient.
You have been given a parser that calls a few functions which are expected to generate triples. These functions are also given to you -- but the versions supplied do not perform any analysis or optimization; that is your task. Toward that end, you are also given the needed algorithms (in a somewhat imperfect form).
This project has been scaled down tremendously, but it still isn't trivial. Since you are given a functional program that generates "dumb" triples, you should enhance it in small increments, testing it as you go.
First, using the algorithms presented below and the non-optimizing versions provided, modify or create C functions to perform the flow-independent analysis and optimization. The basic block forward optimizations come next, then the backward ones. Upgrade one function at a time, preferably one step at a time; debugging your code will be very difficult if you do not test it incrementally.
Please do not submit a program that "does" lots of stuff but doesn't even compile -- you'll get a much better grade for a project that does less, but really does it.
You are not writing this project from scratch. You are provided with algorithms, data structures, support book-keeping functions, and an example.
The output of the parser is a sequence of calls that create tuple code. The form of these calls is described in the following table. The t variables are tuple * values, which also serve as value numbers: the value generated by a tuple is numbered by the address of that tuple within the compiler. The address of a variable's entry in the symbol table is represented by v, which is of type var *. Symbolic labels used to link blocks together are represented by int values denoted l. An int constant value is represented by c.
| Opcode | Tuple Notation | Effect (C Code) | Generated By Calling |
|---|---|---|---|
| ADD | add(t1, t2) | t1+t2 | binop(ADD, t1, t2) |
| SUB | sub(t1, t2) | t1-t2 | binop(SUB, t1, t2) |
| AND | and(t1, t2) | t1&t2 | binop(AND, t1, t2) |
| OR | or(t1, t2) | t1|t2 | binop(OR, t1, t2) |
| XOR | xor(t1, t2) | t1^t2 | binop(XOR, t1, t2) |
| GT | gt(t1, t2) | t1>t2 | binop(GT, t1, t2) |
| GE | ge(t1, t2) | t1>=t2 | binop(GE, t1, t2) |
| EQ | eq(t1, t2) | t1==t2 | binop(EQ, t1, t2) |
| SSL | ssl(t1, t2) | t1<<t2 | binop(SSL, t1, t2) |
| SSR | ssr(t1, t2) | t1>>t2 | binop(SSR, t1, t2) |
| CONST | const(c) | c | cop(c) |
| LD | ld(v) | v | ldop(v) |
| LDX | ldx(v, t) | v[t] | ldxop(v, t) |
| ST | st(v, t) | v=t | stop(v, t) |
| STX | stx(v, t1, t2) | v[t1]=t2 | stxop(v, t1, t2) |
| LAB | label(l) | l: | labop(l) |
| SEL | sel(t, l1, l2) | if (t) goto l1; else goto l2; | selop(t, l1, l2) |
| KILL | kill(v) | end of declared scope for v | killop(v) |
Notice that label(l) and kill(v) are not really executable operations, but rather provide information derived directly from the source program.
The following local optimizations are to be performed. Notice that some operations are commutative, in which case the operand order should not hamper the local optimization. As discussed in class, these cases can best be dealt with by imposing a normalized order.
| Initial Operation | Optimized Result | Why? |
|---|---|---|
| add(t, const(0)) | t | arithmetic identity |
| add(t1, t2) where t1==t2 | ssl(t1, const(1)) | reduction in strength |
| sub(t, const(0)) | t | arithmetic identity |
| sub(t1, t2) where t1==t2 | const(0) | arithmetic identity |
| and(t, const(0)) | const(0) | arithmetic identity |
| and(t, const(-1)) | t | arithmetic identity |
| and(t1, t2) where t1==t2 | t1 | arithmetic identity |
| or(t, const(0)) | t | arithmetic identity |
| or(t, const(-1)) | const(-1) | arithmetic identity |
| or(t1, t2) where t1==t2 | t1 | arithmetic identity |
| xor(t, const(0)) | t | arithmetic identity |
| xor(t1, t2) where t1==t2 | const(0) | arithmetic identity |
| gt(t1, t2) where t1==t2 | const(0) | arithmetic identity |
| ge(t1, t2) where t1==t2 | const(1) | arithmetic identity |
| eq(t1, t2) where t1==t2 | const(1) | arithmetic identity |
| ssl(t, const(0)) | t | arithmetic identity |
| ssl(const(0), t) | const(0) | arithmetic identity |
| ssr(t, const(0)) | t | arithmetic identity |
| ssr(const(0), t) | const(0) | arithmetic identity |
| ssr(const(-1), t) | const(-1) | arithmetic identity |
| ldx(v, const(0)) | ld(v) | normalization |
| stx(v, const(0), t) | st(v, t) | normalization |
| sel(const(0), l1, l2) | sel(-, l2, l2) | normalization |
| sel(const(k), l1, l2) where k!=0 | sel(-, l1, l1) | normalization |
Any binary operation in which both arguments are constant-valued should be folded at compile time into a single constant. The operation table defines how to do this: simply execute the C code in the table which describes the effect. For example, add(const(601), const(94)) would become const(601 + 94), or const(695).
TBA.