SIMDC'12 Assembler

The assembler project is fairly straightforward. Although this project looks very different from the compiler, it can share the vast majority of its code with the parser. The main difference is that expressions are evaluated to produce integer constants rather than translated into code; integer values can easily be passed back as the return values of the expression parsing routines.
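As a rough illustration of passing values back through return values, here is a minimal sketch in C. It covers only a subset of the expression grammar given in this document (the expr5, expr6, and expr8 levels, without WORD lookup), so a parenthesized expression re-enters at expr5 rather than at the top-level expr rule. The global cursor `src`, the helper `skip_blanks`, the entry point `eval`, and the use of C-style number literals via `strtol` are assumptions made for the example, not part of the assignment.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical cursor into the buffered source text. */
static const char *src;

static int expr5(void);

static void skip_blanks(void)
{
    while (*src == ' ' || *src == '\t') ++src;
}

/* expr8 subset: NUMBER, unary '-', '!', '~', or '(' ... ')'. */
static int expr8(void)
{
    skip_blanks();
    if (isdigit((unsigned char)*src)) {
        char *end;
        long v = strtol(src, &end, 0);  /* assumes C-style literals */
        src = end;
        return (int)v;
    }
    if (*src == '-') { ++src; return -expr8(); }
    if (*src == '!') { ++src; return !expr8(); }
    if (*src == '~') { ++src; return ~expr8(); }
    if (*src == '(') {
        ++src;
        int v = expr5();                /* subset: not the full expr rule */
        skip_blanks();
        if (*src == ')') ++src;
        return v;
    }
    return 0;  /* a real assembler would report a syntax error here */
}

/* expr6: expr8 ('*' expr8)* */
static int expr6(void)
{
    int v = expr8();
    for (;;) {
        skip_blanks();
        if (*src != '*') return v;
        ++src;
        v *= expr8();
    }
}

/* expr5: expr6 (('+' | '-') expr6)* */
static int expr5(void)
{
    int v = expr6();
    for (;;) {
        skip_blanks();
        if (*src == '+')      { ++src; v += expr6(); }
        else if (*src == '-') { ++src; v -= expr6(); }
        else return v;
    }
}

/* Evaluate the expression at the start of text and return its value. */
int eval(const char *text)
{
    assert(text != NULL);
    src = text;
    return expr5();
}
```

With this sketch, `eval("2+3*4")` returns 14 and `eval("-(1+2)*~0")` returns 3. The remaining precedence levels would follow the same loop pattern, each calling the next-tighter level.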

The AIK sample solution provides a good reference to compare your C-coded assembler against. However, it is a little more complex than your assembler needs to be. The following (loose) grammar describes the input your assembler must handle.

prog: ({stat} '\n')*
    ;

stat: WORD (({':'} {inst}) | (("EQU" | "SET") expr))
    | inst
    | "MONO"
    | "POLY"
    | "CODE"
    | "IF" expr
    | "IFREF" WORD
    | "ELSE"
    | "END"
    ;

inst: "AND" reg ',' reg ',' reg
    | "OR"  reg ',' reg ',' reg
    | "XOR" reg ',' reg ',' reg
    | "LT" reg ',' reg ',' reg
    | "MUL"  reg ',' reg ',' reg
    | "ADD" reg ',' reg ',' reg
    | "CONST" reg ',' expr
    | "NEG" reg ',' reg
    | "NOT" reg ',' reg
    | "LNOT" reg ',' reg
    | "LD" reg ',' reg
    | "ST" reg ',' reg
    | "PUT" reg ',' reg ',' reg
    | "GET" reg ',' reg
    | "JZ" reg ',' reg
    | "PUSH" reg
    | "POP" reg
    | "DZ" reg
    | "SEN"
    | "REN"
    | "WORD" expr
    ;

reg: ('$' | '@') expr
   ;

expr: expr2 ('|' expr2)*
     ;

expr2: expr3 ('^' expr3)*
     ;

expr3: expr4 ('&' expr4)*
     ;

expr4: expr5 (('<' | '>') expr5)*
     ;

expr5: expr6 (('+' | '-') expr6)*
     ;

expr6: expr8 ('*' expr8)*
     ;

expr8:
     | NUMBER
     | WORD
     | '-' expr8
     | '!' expr8
     | '~' expr8
     | '(' expr ')'
     ;

The IFREF construct is optional and will be treated as "extra credit." Your assembler need not support the LOW and HIGH instructions, but its handling of CONST should produce a variable-length encoding whose length depends on the value of the constant. Use conventional multi-pass processing to determine the length of each CONST. You may go either long-to-short or short-to-long, but short-to-long will work better. Allow at least 20 iterations of "pass 1" before reporting failure if symbol values or CONST lengths keep changing; once they stop changing, do a single "pass 2" to generate code. To simplify making multiple passes, buffer the input in a character array with a capacity of at least 64K bytes.
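The short-to-long iteration can be sketched in C roughly as follows. This is a toy model under stated assumptions, not the required design: the program is represented as an array of hypothetical `struct item` records instead of being re-parsed from the buffered text, and the rule in `const_len` (one word if the value fits in a signed 8-bit field, otherwise two) is invented for illustration, since the real CONST encodings are not given here. The point is the loop shape: every CONST starts short, a CONST only ever grows, and "pass 1" repeats until nothing changes or 20 passes have run.

```c
#include <assert.h>

#define MAX_PASS1 20

enum kind { LABEL, INST, CONST_REF };

struct item {
    enum kind kind;
    int sym;  /* LABEL: symbol defined; CONST_REF: symbol referenced */
    int len;  /* CONST_REF: current length in words (only grows) */
};

/* Hypothetical length rule: 1 word for a signed 8-bit value, else 2. */
static int const_len(int value)
{
    return (value >= -128 && value <= 127) ? 1 : 2;
}

/* One "pass 1": recompute symbol values using the current lengths and
   grow any CONST whose value no longer fits. Returns the change count. */
static int pass1(struct item *prog, int n, int *symval)
{
    int lc = 0, changes = 0;
    for (int i = 0; i < n; ++i) {
        switch (prog[i].kind) {
        case LABEL:
            if (symval[prog[i].sym] != lc) {
                symval[prog[i].sym] = lc;
                ++changes;
            }
            break;
        case INST:
            lc += 1;  /* assumed: every other instruction is one word */
            break;
        case CONST_REF: {
            int need = const_len(symval[prog[i].sym]);
            if (need > prog[i].len) {  /* short-to-long: never shrink */
                prog[i].len = need;
                ++changes;
            }
            lc += prog[i].len;
            break;
        }
        }
    }
    return changes;
}

/* Repeat "pass 1" until stable. Returns the number of passes used,
   or -1 if values are still changing after MAX_PASS1 passes. */
static int relax(struct item *prog, int n, int *symval)
{
    assert(prog != 0 && symval != 0);
    for (int pass = 1; pass <= MAX_PASS1; ++pass)
        if (pass1(prog, n, symval) == 0)
            return pass;
    return -1;
}
```

For example, a CONST that refers forward to a label placed after 130 one-word instructions settles in three passes: the first pass discovers the label at address 131, the second grows the CONST to two words and moves the label to 132, and the third sees no change. Because lengths never shrink, the iteration cannot oscillate.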

The code you generate is split into three segments, which you should output to stdout in the order CODE, MONO, and finally POLY. Each segment needs its own location counter and can be assumed never to exceed 64K 32-bit words in length. Thus, your "pass 2" can write values into one array per segment and output only the valid contents of those arrays just before the assembler exits. If you think about it, you can implement both "pass 1" and "pass 2" so that they save output in these arrays, printing the output only after "pass 2." However, warning and error messages should be issued only during "pass 2." The only warning you must handle concerns commas: wherever ',' appears in the grammar but is missing from the input, assume it is present and issue a warning.
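One possible arrangement of the per-segment arrays, location counters, and the pass-2-only comma warning is sketched below in C. Several details are assumptions made for the example rather than requirements: all names (`struct segment`, `emit`, `dump`, `expect_comma`, and so on), CODE being the initially selected segment, the output format of one 8-digit hex word per line, and sending the warning to stderr so that it does not mix with the generated code on stdout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SEG_WORDS 65536u  /* 64K 32-bit words per segment */

/* Enum order matches the required output order. */
enum seg_id { SEG_CODE, SEG_MONO, SEG_POLY, SEG_COUNT };

struct segment {
    uint32_t word[SEG_WORDS];
    uint32_t lc;  /* location counter: index of the next free word */
};

static struct segment seg[SEG_COUNT];  /* static: too big for the stack */
static enum seg_id cur = SEG_CODE;     /* assumed initial segment */
static int pass = 1;                   /* 1 during sizing, 2 for final */
static int warnings = 0;

static void select_segment(enum seg_id id) { cur = id; }

/* Store one word in the current segment and advance its counter. */
static void emit(uint32_t value)
{
    assert(seg[cur].lc < SEG_WORDS);
    seg[cur].word[seg[cur].lc++] = value;
}

/* Call before each pass so every pass starts from address 0. */
static void reset_counters(void)
{
    for (int s = 0; s < SEG_COUNT; ++s) seg[s].lc = 0;
    cur = SEG_CODE;
}

/* Print the valid words of each segment: CODE, then MONO, then POLY.
   The hex-per-line format is a placeholder for the real one. */
static void dump(FILE *out)
{
    for (int s = SEG_CODE; s < SEG_COUNT; ++s)
        for (uint32_t i = 0; i < seg[s].lc; ++i)
            fprintf(out, "%08lx\n", (unsigned long)seg[s].word[i]);
}

/* Accept an optional ','. If it is missing, assume it was there and
   warn, but only during pass 2. Returns the position after the comma. */
static const char *expect_comma(const char *p)
{
    while (*p == ' ' || *p == '\t') ++p;
    if (*p == ',') return p + 1;
    if (pass == 2) {
        fprintf(stderr, "warning: missing ',' assumed\n");
        ++warnings;
    }
    return p;
}
```

A driver would call `reset_counters()` at the start of every pass, let MONO, POLY, and CODE statements call `select_segment`, and call `dump(stdout)` once after "pass 2." Since both passes write into the same arrays, no separate code path is needed for sizing versus generation.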

Submitting Your Project

For your project, you should submit the usual type of tarball, containing:

After you have registered with the server, submit the tarball here:

Which type of student are you?
Undergraduate registered for EE599-004
Graduate registered for EE699-001

Although this is not a secure server, users are bound by the UK code of conduct not to abuse the system. Any abuses will be dealt with as serious offenses.


http://aggregate.org/STCH/ Software Tools for Custom Hardware