Assignment 1: TACKY Encoding And Assembler

Instruction set design is hard. Prof. Dietz has designed dozens of instruction sets in the three decades he's been a professor, and it still isn't easy for him to get things right. Thus, rather than giving you complete freedom to design your own instruction set, we're going to walk through the design logic for a reasonably well-crafted one that he built specifically for Spring 2019 CPE480. However, this design is not complete -- each student must devise their own encoding of the instructions and implement their own assembler using AIK: the Assembler Interpreter from Kentucky (which is also discussed here).

Feeling TACKY?

The instruction set for this semester is a rather strange thing called TACKY: the twin accumulator computer from Kentucky. Basically, it's a 16-bit VLIW machine and the two accumulators are just a sneaky way to compress the encoding of two instructions in one instruction word. Field 0 implicitly works on register $0 and field 1 implicitly works on register $1. The details of the instruction set are presented here.

TACKY is a completely 16-bit machine. Everything is 16 bits wide: instruction words, addresses, data, and each of the eight registers. It isn't even byte-addressed for memory. However, it is a fairly beefy processor in that it even supports floating-point arithmetic. Well, 16-bit floats. In sum, it's small enough to be feasibly implemented inside a single not-too-exotic FPGA... which we will not make you do, but again, is a nice possibility for the future.

TACKY Assembly Language

I'm not going to repeat TACKY's instruction set here. Instead, again I'll simply point you at this TACKY reference. You will need to read and understand that document very thoroughly. Here are some things to keep in mind:

  1. Most types of instruction can be paired in a single instruction word. The assembly language specification thus allows two instructions to be specified on the same line, separated by a comma. For example, add $5, sub $6 needs to be matched as one pattern in the AIK specification you'll create. Honestly, AIK wasn't designed to handle VLIW encoding... but here's how it works. If you have an aliased name, .inst, that can be either add or sub, then you can match the VLIW pair of instructions with an AIK pattern like:
    .inst $.r0 , .inst $.r1
    

    The catch is that for forming the encoded instruction word, AIK only understands one opcode per line -- the first symbol. Thus, whatever the value of the thing matching the first .inst is, that will be available to you for the encoding as the value of .this. Ok, but what about the second .inst? Well, it turns out that if you refer to the value of .inst, you'll get the value associated with the second one. Here's a dumb little example:
    .inst $.r0 , .inst $.r1 := .this:4 .r0:4 .inst:4 .r1:4
    .alias .inst a b c d
    

    Given the assembler input:
    z:	a $5, b $6
            c $3, d $2
            b $4, a $4
    	c $3, 42-39 $2
    

    You'll get output like the following (note that d is 3, so the second and fourth lines are equivalent; AIK doesn't bother checking that the second opcode name is actually an opcode name):
    //generated by AIK version 20180920
    @0000
    0516
    2332
    1404
    2332
    //end
    
  2. In your hardware implementations, you'll have a free choice between two separate memory spaces or one shared memory space for code (.text) and data (.data), but your assembler should provide for both segments with a word size of 16 bits, 0x10000 addresses each, with a default start address of 0, and generating output machine code in Verilog memory image format. It isn't absolutely necessary, but I'd also suggest that you leave .lowfirst as 0 to keep things packing from the most significant bits downward.
  3. Data comes in either of two 16-bit types: 2's complement signed integers or 16-bit (half precision) floating-point values. The AIK assembler understands 16-bit integers and .word can be used to initialize data in memory, but AIK doesn't understand float data. That's ok -- floating-point constants simply need to be entered as integer values with the same bit pattern. Here's a CGI form that lets you enter a floating-point value and will show you the 16-bit integer value that represents it in hexadecimal. For example, 1.0 is represented by 0x3f80. (Note that 0x3f80 is the upper 16 bits of the IEEE single-precision pattern 0x3f800000; IEEE binary16 would encode 1.0 as 0x3c00, so check the TACKY reference for the exact float format.)
  4. Don't forget to define the register names using .const.
  5. There are five "macros" that your assembler should recognize: cf, ci, jnz, jp, and jz. Each of these should be treated as a single pattern generating a 32-bit encoded output, i.e., a sequence of two 16-bit words. Note that you can't reference the other instructions directly in these patterns, so you just need to specify the values for 32 bits worth of fields for each macro.
  6. Some pairings of instructions are not legal. Basically, you can't do more than one load or store in an instruction word, and you can't have more than one write into the same register. For example, neither lf $5, st $3 nor not $5, a2r $0 is allowed. It might be nice to have the assembler complain about those things... but I don't expect you to do that.
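
If you'd rather not use the CGI form from item 3, here's a rough Python sketch of the conversion. It assumes the 16-bit float is simply the upper 16 bits of the IEEE single-precision bit pattern, truncated rather than rounded. That assumption is consistent with the 1.0 = 0x3f80 example, but it is my guess and not something stated here, so check it against the TACKY reference. The function name float_to_word16 is made up for this illustration.

```python
import struct

def float_to_word16(value):
    """Return the upper 16 bits of the IEEE single-precision pattern of value.

    Assumption: the 16-bit float format is the top half of a 32-bit float,
    and the low 16 bits are simply dropped (no rounding).
    """
    (bits32,) = struct.unpack(">I", struct.pack(">f", value))
    return bits32 >> 16

# Print in a form that can be pasted after .word in the assembler input
for v in (1.0, 0.5, -2.0):
    print(".word 0x%04x ; %r" % (float_to_word16(v), v))
```

The printed integers are what you would hand to .word in place of the float constants.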

It really isn't hard to deal with specifying the TACKY assembly language for AIK, but I strongly recommend you try to factor out classes of instructions into single AIK specifications. Otherwise, listing all possible pairs of instructions would produce a huge AIK specification. There are 17 types of pairable instructions, so there would be roughly 17*17 = 289 possible pairings -- even after discounting the illegal ones, that is over 250 different types of paired instructions!
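
To make the factoring argument concrete, here's a small Python sketch (not AIK) that mimics the single rule from the dumb little example in item 1: four 4-bit fields packed from the most significant bits downward. The opcode values a=0 through d=3 are inferred from that example's output, and pack_pair is a name I made up. This is not a suggested TACKY encoding, just an illustration that one rule covers every pairing.

```python
# Hypothetical opcode values mirroring ".alias .inst a b c d" (a=0 ... d=3).
INST = {"a": 0, "b": 1, "c": 2, "d": 3}

def pack_pair(op0, r0, op1, r1):
    """Pack four 4-bit fields into one 16-bit word, most significant first."""
    word = 0
    for field in (op0, r0, op1, r1):
        word = (word << 4) | (field & 0xF)
    return word

# The four assembler input lines from the example; 42-39 is just the value 3.
lines = [
    (INST["a"], 5, INST["b"], 6),
    (INST["c"], 3, INST["d"], 2),
    (INST["b"], 4, INST["a"], 4),
    (INST["c"], 3, 42 - 39, 2),
]
words = [format(pack_pair(*fields), "04x") for fields in lines]
print(words)  # ['0516', '2332', '1404', '2332']

# One factored rule handles every pairing; unfactored, each pair needs a rule.
PAIRABLE = 17  # number of pairable instruction types quoted in the text
print(PAIRABLE * PAIRABLE)  # 289
```

The words match the sample AIK output, which is the point: the rule never needs to know which two instructions were paired, only their values.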

Your Project

Your project is simply to design the instruction set encoding and implement an assembler using AIK. Here's a simple test case:

	.text
	.origin	0x0000
start:	a2r	$2,	add	$3
	and	$4,	cvt	$5
	cf8	$6, here+0
	ci8	$7, here+1
	div	$r0,	jr	$r1
	jnz8	$r2, here+2
	jp8	here+3
	.data		; switch to data segment
	.origin	0x0080
here:	.word	42
	.text		; continue where we left off
	jz8	$r3, here+3
	lf	$r4,	mul	$ra
	li	$rv,	not	$sp
	or	$0,	r2a	$1
	pre	here+4
	sh	$2,	slt	$3
	st	$4,	sub	$5
	sys	here+5
	xor	$6,	a2r	$7

No, the above isn't a useful program. Worse still, I obviously can't show you sample output without giving away how I've encoded the instructions...

Due Dates

The recommended due date for this assignment is before class, Friday, February 15, 2019. This submission window will close when class begins on Monday, February 18, 2019. I do not recommend that you spend Valentine's Day working on this project. You may submit as many times as you wish, but only the last submission that you make before class begins on Monday, February 18, 2019 will be counted toward your course grade.

Note that you can ensure that you get at least half credit for this project by simply submitting a tar of an "implementor's notes" document explaining that your project doesn't work because you have not done it yet. Given that, perhaps you should start by immediately making and submitting your implementor's notes document? (I would!)

Submission Procedure

For each project, you will be submitting a tarball (i.e., a file with the name ending in .tar or .tgz) that contains all things relevant to your work on the project. Minimally, each project tarball includes the source code for the project and a semi-formal "implementor's notes" document as a PDF named notes.pdf. It also may include test cases, sample output, a Makefile, etc., but should not include any files that are built by your Makefile (e.g., no binary executables). For this particular project, name the AIK source file tacky.aik.

Submit your tarball below. The file can be either an ordinary .tar file created using tar cvf file.tar yourprojectfiles or a compressed .tgz file created using tar zcvf file.tgz yourprojectfiles. Be careful about using * as a shorthand in listing yourprojectfiles on the command line, because if the output tar file is listed in the expansion, the result can be an infinite file (which is not ok).

Your account is ... the alphanumeric ID you use with UK stuff, all uppercase
Your password is ... the last 4 digits of (SID+((int)(SID/10000)))
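
In case the password formula reads ambiguously, here's one way to interpret it in Python. The SID below is made up, the function name is mine, and padding the result to four digits with leading zeros is my assumption about what "last 4 digits" means when the remainder is small.

```python
def submission_password(sid):
    """Last 4 digits of SID + (int)(SID/10000), using integer division."""
    total = sid + sid // 10000
    # Assumption: keep leading zeros so the result is always 4 characters
    return "%04d" % (total % 10000)

# Made-up SID for illustration: 12345678 + 1234 = 12346912
print(submission_password(12345678))  # 6912
```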


EE480 Advanced Computer Architecture.