Instruction set design is hard. Prof. Dietz has designed dozens of instruction sets in the three decades he's been a professor, and it still isn't easy for him to get things right. Thus, rather than giving you complete freedom to design your own instruction set, we're going to walk through the design logic for a reasonably well-crafted one that he built specifically for Fall 2017 EE480. However, this design is not complete -- each student must devise their own encoding of the instructions and implement their own assembler.
"Logick" is the archaic spelling of "logic" in English. However, this isn't really an archaic design... the name is because it supports Logarithmic Number System (LNS) arithmetic -- ick! LNS is a very useful alternative to the floating-point arithmetic you know, using a simpler structure than IEEE floats to deliver higher accuracy, while making multiply and divide simple and fast. The catch is that add and subtract are a bit of a nightmare in LNS. Oh well... there really isn't any free lunch. Anyway, Logick is what our target machine will be for Fall 2017 EE480.
The machine has sixteen 16-bit registers, 16-bit datapaths, and 16-bit addresses, and each address in memory holds one 16-bit word. It can operate on both 16-bit integers and 16-bit LNS values. Thus, data memory (i.e., the .data segment) looks like an array of 16-bit words in which adding 1 to an address gets you the next 16-bit word. Instruction memory (i.e., the .text segment) works the same way: each instruction is 16 bits long and adding 1 to an instruction address gets you the next 16-bit instruction. For simulation purposes, you should assume each of the .data and .text segments can hold 65536 words.
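To see why LNS makes multiply and divide easy but add and subtract painful, here is a minimal sketch. It assumes a value is held as a sign flag plus log2 of its magnitude, stored in ordinary Python floats; this is not Logick's actual 16-bit lognum encoding, and the helper names (`to_lns`, `from_lns`, `lns_mul`, `lns_div`, `lns_add`) are hypothetical.

```python
import math

def to_lns(x):
    """Represent nonzero x as (is_negative, log2|x|). Zero is not modeled here."""
    if x == 0:
        raise ValueError("zero needs a special encoding in LNS")
    return (x < 0, math.log2(abs(x)))

def from_lns(v):
    """Convert (is_negative, log2|x|) back to an ordinary float."""
    neg, e = v
    mag = 2.0 ** e
    return -mag if neg else mag

def lns_mul(a, b):
    # Multiply: XOR the signs, add the logs. One integer-style add in hardware.
    return (a[0] != b[0], a[1] + b[1])

def lns_div(a, b):
    # Divide: XOR the signs, subtract the logs.
    return (a[0] != b[0], a[1] - b[1])

def lns_add(a, b):
    # Add: needs log2(1 +/- 2**d), which hardware must approximate somehow
    # (for example with tables). This is the hard part.
    if a[1] < b[1]:
        a, b = b, a                # make a the larger magnitude
    d = b[1] - a[1]                # d <= 0
    if a[0] == b[0]:
        return (a[0], a[1] + math.log2(1.0 + 2.0 ** d))
    if d == 0:
        raise ValueError("exact cancellation gives zero, not modeled here")
    return (a[0], a[1] + math.log2(1.0 - 2.0 ** d))
```

Multiply and divide never leave the log domain, while add has to evaluate a nonlinear function of the difference between the two logs.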
This instruction set is complete enough that I hope to be giving you a compiler (including full C source code) that translates programs written in a significant subset of C into Logick code. It's not a particularly smart compiler (ok, it's really dumb), but it will show you how Logick can be used for complete programs.
Logick's instruction set is quite straightforward: a general-register model that encodes each instruction as a single 16-bit word. Although implementing the LNS operations isn't very familiar (ok, it's actually really complex for add and subtract), the bulk of the operations are really quite ordinary -- and so is the assembly language:
Instruction | Description | Functionality |
---|---|---|
ad $d, $s, $t | ADd integers | $d = $s + $t |
al $d, $s, $t | Add Log numbers | $d = $s + $t |
an $d, $s, $t | bitwise ANd integers | $d = $s & $t |
br c, lab | BRanch conditionally to label (encode lab as 8-bit lab-(PC+1)) | if c then PC = lab |
cl $s, $t | Compare Log numbers | condition codes set by $s vs. $t |
co $s, $t | COmpare integers | condition codes set by $s vs. $t |
dl $d, $s, $t | Divide Log numbers | $d = $s / $t |
eo $d, $s, $t | bitwise Exclusive Or integers | $d = $s ^ $t |
jr c, $d | Jump conditionally to Register | if c then PC = $d |
li $d, i8 | Load Immediate 8-bit integer | $d = signed_extend(i8) |
lo $d, $s | LOad integer | $d = memory[$s] |
ml $d, $s, $t | Multiply Log numbers | $d = $s * $t |
mi $d, $s | MInus | $d = - $s |
nl $d, $s | Negate Log number | $d = - $s |
no $d, $s | logical NOt (zero becomes 1, non-zero becomes 0) | $d = ! $s |
or $d, $s, $t | bitwise OR integers | $d = $s \| $t |
si $d, i8 | Shift In 8-bit integer | $d = (($d) << 8) \| (i8 & 0xff) |
sr $d, $s, $t | Shift Right signed integers | $d = $s >> $t |
st $d, $s | STore integer | memory[$s] = $d |
sy | SYstem (system call; end execution) | halt |
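The li and si rows in the table combine to build any 16-bit constant: li loads a sign-extended byte, and si shifts in another byte below it. The sketch below models just those two table entries on a single register value; `load_const` and `run` are hypothetical helpers, not part of any provided tool.

```python
MASK16 = 0xFFFF

def li(i8):
    """li: $d = sign-extended 8-bit immediate, as a 16-bit pattern."""
    i8 &= 0xFF
    return (i8 | 0xFF00) if i8 & 0x80 else i8

def si(d, i8):
    """si: $d = ($d << 8) | (i8 & 0xff), truncated to 16 bits."""
    return ((d << 8) | (i8 & 0xFF)) & MASK16

def load_const(value):
    """Return the shortest li/si sequence that leaves `value` in a register."""
    value &= MASK16
    hi, lo = value >> 8, value & 0xFF
    if li(lo) == value:            # fits in a sign-extended byte
        return [("li", lo)]
    return [("li", hi), ("si", lo)]

def run(seq):
    """Execute a list of (mnemonic, immediate) pairs on one register."""
    d = 0
    for op, imm in seq:
        d = li(imm) if op == "li" else si(d, imm)
    return d
```

For example, `load_const(42)` and `load_const(-2)` each need only an li, while `load_const(0x0100)` needs an li followed by an si.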
Determining how to encode the above instructions as bit patterns is a key part of your project. However, there are a few rules:
The Logick processor has two different types of registers: general-purpose registers and condition code registers.
There are 16 general-purpose registers, some of which have special purposes -- a lot like MIPS. They all have names as well as numbers. Perhaps the best way to give both is the following specification (formatted as an AIK specification):
.const {zero sp fp ra rv u10 u9 u8 u7 u6 u5 u4 u3 u2 u1 u0 }
Registers $u10 through $u0 (aka, registers $5 through $15) are "user" registers to be used in any way the programmer sees fit. However, it is expected that the assembler or compiler would use registers starting at $u10 for "internal" things and starting at $u0 for normal coding. The first five registers have special meanings:
Register Number | Register Name | Read/Write? | Use |
---|---|---|---|
$0 | $zero | Read Only | ZERO; constant 0x0000 |
$1 | $sp | Read/Write | the Stack Pointer |
$2 | $fp | Read/Write | the Frame Pointer |
$3 | $ra | Read/Write | the Return Address |
$4 | $rv | Read/Write | the Return Value |
Note that use of $0 (or $zero) as the destination for a result is illegal. Thus, if you wish, you could use those bit patterns for other things. In other words, ad $0, ..., ... is illegal, so you could use that bit pattern for something else, such as sy or even mi $d, $s. You'll have to be slightly clever to cram all these instruction formats into a structure that can't always reserve more than 4 bits for the opcode, but there are lots of ways to solve this problem. Simpler ways are better. :-)
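As a rough illustration of the bit budget, the sketch below packs a three-register instruction into 16 bits. The field layout (4-bit opcode followed by three 4-bit register fields) and the opcode value used in the example are purely hypothetical; choosing the real encoding is your job. Only the register names and numbers come from the specification above.

```python
# Register names in numeric order, taken from the .const specification.
REGS = "zero sp fp ra rv u10 u9 u8 u7 u6 u5 u4 u3 u2 u1 u0".split()

def reg(name):
    """Map '$sp' or '$1' to a register number 0..15."""
    name = name.lstrip("$")
    n = int(name) if name.isdigit() else REGS.index(name)
    if not 0 <= n <= 15:
        raise ValueError("no such register: " + name)
    return n

def pack3(opcode, d, s, t):
    """Hypothetical layout: opcode[15:12] d[11:8] s[7:4] t[3:0]."""
    if not 0 <= opcode <= 15:
        raise ValueError("opcode must fit in 4 bits")
    if d == 0:
        # $zero as a destination is illegal, so these patterns are free
        # to be reused for other instructions.
        raise ValueError("destination may not be $zero")
    return (opcode << 12) | (d << 8) | (s << 4) | t
```

With a made-up opcode of 1 for ad, `pack3(1, reg("$1"), reg("$2"), reg("$3"))` gives 0x1123. Every word of the form 0x10st is then unused by ad and available for something else.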
There appear to be eight 1-bit condition registers that can only be written by executing a compare instruction (co for integers or cl for log numbers). They are:
f lt le eq ne ge gt t
These names (or the values they imply) are to be used in the br and jr instructions to indicate the desired condition. For example, to branch to place only if the condition codes indicate "Greater than or Equal to" is true, you would write the instruction as br ge, place.
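The table says a br target is encoded as the 8-bit value lab-(PC+1). A minimal sketch of that calculation follows; it assumes the 8-bit field is a two's complement signed offset (which backward branches require) and that an out-of-range target should be reported as an error. The function name `branch_offset` is hypothetical.

```python
def branch_offset(label_addr, pc):
    """Return the 8-bit field for br at address pc targeting label_addr."""
    off = label_addr - (pc + 1)
    if not -128 <= off <= 127:     # assumed signed 8-bit range
        raise ValueError("branch target out of range")
    return off & 0xFF              # two's complement bit pattern
```

For a br at address 3 that targets address 0, the offset is 0 - 4 = -4, so the field holds 0xFC.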
However, you are free to choose which values encode each of these eight conditions. You could encode each as a 3-bit value from 0 to 7 and then simply use the condition value to index the appropriate one of the eight condition code registers. Then again, there don't need to be eight condition code registers at all. All the conditions listed above can be derived from just two 1-bit condition code registers: one set for "Less Than" and one set for "Greater Than." For example, the eq condition would then check that both "Less Than" and "Greater Than" are 0.
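The two-register scheme can be sketched as follows. This models only the logic, not any particular hardware or encoding, and the names `set_flags` and `condition` are hypothetical.

```python
def set_flags(s, t):
    """Model a compare: return (lt, gt). Both are False when s == t."""
    return (s < t, s > t)

def condition(name, lt, gt):
    """Derive any of the eight named conditions from the lt and gt bits."""
    table = {
        "f":  False,
        "lt": lt,
        "le": not gt,              # less than or equal: not greater
        "eq": not lt and not gt,
        "ne": lt or gt,
        "ge": not lt,              # greater than or equal: not less
        "gt": gt,
        "t":  True,
    }
    return table[name]
```

Because a compare can never set both bits at once, le and ge each reduce to testing a single bit.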
The plan is for the Logick C-subset compiler to understand four different base data types: char, short, int, and lognum. All of those data types are treated as being signed. It should come as no surprise that char, short, and int are really fully equivalent data types: each is a 16-bit value encoded in 2's complement. A lognum is also encoded in a 16-bit word, although it has a somewhat different internal structure that allows it to behave a lot like a signed floating-point value. The thing you need to know is that the Logick assembler does not need to understand how a lognum is encoded. In other words, a lognum constant in Logick assembly language is written as the integer value that produces the desired 16-bit bit pattern.
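Since every constant reaches the assembler as an integer, the only conversion involved is between a signed value and its 16-bit pattern. A small sketch of that mapping is below; the helper names `to_word16` and `from_word16` are hypothetical, and accepting unsigned values up to 0xFFFF as well as negative ones is an assumption.

```python
def to_word16(value):
    """Return the 16-bit pattern for an integer constant."""
    if not -0x8000 <= value <= 0xFFFF:
        raise ValueError("constant does not fit in 16 bits")
    return value & 0xFFFF

def from_word16(word):
    """Interpret a 16-bit pattern as a signed 2's complement integer."""
    return word - 0x10000 if word & 0x8000 else word
```

So -1 and 0xFFFF describe the same word, whether that word is later treated as a char, short, int, or lognum.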
In your hardware implementations, you'll have a free choice to have two separate or one shared memory space for code (.text) and data (.data), but your assembler should provide for both segments with a word size of 16 bits, 0x10000 addresses each with a default start address of 0, and generating output machine code in Verilog memory image format. Also be sure to force .lowfirst to be 0 so that bits are packed into 16-bit words starting with the MSB working down.
Your project is simply to design the instruction set encoding and implement an assembler using AIK. Here's a simple test case:
    .text
    .origin 0x0000
    start: ad $1,$2,$3
    al $4,$5,$6
    an $sp,$fp,$zero
    br lt,start
    cl $ra,$rv
    co $u0,$u1
    dl $u2,$u3,$u4
    eo $u5,$u6,$u7
    jr ge,$u8
    li $u9,-1
    lo $u10,$u0
    .data
    .origin 0x0100
    fluff: .word 42
    .text ; continue where we left off
    ml $u0,$u1,$u2
    mi $7,$8 ; was ne instruction
    nl $9,$10
    no $11,$12
    or $13,$14,$15
    si $u0,42
    sr $u0,$u0+1,$0
    st $u0,$u1
    sy
    la $1,42 ; just an li
    la $1,fluff ; forces li, si
    la $1,-2 ; just an li
No, the above isn't a useful program. Worse still, I obviously can't show you sample output without giving-away how I've encoded the instructions....
The recommended due date for this assignment is before class, Friday, September 22, 2017. This submission window will close when class begins on Monday, September 25, 2017. You may submit as many times as you wish, but only the last submission that you make before class begins on Monday, September 25, 2017 will be counted toward your course grade. The deadline has now been extended by one class period -- to before class Wednesday, September 27, 2017 -- because of the accidental conflict involving use of ne for both negate and "not equal"; the instruction is now called mi (for minus).
Note that you can ensure that you get at least half credit for this project by simply submitting a tar of an "implementor's notes" document explaining that your project doesn't work because you have not done it yet. Given that, perhaps you should start by immediately making and submitting your implementor's notes document? (I would!)
For each project, you will be submitting a tarball (i.e., a file with the name ending in .tar or .tgz) that contains all things relevant to your work on the project. Minimally, each project tarball includes the source code for the project and a semi-formal "implementor's notes" document as a PDF named notes.pdf. It also may include test cases, sample output, a Makefile, etc., but should not include any files that are built by your Makefile (e.g., no binary executables). For this particular project, name the AIK source file logick.aik.
Submit your tarball below. The file can be either an ordinary .tar file created using tar cvf file.tar yourprojectfiles or a compressed .tgz file created using tar zcvf file.tgz yourprojectfiles. Be careful about using * as a shorthand in listing yourprojectfiles on the command line, because if the output tar file is listed in the expansion, the result can be an infinite file (which is not ok).