EE380 Assignment 2 Solution

  1. 100%, A particular program expressed in a particular ISA executes 100 ALU instructions, 10 Loads, 8 Stores, and 2 Branches. A simple, non-pipelined implementation of that ISA takes 7 clock cycles for each ALU instruction, 20 for each Load, 10 for each Store, and 10 for each Branch. The original clock frequency is 2 GHz. How many clock cycles would the program take to execute? How many nanoseconds would the program take to execute?
  2. 100%, Given the circumstances described in question 1 above, which of the following three changes would yield at least 2X speedup?
    A new compiler reduces the number of ALU instructions from 100 to 20
    An improved design reduces the CPI for ALU instructions from 7 to 2
    (Note on the option above: "7 to 2" was a typo for "7 to 3", but the option as written is also a correct answer.)
    New VLSI fabrication technology changes the clock period to 1ns
    Any of the above three would suffice to give at least 2X speedup
    None of the above three would suffice to give at least 2X speedup
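The arithmetic behind questions 1 and 2 can be checked with a short script. This is a minimal sketch: the instruction counts, cycle counts, and clock rate come from question 1, while the names `COUNTS`, `CPI`, `total_cycles`, and `speedups` are illustrative choices, not part of the course materials.

```python
# Instruction counts and cycles per instruction, taken from question 1.
COUNTS = {"alu": 100, "load": 10, "store": 8, "branch": 2}
CPI = {"alu": 7, "load": 20, "store": 10, "branch": 10}
BASE_PERIOD_NS = 0.5  # a 2 GHz clock has a 0.5 ns period


def total_cycles(counts, cpi):
    """Total cycles = sum over instruction classes of count * cycles each."""
    return sum(counts[kind] * cpi[kind] for kind in counts)


def exec_time_ns(counts, cpi, period_ns):
    """Execution time = total cycles * clock period."""
    return total_cycles(counts, cpi) * period_ns


base_cycles = total_cycles(COUNTS, CPI)
base_ns = exec_time_ns(COUNTS, CPI, BASE_PERIOD_NS)

# Speedup = old time / new time for each proposed change in question 2.
speedups = {
    "compiler: ALU count 100 -> 20":
        base_ns / exec_time_ns({**COUNTS, "alu": 20}, CPI, BASE_PERIOD_NS),
    "design: ALU CPI 7 -> 2":
        base_ns / exec_time_ns(COUNTS, {**CPI, "alu": 2}, BASE_PERIOD_NS),
    "design: ALU CPI 7 -> 3 (intended wording)":
        base_ns / exec_time_ns(COUNTS, {**CPI, "alu": 3}, BASE_PERIOD_NS),
    "VLSI: clock period -> 1 ns":
        base_ns / exec_time_ns(COUNTS, CPI, 1.0),
}

if __name__ == "__main__":
    print(f"baseline: {base_cycles} cycles, {base_ns} ns")
    for change, s in speedups.items():
        print(f"{change}: {s:.2f}X")
```

The baseline works out to 1000 cycles, or 500 ns. Note that a 1 ns clock period is slower than the original 0.5 ns period, so that change is a slowdown rather than a speedup.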
  3. 100%, What is a synthetic benchmark and why might such a thing be useful?
  4. 100%, Given this processor hardware design and the control sequence below, describe in words (or C-like pseudocode) the function of the instruction xyzzy immed(rt).
    when op(0x3f) op(1) Xyzzy
    
    Start:
     PCout, MARin, MEMread, Yin
     CONST(4), ALUadd, Zin, UNTILmfc
     MDRout, IRin
     Zout, PCin, JUMPonop
     HALT /* Should end here on undecoded op */
    
    Xyzzy:
     SELrt, REGout, Yin
     IRimmedout, ALUadd, Zin
     Zout, MARin, MEMread
     CONST(-1), Yin, UNTILmfc
     MDRout, ALUxor, Zin
     Zout, MDRin, MEMwrite, JUMP(Start)
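
One way to read the Xyzzy states is as a read-modify-write that complements a memory word in place: the address is rt + immed, and XOR with -1 flips every bit. The following is a rough behavioral model, not the simulator itself. It assumes 32-bit words and models memory as a Python dict; the function name `xyzzy` and its parameters are hypothetical.

```python
WORD_MASK = 0xFFFFFFFF  # assumption: 32-bit words and addresses


def xyzzy(regs, mem, rt, immed):
    """Behavioral sketch of xyzzy immed(rt): mem[rt + immed] = ~mem[rt + immed]."""
    # SELrt, REGout, Yin / IRimmedout, ALUadd, Zin: Z = regs[rt] + immed
    addr = (regs[rt] + immed) & WORD_MASK
    # Zout, MARin, MEMread / ... UNTILmfc: MDR = mem[addr]
    value = mem.get(addr, 0)
    # CONST(-1), Yin / MDRout, ALUxor, Zin: Z = MDR xor -1 (bitwise complement)
    result = (value ^ -1) & WORD_MASK
    # Zout, MDRin, MEMwrite: mem[addr] = Z
    mem[addr] = result
    return result


if __name__ == "__main__":
    regs = [0] * 32
    regs[5] = 0x100
    mem = {0x104: 0x0F0F0F0F}
    xyzzy(regs, mem, rt=5, immed=4)
    print(hex(mem[0x104]))
```

The register file is never written, so the only architectural effect is the updated memory word.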
    

  5. 100%, Given the xyzzy immed(rt) instruction as defined above, and assuming that a memory load request takes 4 clock cycles to complete (after MEMread has been issued), how many CPI would each xyzzy instruction require? You may use the simulator to get or check your answer. In any case, give and briefly explain your answer here:
  6. 100%, Given this processor hardware design, suppose that the following control state is the limiting factor in determining the maximum clock speed. Given that the propagation delay associated with SELrs is 8 ns, REGout is 4 ns, MDRin is 2 ns, ALUadd is 16 ns, and Zin is 1 ns, what is the period (in nanoseconds) of the fastest allowable clock? You may use the simulator to get or check your answer. In any case, give and briefly explain your answer here:
    SELrs, REGout, MDRin, ALUadd, Zin
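
The minimum clock period is the longest chain of dependent delays within the state, not the sum of all five. The sketch below assumes a particular dependency structure, since the hardware diagram is not reproduced here: SELrs picks the register, REGout drives it onto the bus, and the bus then feeds MDRin and the ALU in parallel, with Zin latching the ALU result. The `FEEDS` graph and the function name are illustrative assumptions.

```python
from functools import lru_cache

# Propagation delays in nanoseconds, from the question.
DELAY_NS = {"SELrs": 8, "REGout": 4, "MDRin": 2, "ALUadd": 16, "Zin": 1}

# Assumed signal dependencies (not confirmed by the text):
# SELrs -> REGout -> bus -> {MDRin, ALUadd -> Zin}
FEEDS = {
    "SELrs": ["REGout"],
    "REGout": ["MDRin", "ALUadd"],
    "ALUadd": ["Zin"],
    "MDRin": [],
    "Zin": [],
}


@lru_cache(maxsize=None)
def longest_path_ns(signal):
    """Delay of the slowest chain that starts at `signal`."""
    tail = max((longest_path_ns(nxt) for nxt in FEEDS[signal]), default=0)
    return DELAY_NS[signal] + tail


min_period_ns = longest_path_ns("SELrs")

if __name__ == "__main__":
    print(f"fastest allowable clock period: {min_period_ns} ns")
```

Under these assumptions the critical path is SELrs, REGout, ALUadd, Zin, giving 8 + 4 + 16 + 1 = 29 ns. MDRin overlaps with the ALU path and does not lengthen it.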
    

  7. 100%, A particular program consists of two functions, a() and b(). Initially, a() takes 750 clock cycles and b() takes 250 clock cycles. What is the maximum possible overall speedup that could be obtained by making changes that only affect the execution speed of a()?
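
This is an Amdahl's law question: the limit is reached when the time for a() shrinks toward zero while b() is unchanged. A minimal sketch follows, with hypothetical function names; the cycle counts are the ones given in the question.

```python
def overall_speedup(a_cycles, b_cycles, a_speedup):
    """Overall speedup when only a() runs `a_speedup` times faster."""
    old_total = a_cycles + b_cycles
    new_total = a_cycles / a_speedup + b_cycles
    return old_total / new_total


def max_overall_speedup(a_cycles, b_cycles):
    """Limit as a() takes no time at all: only b() remains."""
    return (a_cycles + b_cycles) / b_cycles


if __name__ == "__main__":
    for factor in (2, 10, 1000):
        print(f"a() {factor}X faster: {overall_speedup(750, 250, factor):.3f}X overall")
    print(f"upper bound: {max_overall_speedup(750, 250)}X")
```

With 750 cycles in a() and 250 in b(), the bound is 1000 / 250 = 4X, no matter how much a() is improved.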


EE380 Computer Organization and Design.