Spring 2021 EE380 Assignment 5 Solution

  1. For this question, check all that apply. Which of the following things would you expect to find within the processor chip for a multi-core laptop processor?
    TLB
    L1 Cache
    L2 Cache
    L3 Cache
    Main memory
    Not in a laptop, but in some SOCs (system on a chip)
  2. For this question, check all that apply. Which of the following statements about the memory hierarchy are true?
    Larger capacity caches tend to be slower
    Larger cache line sizes take better advantage of Spatial Locality
    Modern processors often have separate caches for instructions and data
    Temporal Locality refers to an object being likely to be referenced again soon after being referenced once
    For comparable cache size, a direct mapped cache is easier to build (simpler logic) than set associative cache
  3. For this question, check all that apply. Remember this diagram of the AMD Athlon? According to the diagram, which of the following techniques is used in this design?

    Fully associative cache
    Nothing here suggests that
    History-based branch prediction
    Separate L1 caches for code and data
    Superscalar execution of integer arithmetic
    Instruction scheduling with register renaming
    As discussed in class, the renaming is obvious given the huge number of FP registers
  4. Suppose that a simple system has a single cache with an access time of 1 clock cycle. Cache misses are satisfied with an average memory latency of 200 clock cycles. Assuming a cache hit ratio of about 0.99 (99%), roughly how long does the average reference take? Show the formula that would give the answer.
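A minimal sketch of the usual average-access-time formula, assuming the common convention that every reference pays the cache access time and a miss additionally pays the memory latency: average = hit time + (1 - hit ratio) x miss latency. The function and parameter names are illustrative and not taken from the course materials. Weighting the hit time by the hit ratio instead (0.99 x 1 + 0.01 x 200) gives 2.99, which is still roughly 3 cycles.

```c
#include <assert.h>

/* Average cycles per memory reference for a single-level cache.
 * Assumption: every reference pays hit_cycles, and a miss pays
 * miss_cycles on top of that. */
double average_access_cycles(double hit_cycles, double hit_ratio,
                             double miss_cycles)
{
    assert(hit_ratio >= 0.0 && hit_ratio <= 1.0);
    double miss_ratio = 1.0 - hit_ratio;
    return hit_cycles + miss_ratio * miss_cycles;
}
```

With the numbers from the question, average_access_cycles(1.0, 0.99, 200.0) evaluates to about 3 clock cycles, so the rare misses triple the average cost of a reference.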
  5. Given how modern memory systems work, and assuming N is a big number, which of the following would you expect to run faster, or would they be about the same? Be sure to explain your reasoning. Choice A:
    struct { int a, b, c; } abc[N];
    for (int i=0; i<N; ++i) { abc[i].a = 0; }
    

    Or Choice B:
    int a[N]; int b[N]; int c[N];
    for (int i=0; i<N; ++i) { a[i] = 0; }
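
One way to reason about the two choices is to count how many cache lines each loop touches. The sketch below does that under stated assumptions that are not in the question: 64-byte cache lines, a 4-byte int, a 12-byte struct with no padding, and arrays that start on a line boundary. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count distinct cache lines touched when visiting n elements that are
 * stride_bytes apart, starting at a line-aligned base address.
 * Assumption: each access fits entirely inside one cache line. */
size_t cache_lines_touched(size_t n, size_t stride_bytes, size_t line_bytes)
{
    assert(line_bytes > 0);
    size_t count = 0;
    size_t last_line = 0;
    for (size_t i = 0; i < n; ++i) {
        size_t line = (i * stride_bytes) / line_bytes;
        /* Offsets only grow, so a new line index means a new line. */
        if (i == 0 || line != last_line) {
            ++count;
            last_line = line;
        }
    }
    return count;
}
```

For 1024 elements, Choice A (stride 12) touches 192 lines while Choice B (stride 4) touches 64. Under these assumptions Choice A moves three times as much memory to zero the same number of ints, which is why one would typically expect Choice B to run faster when N is large.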
    

  6. Which one of the following three I/O mechanisms would be most appropriate for a desktop PC to use in reading keystrokes from a keyboard?
    Polling
    Wastes too much processor time
    Interrupts
    DMA
    Not much data to move, so no need for DMA
  7. For this question, mark all answers that apply.
    Which of the following statements about the memory hierarchy are true?
    The address used to search the L2 cache is usually a physical memory address
    L1 and lower are usually physical addresses
    It is possible to suffer a TLB miss for a reference to a datum that is already in cache
    Fewer TLB entries than cache buckets implies this is possible
    If a program repeatedly accesses the same few variables, it has high temporal locality
    Pretty much the definition...
    LRU is a common replacement policy; it replaces the line that hasn't been accessed for the longest time
    Often an LRU approximation, but yes
    The content of a dirty line is potentially different from that of the same address in lower levels of the memory hierarchy
    Again, pretty much the definition

  8. For this question, mark all answers that apply according to the following MIPS pipeline diagram:

    Consider executing the following MIPS code sequence:

    A:	andi	$t1, $t0, 47
    B:	and	$t3, $t2, $t1
    C:	andi	$t4, $t0, 1541
    D:	sw	$t4, 3276($t5)
    E:	xor	$t0, $t5, $t2
    F:	lw	$t1, 6356($t5)
    

    This code is to be executed on a pipelined MIPS implementation like that shown in the reference diagram. Unless stated otherwise, assume value forwarding is not implemented. Which of the following statements are true?
    There is a true dependence (RAW) between instructions A and B
    On $t1
    There is an output dependence (WAW) between instructions D and F
    Nope - they don't write the same registers or memory locations (sw writes memory, lw writes $t1)
    Adding value forwarding to the pipeline would result in no pipeline bubbles for this code
    Only a lw followed immediately by a use of its result is a problem forwarding doesn't fix, and nothing here uses the result of F
    Without value forwarding, the code would execute in less time if instruction C were moved to between A and B
    Remember that dependence between A and B?
    As written, instruction E couldn't move to before C, but it could if we renamed register $t0 with $t6 in instruction E
    Classic fix for WAR
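
The register dependences discussed above can be checked mechanically. The following is a simplified sketch, not part of the course materials: the struct, enum, and function names are hypothetical, each instruction is modeled with at most one destination and two source registers, and memory dependences are ignored.

```c
#include <assert.h>

/* MIPS register numbers for the temporaries used in the example. */
enum { NO_REG = -1, T0 = 8, T1, T2, T3, T4, T5, T6 };

/* Bit flags so that one pair of instructions can report several kinds. */
enum { DEP_RAW = 1, DEP_WAR = 2, DEP_WAW = 4 };

/* Hypothetical, simplified view of an instruction's register usage. */
struct instr {
    int dest;   /* register written, or NO_REG (e.g. for sw) */
    int src1;   /* first register read, or NO_REG */
    int src2;   /* second register read, or NO_REG */
};

static int reads(const struct instr *in, int reg)
{
    return reg != NO_REG && (in->src1 == reg || in->src2 == reg);
}

/* Return a bitmask of register dependences from earlier to later. */
int register_dependences(const struct instr *earlier,
                         const struct instr *later)
{
    int deps = 0;
    if (reads(later, earlier->dest))
        deps |= DEP_RAW;   /* later reads what earlier writes */
    if (reads(earlier, later->dest))
        deps |= DEP_WAR;   /* later overwrites what earlier reads */
    if (earlier->dest != NO_REG && earlier->dest == later->dest)
        deps |= DEP_WAW;   /* both write the same register */
    return deps;
}
```

Encoding A as {T1, T0, NO_REG} and B as {T3, T2, T1} reports DEP_RAW. C as {T4, T0, NO_REG} against E as {T0, T5, T2} reports DEP_WAR, and changing E's destination to T6 makes that result 0, which matches the renaming fix. D as {NO_REG, T4, T5} against F as {T1, T5, NO_REG} reports no register dependence.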


EE380 Computer Organization and Design.