Consider a basic in-order pipeline with bypassing (one instruction in each pipeline stage in any cycle). The pipeline has been extended to handle FP add. Assume the following delays between dependent instructions:
- Load feeding any instruction: 3 stall cycles
- FP ALU feeding any instruction (except stores): 5 stall cycles
- FP ALU feeding store: 4 stall cycles
- Int add feeding a branch: 2 stall cycles
- Int add feeding any other instruction: 1 stall cycle
- A conditional branch has 1 delay slot (an instruction is fetched in the cycle after the branch without knowing the outcome of the branch and is executed to completion)
Below is the source code and default assembly code for a loop.
Source Code: for (i=1000; i>0; i--) { w[i] = x[i] + y[i] + z[i]; }
Assembly Code:
Loop: L.D    F1, 0(R2)     ; Get x[i]
      L.D    F2, 0(R3)     ; Get y[i]
      L.D    F3, 0(R4)     ; Get z[i]
      ADD.D  F4, F2, F1    ; Add two numbers
      ADD.D  F5, F3, F4    ; Add the third number
      S.D    F5, 0(R5)     ; Store the result into w[i]
      DADDUI R2, R2, #-8   ; Decrement R2
      DADDUI R3, R3, #-8   ; Decrement R3
      DADDUI R4, R4, #-8   ; Decrement R4
      DADDUI R5, R5, #-8   ; Decrement R5
      BNE    R2, R1, Loop  ; Check if we've reached the end of the loop
      NOP
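The stall rules above can be applied mechanically to the default instruction order. The following is a minimal sketch, not a definitive solution: it assumes that "k stall cycles" means a dependent instruction can issue no earlier than k + 1 cycles after its producer, that one instruction issues per cycle in program order, and that dependences carried from the previous iteration are ignored. The names `stalls`, `schedule`, and `DEFAULT_LOOP` are hypothetical helpers introduced only for this illustration.

```python
def stalls(producer_kind, consumer_kind):
    """Stall cycles between a producer and a dependent consumer (from the list above)."""
    if producer_kind == "load":
        return 3
    if producer_kind == "fp":
        return 4 if consumer_kind == "store" else 5
    if producer_kind == "int":
        return 2 if consumer_kind == "branch" else 1
    return 0

# (label, kind, destination register or None, source registers)
DEFAULT_LOOP = [
    ("L.D F1",    "load",   "F1", ["R2"]),
    ("L.D F2",    "load",   "F2", ["R3"]),
    ("L.D F3",    "load",   "F3", ["R4"]),
    ("ADD.D F4",  "fp",     "F4", ["F2", "F1"]),
    ("ADD.D F5",  "fp",     "F5", ["F3", "F4"]),
    ("S.D F5",    "store",  None, ["F5", "R5"]),
    ("DADDUI R2", "int",    "R2", ["R2"]),
    ("DADDUI R3", "int",    "R3", ["R3"]),
    ("DADDUI R4", "int",    "R4", ["R4"]),
    ("DADDUI R5", "int",    "R5", ["R5"]),
    ("BNE R2,R1", "branch", None, ["R2", "R1"]),
    ("NOP",       "nop",    None, []),
]

def schedule(program):
    """Return a list of (issue cycle, label) for one iteration, in program order."""
    last_writer = {}   # register -> (issue cycle, kind) of its latest producer
    cycle = 0
    issued = []
    for label, kind, dest, sources in program:
        earliest = cycle + 1   # in-order: at least one cycle after the previous instruction
        for reg in sources:
            if reg in last_writer:
                producer_cycle, producer_kind = last_writer[reg]
                earliest = max(earliest,
                               producer_cycle + stalls(producer_kind, kind) + 1)
        cycle = earliest
        issued.append((cycle, label))
        if dest is not None:
            last_writer[dest] = (cycle, kind)
    return issued

if __name__ == "__main__":
    for cycle, label in schedule(DEFAULT_LOOP):
        print(f"cycle {cycle:2d}: {label}")
```

Reordering the tuples in `DEFAULT_LOOP` gives a quick way to compare candidate schedules under the same assumptions; the cycle of the last entry is the iteration length for that order.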
a) (4 Points) Show the schedule (what instruction issues in what cycle) for the default code. How many cycles per iteration in this case?
b) (6 Points) How should the compiler order instructions to minimize stalls (without unrolling)? Note that the execution of a NOP instruction is effectively a stall. Show the schedule. How many cycles per iteration in this case? How many cycles can you save per iteration, compared to the default schedule? Assume that R1, R2, R3, R4, R5 have all been initialized appropriately before entering the loop.
c) (6 Points) How many times must the loop be unrolled to eliminate stall cycles?
Show the schedule for the unrolled code. How many cycles per iteration in this
case? How many cycles can you save per iteration, compared to the default
schedule?
d) (10 Points) Come up with a software-pipelined version of the code (no unrolling)
(you need not show the schedule, but you do need to use appropriate register
names and displacements).
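To illustrate what software pipelining means for this loop, here is a high-level sketch in Python rather than in assembly. It is one possible staging, not necessarily the intended answer: each steady-state iteration stores the result for element i, performs the second add for element i-1, and performs the first add for element i-2. The temporaries `p` and `s` and both function names are hypothetical, the arrays are assumed to be indexed 1..n with index 0 unused, and n is assumed to be at least 2. Register names and displacements, which part (d) asks for, are not shown here.

```python
def original_loop(x, y, z, n):
    """Reference version of the source loop: w[i] = x[i] + y[i] + z[i]."""
    w = [0] * (n + 1)
    for i in range(n, 0, -1):
        w[i] = x[i] + y[i] + z[i]
    return w

def software_pipelined_loop(x, y, z, n):
    """Same computation with work from three different elements in each iteration."""
    assert n >= 2, "this sketch assumes at least two elements"
    w = [0] * (n + 1)

    # Prologue: fill the pipeline.
    p = x[n] + y[n]            # first add for element n
    s = z[n] + p               # second add for element n
    p = x[n - 1] + y[n - 1]    # first add for element n-1

    # Steady state: store (i), second add (i-1), first add (i-2).
    for i in range(n, 2, -1):
        w[i] = s
        s = z[i - 1] + p
        p = x[i - 2] + y[i - 2]

    # Epilogue: drain the pipeline.
    w[2] = s
    s = z[1] + p
    w[1] = s
    return w
```

The point of the staging is that no operation in the steady-state body depends on a value produced earlier in the same iteration, so the long load-to-add and add-to-store delays are covered by work from other elements.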