Structural (or resource) hazards occur when the system does not have sufficient resources to handle a particular
combination of successive instructions, e.g. too many simultaneous transfers to or from the cache, or three
successive ADDs on a machine with only two ALUs.
Control hazards (aka procedural dependencies) are caused by branch instructions, especially conditional
branches. We cannot execute instructions after the branch in parallel with instructions before it. Moreover, if
the instruction length is not fixed, instructions must be decoded to find out how many fetches are needed.
Instruction Issue/Completion Policy
With superscalar architectures we have the option of issuing and/or completing (retiring)
instructions either in or out of order.
To understand why these are attractive options, consider the following code:
add r1, r3, r5
and r4, 0x7f, r3
sub r6, r12, r6
load Fred, r9
There are no true data dependencies, but if we only have two ALUs then we shall have to stall the pipeline when we
fetch the sub instruction.
Issuing instructions out of order would mean that the CPU could fetch and begin work on the load instruction, which
does not involve the ALU.
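1. In-order issue with in-order completion
2. In-order issue with out-of-order completion
3. Out-of-order issue with in-order completion
4. Out-of-order issue with out-of-order completion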
Doing everything in order is the simplest approach, but the slowest. We may need to stall the pipeline, as noted
above.
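The effect of the issue policy on the four-instruction example above can be sketched in a few lines of Python. This is a minimal model, not a description of any real CPU: the function schedule, the unit table UNITS (two ALUs plus one load/store unit) and the one-cycle occupancy of each unit are all assumptions made for illustration, and data dependencies are ignored.

```python
from collections import Counter

# Assumed machine: two ALUs and one load/store unit, each busy for one cycle.
UNITS = {"alu": 2, "mem": 1}

# The four instructions from the text, tagged with the unit each one needs.
PROGRAM = [("add", "alu"), ("and", "alu"), ("sub", "alu"), ("load", "mem")]


def schedule(program, units, in_order):
    """Return {instruction name: issue cycle} for the given issue policy."""
    pending = list(program)
    issued = {}
    cycle = 0
    while pending:
        cycle += 1
        free = Counter(units)      # every unit is free again at the start of a cycle
        waiting = []
        blocked = False            # set once an in-order machine has stalled
        for name, unit in pending:
            if not blocked and free[unit] > 0:
                free[unit] -= 1
                issued[name] = cycle
            else:
                waiting.append((name, unit))
                if in_order:
                    blocked = True  # nothing may overtake the stalled instruction
        pending = waiting
    return issued


in_order = schedule(PROGRAM, UNITS, in_order=True)
out_of_order = schedule(PROGRAM, UNITS, in_order=False)
```

Under these assumptions the in-order policy issues the load in cycle 2, behind the stalled sub, while the out-of-order policy issues it in cycle 1 alongside the add and the and.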
As soon as we either issue or retire instructions out of order, the CPU incurs considerable book-keeping
overhead in order to ensure correctness.
One common technique, known as scoreboarding, was pioneered in what many consider to have been the first
superscalar machine, Seymour Cray's CDC 6600, in 1964.
Examples of the problems which can arise include:
Output dependencies
add r3, r5, r3
add r3, 1, r4
add r5, 1, r3
If the third instruction completes before the first, then the value in r3 will be incorrect.
Antidependencies
add r3, r5, r3
add r3, 1, r4
add r5, 1, r3
add r3, r4, r7
The third instruction must not complete before the second fetches its operands.
Both output dependencies and antidependencies arise because the order in which register contents change may no
longer reflect the original program sequence. This may result in a pipeline stall, wasting clock cycles.
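The two kinds of dependency described above can be found mechanically by remembering, for each register, which instruction last wrote it and which instructions have read it since. The following Python sketch does this for the four-instruction sequence from the antidependency example. It assumes the operand order used in these notes (destination last) and that register names begin with "r"; the helper names parse and hazards are illustrative only.

```python
def parse(line):
    """Split 'op a, b, dest' into (set of source registers, destination)."""
    _, rest = line.split(None, 1)
    *sources, dest = [field.strip() for field in rest.split(",")]
    # Immediate operands such as 1 or 0x7f are not registers, so drop them.
    return {s for s in sources if s.startswith("r")}, dest


def hazards(program):
    """Return (earlier, later, kind) triples, numbering instructions from 1."""
    last_writer = {}   # register -> instruction that most recently wrote it
    readers = {}       # register -> instructions that read it since that write
    found = []
    for idx, line in enumerate(program, start=1):
        sources, dest = parse(line)
        for reg in sorted(sources):
            if reg in last_writer:                 # read after write
                found.append((last_writer[reg], idx, "true"))
        for reader in readers.get(dest, []):       # write after read
            found.append((reader, idx, "anti"))
        if dest in last_writer:                    # write after write
            found.append((last_writer[dest], idx, "output"))
        for reg in sources:
            readers.setdefault(reg, []).append(idx)
        last_writer[dest] = idx
        readers[dest] = []                         # a new value has no readers yet
    return found


PROGRAM = [
    "add r3, r5, r3",
    "add r3, 1, r4",
    "add r5, 1, r3",
    "add r3, r4, r7",
]
found = hazards(PROGRAM)
```

For this sequence the sketch reports an output dependency between instructions 1 and 3 and an antidependency between instructions 2 and 3, matching the two cases in the text, together with the true dependencies 1 to 2, 3 to 4 and 2 to 4.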
Register Renaming
One form of resource duplication is register renaming, a technique whereby physical registers are dynamically
allocated by the CPU. This also requires a mapping from register names to physical registers.
add r3a, r5a, r3b
add r3b, 1, r4b
add r5a, 1, r3c
add r3c, r4b, r7b
Without register renaming, instruction three cannot be issued before instruction one has completed and instruction
two has been issued.
With renaming, instruction three can be issued immediately. As instructions are retired, their destination registers
become the "real" ones.
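The renamed listing above can be reproduced with a small Python sketch. It is only a model of the idea: each write to a register is given a fresh suffix letter, standing in for a newly allocated physical register, and each read uses the latest suffix. The function name rename and the letter-suffix scheme are assumptions chosen to match the notation in these notes; real hardware draws from a finite pool of physical registers and recycles them as instructions retire.

```python
def rename(program):
    """Rewrite 'op a, b, dest' lines so that every write gets a fresh register copy."""
    version = {}   # architectural register -> index of its current copy (0 = 'a')

    def current(reg):
        # Name of the copy that currently holds this register's value.
        return reg + chr(ord("a") + version.setdefault(reg, 0))

    renamed = []
    for line in program:
        op, rest = line.split(None, 1)
        *sources, dest = [field.strip() for field in rest.split(",")]
        # Sources are looked up before the destination is reallocated.
        new_sources = [current(s) if s.startswith("r") else s for s in sources]
        version[dest] = version.get(dest, 0) + 1   # allocate a fresh copy for the result
        renamed.append(f"{op} {', '.join(new_sources + [current(dest)])}")
    return renamed


PROGRAM = [
    "add r3, r5, r3",
    "add r3, 1, r4",
    "add r5, 1, r3",
    "add r3, r4, r7",
]
renamed = rename(PROGRAM)
```

After renaming, instructions one and three write to different copies (r3b and r3c), and instruction two reads r3b, so the output dependency and the antidependency no longer constrain the order of completion.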
Interrupts
If out-of-order completion is allowed, what PC value do we save at interrupt time, in order to ensure that:
Instructions are not repeated on restarting the program
Instructions are not missed on restarting the program
Superscalar Processors
A CPU with a single pipeline is called a scalar processor.
A CPU with multiple pipelines is called superscalar.
A superscalar CPU can execute more than one instruction per clock cycle. Thus a superscalar processor with 3
functional units (i.e. 3 pipelines) will (in theory) run 3 times faster than a scalar processor with the same clock
speed.
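The "in theory" figure can be checked with a little arithmetic. The Python sketch below assumes an idealised machine in which every pipeline completes one instruction per cycle, with no hazards and no pipeline fill time; the function names are illustrative.

```python
import math


def ideal_cycles(n_instructions, pipelines):
    """Cycles needed if every pipeline completes one instruction per cycle."""
    return math.ceil(n_instructions / pipelines)


def ideal_speedup(n_instructions, pipelines):
    """Speed-up over a scalar (single-pipeline) processor at the same clock speed."""
    return ideal_cycles(n_instructions, 1) / ideal_cycles(n_instructions, pipelines)


long_run = ideal_speedup(300, 3)    # 300 cycles versus 100 cycles
short_run = ideal_speedup(4, 3)     # 4 cycles versus 2 cycles
```

Even in this idealised model the factor of 3 is reached only when the instruction count divides evenly among the pipelines (300 instructions give 3.0, but 4 instructions give only 2.0). The hazards described earlier reduce the figure further in practice.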