max i t1 t m t1
i 1
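Stages: S1, S2, S3, S4
[Figure: four-stage pipelined floating-point adder. Inputs: A and B = b x 2^q. Output: C = X + Y = d x 2^s.]
Blocks shown in the figure: exponent subtractor (r = max(p,q), t = |p - q|), fraction selector, right shifter (for the fraction with min(p,q)), fraction adder (output c), leading zero counter, left shifter (output d), exponent adder.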
For example:
X = 0.9504 x 10^3
Y = 0.8200 x 10^2
(R denotes the registers between segments.)
Segment 1: Compare exponents by subtraction. Difference = 3 - 2 = 1
Segment 2: Choose exponent 3 and align mantissas. Y becomes 0.0820 x 10^3
Segment 3: Add mantissas. S = 0.9504 + 0.0820 = 1.0324
Segment 4: Normalize result and adjust exponent. Result = 0.10324 x 10^4
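The four segments of the example can be sketched in Python as follows. This is only an illustrative model, not a description of any particular hardware: the function name `pipelined_add` and the (mantissa, exponent) tuple representation are assumptions, and decimal arithmetic is used so that the numbers match the worked example.

```python
from decimal import Decimal

def pipelined_add(x, y):
    """Add two decimal floating-point numbers given as (mantissa, exponent).

    Hypothetical helper: each commented step mirrors one segment of the
    example; in hardware the intermediate values would be held in the
    registers R between segments.
    """
    (a, p), (b, q) = x, y

    # Segment 1: compare exponents by subtraction.
    diff = p - q

    # Segment 2: choose the larger exponent and align the other mantissa
    # by shifting it right |diff| decimal places.
    exponent = max(p, q)
    if diff >= 0:
        b = b.scaleb(-diff)
    else:
        a = a.scaleb(diff)

    # Segment 3: add the aligned mantissas.
    total = a + b

    # Segment 4: normalize so that 0.1 <= |mantissa| < 1 and adjust the exponent.
    while abs(total) >= 1:
        total = total.scaleb(-1)
        exponent += 1
    while total != 0 and abs(total) < Decimal("0.1"):
        total = total.scaleb(1)
        exponent -= 1
    return total, exponent

mantissa, exponent = pipelined_add((Decimal("0.9504"), 3), (Decimal("0.8200"), 2))
print(mantissa, exponent)
```

In a real pipeline the four steps run concurrently on different operand pairs; the sketch only shows the work done per segment for one pair.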
Performance Parameters
Speedup
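Speedup is defined as
Speedup = (Time taken for a given computation by a non-pipelined functional unit) / (Time taken for the same computation by a pipelined version)
For n inputs on a k-stage pipeline with clock period T, the non-pipelined time is nkT and the pipelined time is [k+(n-1)]T, so
Speedup = nk / [k+(n-1)]
For example, if a pipeline has 4 stages and 5 inputs, its speedup factor is
Speedup = (5 x 4) / (4 + 5 - 1) = 20/8 = 2.5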
Efficiency
It is an indicator of how efficiently the
resources of the pipeline are used.
If a stage is available during a clock period,
then its availability becomes the unit of
resource.
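Efficiency can be defined as
Efficiency = (No. of stage time units actually used during the computation) / (Total no. of stage time units available during the computation)
No. of stage time units used = nk, since there are n inputs and each input uses k stages.
Total no. of stage time units available = k[k+(n-1)], since k stages are available for k+(n-1) clock periods.
Efficiency = nk / (k[k+(n-1)]) = n / [k+(n-1)]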
Throughput
It is the average number of results computed
per unit time.
For n inputs, a k-stage pipeline takes
[k+(n-1)]T time units, where T is the clock period.
Then,
Throughput = n / ([k+n-1]T) = nf / [k+n-1]
where f = 1/T is the clock frequency.
Throughput = Efficiency x Frequency
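The three performance parameters can be checked numerically with a short Python sketch. The function name `pipeline_metrics` is hypothetical, and the sketch assumes a linear pipeline in which every stage takes one clock period, so that a non-pipelined unit needs n x k clock periods for n inputs.

```python
def pipeline_metrics(k, n, clock_period):
    """Return (speedup, efficiency, throughput) for n inputs on a k-stage pipeline.

    Assumption: all stages take one clock period and the non-pipelined
    unit spends k clock periods on each input.
    """
    pipelined_time = (k + n - 1) * clock_period   # [k + (n - 1)] T
    non_pipelined_time = n * k * clock_period     # n inputs, k stage times each
    speedup = non_pipelined_time / pipelined_time
    efficiency = (n * k) / (k * (k + n - 1))      # used / available stage time units
    throughput = n / pipelined_time               # results per unit time
    return speedup, efficiency, throughput

speedup, efficiency, throughput = pipeline_metrics(k=4, n=5, clock_period=1.0)
print(speedup, efficiency, throughput)
```

With a clock period of 1.0 the frequency is also 1, so the throughput and the efficiency come out equal, in line with Throughput = Efficiency x Frequency.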
Point no 2
Classification of Pipelining
Handler's Classification
Based on the level of processing, the pipelined
processors can be classified as:
1. Arithmetic Pipelining
2. Instruction Pipelining
3. Processor Pipelining
Arithmetic Pipelining
The arithmetic logic units of a computer can
be segmented for pipelined operations in
various data formats.
Example: CDC STAR-100
Instruction Pipelining
The execution of a stream of instructions can
be pipelined by overlapping the execution of
the current instruction with the fetch, decode
and operand fetch of the subsequent
instructions.
It is also called instruction look-ahead.
Processor Pipelining
This refers to the processing of the same data
stream by a cascade of processors, each of
which performs a specific task.
The data stream passes through the first processor,
with the results stored in a memory block which
is also accessible by the second processor.
The second processor then passes the refined
results to the third, and so on.
Unifunctional Pipelines
A pipeline unit with fixed and dedicated
function is called unifunctional.
Example: CRAY-1 (Supercomputer - 1976)
It has 12 unifunctional pipelines described in
four groups:
Address Functional Units:
Address Add Unit
Address Multiply Unit
Scalar Functional Units
Floating Point Functional Units
Floating Point Add Unit
Floating Point Multiply Unit
Reciprocal Approximation Unit
Multifunctional
A multifunction pipe may perform different
functions, either at different times or at the same
time, by interconnecting different subsets of the
stages in the pipeline.
Example: 4X-TI-ASC (Supercomputer - 1973)
Static Pipeline
It may assume only one functional
configuration at a time
Static pipelines are preferred when
instructions of same type are to be executed
continuously
A unifunction pipe must be static.
Dynamic pipeline
It permits several functional configurations to
exist simultaneously
A dynamic pipeline must be multi-functional
The dynamic configuration requires more
elaborate control and sequencing mechanisms
than static pipelining
Scalar Pipeline
It processes a sequence of scalar operands
under the control of a DO loop
Instructions in a small DO loop are often
prefetched into the instruction buffer.
The required scalar operands are moved into
a data cache to continuously supply the
pipeline with operands
Example: IBM System/360 Model 91
Vector Pipelines
They are specially designed to handle vector
instructions over vector operands.
Computers having vector instructions are called
vector processors.
The design of a vector pipeline is expanded from
that of a scalar pipeline.
The handling of vector operands in vector pipelines is
under firmware and hardware control.
Example: CRAY-1
Point no 3
Generalized Pipeline and Reservation Table
[Figure: sample pipeline with three stages Sa, Sb and Sc, and an output labelled Output B.]
Reservation Table
Each functional evaluation can be represented
using a diagram called Reservation Table(RT).
It is the space-time diagram of a pipeline
corresponding to one functional evaluation.
X axis: time units
Y axis: stages
Reservation Table
For the first sequence Sa, Sb, Sc, Sb, Sc, Sa, called
function A, we have

Time   0  1  2  3  4  5
Sa     A  .  .  .  .  A
Sb     .  A  .  A  .  .
Sc     .  .  A  .  A  .
Reservation Table
For the second sequence Sa, Sc, Sb, Sa, Sb, Sc,
called function B, we have

Time   0  1  2  3  4  5
Sa     B  .  .  B  .  .
Sb     .  .  B  .  B  .
Sc     .  B  .  .  .  B
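The step from a stage sequence to its reservation table can be sketched in Python. The helper names `reservation_table` and `format_table` are hypothetical, and the sketch assumes each entry of the sequence occupies exactly one stage for one time unit, as in functions A and B.

```python
def reservation_table(stage_sequence, stages=("Sa", "Sb", "Sc"), mark="X"):
    """Return {stage: row}, where row[t] holds `mark` if the stage is used at time t."""
    compute_time = len(stage_sequence)
    table = {stage: [""] * compute_time for stage in stages}
    for t, stage in enumerate(stage_sequence):
        table[stage][t] = mark
    return table

def format_table(table):
    """Render the table as text: one row per stage, one column per time unit."""
    n = len(next(iter(table.values())))
    lines = ["Time " + " ".join(f"{t:>2}" for t in range(n))]
    for stage, row in table.items():
        lines.append(f"{stage:<4} " + " ".join(f"{cell or '.':>2}" for cell in row))
    return "\n".join(lines)

# The two stage sequences given for functions A and B.
table_a = reservation_table(["Sa", "Sb", "Sc", "Sb", "Sc", "Sa"], mark="A")
table_b = reservation_table(["Sa", "Sc", "Sb", "Sa", "Sb", "Sc"], mark="B")
print(format_table(table_a))
print(format_table(table_b))
```

A dictionary of rows keeps the stage axis explicit, which matches the convention of stages on the Y axis and time units on the X axis.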
Reservation Table
After a function is started, the stages must be
reserved in the corresponding time units.
Each function supported by a multifunction
pipeline is represented by a different RT.
The time taken for one function evaluation, in units of
the clock period, is the compute time. (For A and B, it is
6.)
Reservation Table
More than one mark in the same row => the stage is used more
than once.
More than one mark in the same column => more than one
stage is in use at the same time.
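The row and column readings of a reservation table can be illustrated with a small Python sketch. The function name `analyse` and the second table (`hypothetical`, with stages S1 and S2) are made up for illustration; only `function_a` comes from the sequence Sa, Sb, Sc, Sb, Sc, Sa given earlier.

```python
def analyse(table):
    """Summarise a reservation table.

    `table` maps each stage to the set of time units in which it is marked.
    Returns (compute_time, reused_stages, parallel_steps).
    """
    # Compute time: number of clock periods spanned by the table.
    compute_time = max(t for times in table.values() for t in times) + 1

    # Several marks in one row: the stage is used more than once.
    reused_stages = sorted(s for s, times in table.items() if len(times) > 1)

    # Several marks in one column: more than one stage busy at that time.
    busy = {}
    for stage, times in table.items():
        for t in times:
            busy.setdefault(t, []).append(stage)
    parallel_steps = {t: s for t, s in busy.items() if len(s) > 1}
    return compute_time, reused_stages, parallel_steps

function_a = {"Sa": {0, 5}, "Sb": {1, 3}, "Sc": {2, 4}}
hypothetical = {"S1": {0, 2}, "S2": {1, 2}}  # invented table with a shared column

print(analyse(function_a))
print(analyse(hypothetical))
```

For function A every stage is reused but no column holds more than one mark, while the invented table has two stages busy in time unit 2.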