
Solution to Quiz 1

Daniel Orozco
University of Delaware
http://www.udel.edu
Computer Architecture and
Parallel Systems Laboratory
http://www.capsl.udel.edu
Flynn's Taxonomy
What are the major types of architectures
according to Flynn's taxonomy?
SISD
SIMD
MISD
MIMD
4/5/2010 2 Daniel Orozco - Solution to Quiz 1
Amdahl's Law
A particular scientific program has an
initialization stage that is serial and that takes
5 seconds to run. The remaining part of the
program, when run serially takes 60 seconds,
and it can be easily and fully parallelized.
a) What is the maximum speedup that you can
get if you use 4 processors?
Answer = ( 5 + 60 ) / ( 5 + 60/4 ) = 65/20 = 3.25
Amdahl's Law
b) What is the maximum speedup that you can get with
infinite processors?
Answer = ( 5 + 60 ) / ( 5 + 60/infinite ) = 65/5 = 13
c) Assume that the serial part remains constant as you
increase the problem size. By what number do you
have to multiply the problem size to obtain a speedup
of 17 with 20 processors?
Answer: 17 = ( 5 + 60X ) / ( 5 + 60X/20 )
17 ( 5 + 3X ) = 5 + 60X, so 80 = 9X
X = 80/9 = 8.89 (approximately)
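The three answers above can be checked with a short C sketch. The function names and parameter names are illustrative only (they are not from the quiz), and the model assumes, as part c) states, that only the parallel part grows with the problem size.

```c
#include <assert.h>

/* Speedup over a serial run when the serial part takes serial_s seconds,
   the parallel part takes parallel_s seconds on one processor, and the
   problem size (parallel part only) is multiplied by scale. */
double speedup(double serial_s, double parallel_s, double scale, int processors)
{
    assert(processors >= 1);
    double work = parallel_s * scale;
    return (serial_s + work) / (serial_s + work / processors);
}

/* Limit of the speedup as the number of processors goes to infinity. */
double speedup_limit(double serial_s, double parallel_s)
{
    assert(serial_s > 0.0);
    return (serial_s + parallel_s) / serial_s;
}

/* Solve target = (s + p*X) / (s + p*X/n) for the scale factor X:
   X = s * (target - 1) / (p * (1 - target/n)).
   Only meaningful when 1 < target < n. */
double scale_for_speedup(double serial_s, double parallel_s,
                         double target, int processors)
{
    assert(target > 1.0 && target < processors);
    return serial_s * (target - 1.0)
         / (parallel_s * (1.0 - target / processors));
}
```

For the quiz values, speedup(5, 60, 1, 4) reproduces part a), speedup_limit(5, 60) reproduces part b), and scale_for_speedup(5, 60, 17, 20) reproduces part c).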
Memory Models
a) What is the weakest memory model that you
can formulate?
No Rules. Empty air, for example.
b) Why is it good to have a weak memory model?
It is easy to build the machine.
c) Why is it good to have a strong memory model?
It is easy to write a parallel program.
Parallel Program
processor_1(....)
{
    A = 5;
    B = 0;
    if ( B == 3 )
    {
        if ( A == 5 )
        {
            B = 4;
        }
    }
    barrier_wait(...);
    printf( "B=%d\n", B );
}

processor_2(...)
{
    A = 3;
    B = 3;
    barrier_wait(...);
}
The formulation has an error. The following sequence of
instructions is possible:
A = 3 (Processor 2)
A = 5 (Processor 1)
B = 0 (Processor 1)
B = 3 (Processor 2)
if ( B == 3 ) (Processor 1)
if ( A == 5 ) (Processor 1)
B = 4 (Processor 1)
Barrier
printf
How to solve it?
Think of all the possible reads and writes.
My apologies. 3 people reviewed the exam and none of them
found the error.
You were graded based on your understanding of the
problem, so that this error did not affect you.
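The advice to think of all the possible reads and writes can be mechanized. The sketch below enumerates every interleaving of the two processors and records which values of B can reach the printf. It assumes sequential consistency (the slide does not name a memory model) and an initial value of 0 for A and B; the State type and all function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

enum { P1_DONE = 5, P2_DONE = 2, MAX_B = 4 };

/* Shared variables plus one program counter per processor. */
typedef struct { int a, b, pc1, pc2; } State;

/* Execute the next memory operation of processor 1. */
static State step_p1(State s)
{
    switch (s.pc1) {
    case 0: s.a = 5; s.pc1 = 1; break;                    /* A = 5        */
    case 1: s.b = 0; s.pc1 = 2; break;                    /* B = 0        */
    case 2: s.pc1 = (s.b == 3) ? 3 : P1_DONE; break;      /* if (B == 3)  */
    case 3: s.pc1 = (s.a == 5) ? 4 : P1_DONE; break;      /* if (A == 5)  */
    case 4: s.b = 4; s.pc1 = P1_DONE; break;              /* B = 4        */
    default: assert(false);
    }
    return s;
}

/* Execute the next memory operation of processor 2. */
static State step_p2(State s)
{
    switch (s.pc2) {
    case 0: s.a = 3; s.pc2 = 1; break;                    /* A = 3 */
    case 1: s.b = 3; s.pc2 = P2_DONE; break;              /* B = 3 */
    default: assert(false);
    }
    return s;
}

/* Depth-first search over all interleavings; both processors reaching
   the end corresponds to passing the barrier, after which B is printed. */
static void explore(State s, bool seen[MAX_B + 1])
{
    if (s.pc1 == P1_DONE && s.pc2 == P2_DONE) {
        seen[s.b] = true;
        return;
    }
    if (s.pc1 != P1_DONE) explore(step_p1(s), seen);
    if (s.pc2 != P2_DONE) explore(step_p2(s), seen);
}

/* seen[v] is set to true exactly when some interleaving prints B=v. */
void possible_outputs(bool seen[MAX_B + 1])
{
    for (int v = 0; v <= MAX_B; v++) seen[v] = false;
    State start = {0, 0, 0, 0};
    explore(start, seen);
}
```

Under these assumptions the search reports three printable values: 0, 3, and 4, the last one arising from the interleaving listed above.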
Test and Set Lock
Question
Why is, in general, a test and
set lock very slow when
compared to more
advanced algorithms such
as the array lock?
Answer
Because all waiting processors
continuously write to the
same memory location. This
causes many cache
invalidations and heavy bus
contention, both of which
are very slow.
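The difference can be seen in a minimal C11 sketch of both locks. It is not taken from the quiz: the type names, function names, MAX_THREADS, and the 64-byte cache line size are all assumptions, and the array lock shown is only correct while at most MAX_THREADS threads contend at once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 8u   /* assumed upper bound on contending threads */

/* Test-and-set lock: one shared flag. */
typedef struct { atomic_bool held; } tas_lock_t;

void tas_init(tas_lock_t *l) { atomic_init(&l->held, false); }

/* One attempt. The exchange is a write even when the lock is taken,
   so every failed attempt invalidates the line in the other caches. */
bool tas_try_acquire(tas_lock_t *l)
{
    return !atomic_exchange_explicit(&l->held, true, memory_order_acquire);
}

void tas_acquire(tas_lock_t *l)
{
    while (!tas_try_acquire(l)) { /* spin, writing on every iteration */ }
}

void tas_release(tas_lock_t *l)
{
    assert(atomic_load(&l->held));
    atomic_store_explicit(&l->held, false, memory_order_release);
}

/* Array lock: each waiter spins, read-only, on its own slot.
   Slots are aligned to an assumed 64-byte cache line. */
typedef struct { _Alignas(64) atomic_bool go; } slot_t;
typedef struct { slot_t slots[MAX_THREADS]; atomic_uint next; } array_lock_t;

void array_init(array_lock_t *l)
{
    for (unsigned i = 0; i < MAX_THREADS; i++)
        atomic_init(&l->slots[i].go, i == 0);   /* only slot 0 may enter */
    atomic_init(&l->next, 0u);
}

/* Returns the slot that must later be passed to array_release. */
unsigned array_acquire(array_lock_t *l)
{
    unsigned slot = atomic_fetch_add(&l->next, 1u) % MAX_THREADS;
    while (!atomic_load_explicit(&l->slots[slot].go, memory_order_acquire)) {
        /* spin on a private slot: reads only, no shared writes */
    }
    return slot;
}

void array_release(array_lock_t *l, unsigned slot)
{
    assert(slot < MAX_THREADS);
    atomic_store_explicit(&l->slots[slot].go, false, memory_order_relaxed);
    atomic_store_explicit(&l->slots[(slot + 1u) % MAX_THREADS].go, true,
                          memory_order_release);
}
```

In the test-and-set lock every waiter writes to the same flag on every spin; in the array lock a waiter performs one atomic increment and then only reads its own slot, so a release disturbs a single waiter.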
Isolation
Question
AddItemToQueue: Atomic
DeleteItemFromQueue: Atomic
MoveItemBetweenQueues: ???
MoveItemBetweenQueues( ... )
{
    QueueItem_t item;
    item = DeleteItemFromQueue(...);
    AddItemToQueue( ...,item, ... );
}
Answer
Not atomic.
With an atomic move, the
element is always in some
queue.
With the implementation
shown, there is a moment
where the item is not in
either queue; it exists only
in a local variable.
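One way to make the move atomic is to hold a single lock across both the delete and the add, so no other thread can observe the intermediate state. The sketch below is an assumption-laden illustration, not the quiz's queue: the fixed-capacity integer queue, the single global spin lock, and every name in it are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define CAP 16

typedef struct { int items[CAP]; int head, count; } queue_t;

/* One global lock protecting all queues (simple, but coarse). */
static atomic_flag queues_lock = ATOMIC_FLAG_INIT;

static void lock(void)
{
    while (atomic_flag_test_and_set_explicit(&queues_lock,
                                             memory_order_acquire)) { }
}

static void unlock(void)
{
    atomic_flag_clear_explicit(&queues_lock, memory_order_release);
}

/* Unlocked helpers: the caller must already hold the lock. */
static bool push_unlocked(queue_t *q, int item)
{
    if (q->count == CAP) return false;
    q->items[(q->head + q->count) % CAP] = item;
    q->count++;
    return true;
}

static bool pop_unlocked(queue_t *q, int *item)
{
    if (q->count == 0) return false;
    *item = q->items[q->head];
    q->head = (q->head + 1) % CAP;
    q->count--;
    return true;
}

/* Individually atomic operations, as in the question. */
bool add_item(queue_t *q, int item)
{
    lock();
    bool ok = push_unlocked(q, item);
    unlock();
    return ok;
}

bool delete_item(queue_t *q, int *item)
{
    lock();
    bool ok = pop_unlocked(q, item);
    unlock();
    return ok;
}

/* Atomic move: delete and add happen inside ONE critical section,
   so other threads always see the item in one of the two queues. */
bool move_item(queue_t *from, queue_t *to)
{
    assert(from != to);
    int item;
    lock();
    bool ok = to->count < CAP && pop_unlocked(from, &item);
    if (ok) {
        bool pushed = push_unlocked(to, item);
        assert(pushed);
        (void)pushed;
    }
    unlock();
    return ok;
}
```

Calling delete_item followed by add_item instead would release the lock between the two steps, which is exactly the window described in the answer.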
BarrierX
Summary of Question:
Use a BarrierX primitive to
implement a barrier for 2^N
processors in a very
efficient way.
Solution
Use a butterfly barrier:
[Figure: butterfly barrier diagram. Processors are numbered across the top; the rows are labeled R0, R1, R2, R3.]
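The butterfly pattern can be sketched in C. The exact semantics of BarrierX are not reproduced in this summary, so the sketch assumes it is a pairwise synchronization between a processor and one named partner. In round r, each processor pairs with the processor whose id differs in bit r. The function names are illustrative, and the simulation only tracks which arrivals each processor has (transitively) learned about.

```c
#include <assert.h>
#include <stdint.h>

/* Partner of processor id in the given round of a butterfly barrier. */
unsigned butterfly_partner(unsigned id, unsigned round)
{
    return id ^ (1u << round);
}

/* Simulate a butterfly barrier for 2^log2_procs processors.
   known[i] is a bitmask of the processors whose arrival processor i
   has learned about. Returns 1 if, after log2_procs rounds, every
   processor knows that every other processor has arrived. */
int butterfly_synchronizes_all(unsigned log2_procs)
{
    assert(log2_procs <= 5);              /* masks must fit in 32 bits */
    unsigned n = 1u << log2_procs;
    uint32_t known[32], next[32];

    for (unsigned i = 0; i < n; i++)
        known[i] = (uint32_t)1 << i;      /* each knows only itself */

    for (unsigned round = 0; round < log2_procs; round++) {
        /* Assumed pairwise sync: both partners merge what they know. */
        for (unsigned i = 0; i < n; i++)
            next[i] = known[i] | known[butterfly_partner(i, round)];
        for (unsigned i = 0; i < n; i++)
            known[i] = next[i];
    }

    uint32_t all = (n == 32) ? 0xFFFFFFFFu : (((uint32_t)1 << n) - 1u);
    for (unsigned i = 0; i < n; i++)
        if (known[i] != all) return 0;
    return 1;
}
```

With N rounds for 2^N processors, the set of known arrivals doubles every round, which is why no processor can leave before all have arrived.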
Comments about BarrierX
A tree barrier or a tournament barrier does not
work, because the leaves of the tree are only
synchronized with the trunk, not with the other
leaves.
Comments about BarrierX
The BarrierX primitive is the only operation
given.
You cannot use:
Wait
Signal
Awake
Any other primitive that could be implemented using
atomic operations but is not BarrierX
Software Pipelining
Question
What architecture features
increase the effectiveness
of software pipelining?
Answer
Multiple Issue
Speculative Execution
Branch Prediction for Software
Pipeline
Multiple function units
Pipelined Units
Many registers
Rotating Registers
Predicated Execution, and
many others
Software Pipelining
Question
Can you benefit from the use
of software pipelining in an
architecture that is single
issue and that does not
have pipelining?
If no, Why? If Yes, please
explain what advantages
you will obtain by using
software pipelining.
Answer
Yes.
Mostly latency hiding.
If loads and stores are
software pipelined, their
latency overlaps with other
work and the total execution
time will be shorter.
There are other advantages
that require additional
hardware support.
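The latency-hiding idea can be illustrated at source level. In the sketch below, which is a hand-written illustration and not compiler output, the load for iteration i+1 is issued before the computation for iteration i, so the two can overlap if the memory system lets a load stay outstanding while later instructions execute (an assumption about the machine). The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: each iteration loads, then computes. */
long sum_squares_plain(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        int x = a[i];                 /* load              */
        sum += (long)x * x;           /* compute waits     */
    }
    return sum;
}

/* Software-pipelined version with prologue, kernel, and epilogue. */
long sum_squares_pipelined(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n == 0) return 0;

    long sum = 0;
    int loaded = a[0];                    /* prologue: load for iteration 0 */

    for (size_t i = 0; i + 1 < n; i++) {  /* kernel */
        int next = a[i + 1];              /* load for iteration i+1 ...     */
        sum += (long)loaded * loaded;     /* ... issued before compute of i */
        loaded = next;
    }

    sum += (long)loaded * loaded;         /* epilogue: last computation */
    return sum;
}
```

Both functions return the same value; only the order of loads relative to the arithmetic changes.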
Transactional Memory
Question
What are the names of the
ACID properties of
Transactional Memory?
Answer
Atomicity
Consistency
Isolation
Durability
The Grades