A codeless introduction to GPU parallelism
Will Landau
Iowa State University
September 23, 2013
Outline
A review of GPU parallelism
Examples of parallelism
Vector addition
Pairwise summation
Matrix multiplication
K-means clustering
Markov chain Monte Carlo
A review of GPU parallelism
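The single instruction, multiple data (SIMD) paradigm
Toy example:
$$
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}
+
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
=
\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}
=
\begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \\ \vdots \\ a_n + b_n \end{bmatrix}
$$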
Examples of parallelism
Vector addition
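Elementwise vector addition maps naturally onto one GPU thread per element: every thread runs the same instruction on its own pair of entries. The following is a minimal CUDA C sketch, not code from the slides. The names `vector_add_kernel` and `vector_add` are hypothetical, and the sketch assumes a GPU that supports unified memory (`cudaMallocManaged`).

```cuda
#include <string.h>
#include <cuda_runtime.h>

// Each thread adds one pair of elements: c[i] = a[i] + b[i].
__global__ void vector_add_kernel(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {  // the last block may contain spare threads
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper: adds two host arrays of length n on the GPU.
// Returns 0 on success and -1 if any CUDA call fails.
int vector_add(const float *a, const float *b, float *c, int n) {
    if (n <= 0) return -1;
    size_t bytes = (size_t)n * sizeof(float);
    float *buf = NULL;  // one managed buffer holding a, b, and c back to back
    if (cudaMallocManaged(&buf, 3 * bytes) != cudaSuccess) return -1;
    float *d_a = buf, *d_b = buf + n, *d_c = buf + 2 * n;
    memcpy(d_a, a, bytes);
    memcpy(d_b, b, bytes);

    int threads_per_block = 256;  // assumed block size
    int num_blocks = (n + threads_per_block - 1) / threads_per_block;
    vector_add_kernel<<<num_blocks, threads_per_block>>>(d_a, d_b, d_c, n);

    // Check the launch, then wait for the kernel before the host reads results.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess) memcpy(c, d_c, bytes);
    cudaFree(buf);
    return err == cudaSuccess ? 0 : -1;
}
```

Calling `vector_add(a, b, c, n)` from host code fills `c` with the elementwise sums. The bounds check `i < n` matters because the number of launched threads is rounded up to a whole number of blocks.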
Pairwise summation
Reductions
Scans
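Pairwise summation adds values in pairs and halves the number of partial sums each round, so n values are summed in roughly log2(n) parallel rounds. The sketch below is one possible CUDA C version of that idea, restricted to a single thread block for simplicity. `BLOCK_SIZE`, `pairwise_sum_kernel`, and `pairwise_sum` are hypothetical names, the limit of 1024 elements is an assumption of this sketch, and unified memory is assumed to be available.

```cuda
#include <cuda_runtime.h>

#define BLOCK_SIZE 1024  // threads in the single block; must be a power of two

// Pairwise (tree) sum of up to BLOCK_SIZE values inside one block.
__global__ void pairwise_sum_kernel(const float *x, int n, float *result) {
    __shared__ float partial[BLOCK_SIZE];
    int t = threadIdx.x;
    // Load one element per thread and pad with zeros past the end of x.
    partial[t] = (t < n) ? x[t] : 0.0f;
    __syncthreads();
    // Each round, thread t adds element t + stride into element t.
    for (int stride = BLOCK_SIZE / 2; stride > 0; stride /= 2) {
        if (t < stride) partial[t] += partial[t + stride];
        __syncthreads();  // finish this round before starting the next
    }
    if (t == 0) *result = partial[0];
}

// Hypothetical host wrapper: sums n <= BLOCK_SIZE host values on the GPU.
// Returns 0 on success and -1 on failure.
int pairwise_sum(const float *x, int n, float *sum) {
    if (n <= 0 || n > BLOCK_SIZE) return -1;
    float *buf = NULL;  // managed buffer: n inputs followed by one output slot
    if (cudaMallocManaged(&buf, (size_t)(n + 1) * sizeof(float)) != cudaSuccess) return -1;
    for (int i = 0; i < n; ++i) buf[i] = x[i];
    pairwise_sum_kernel<<<1, BLOCK_SIZE>>>(buf, n, buf + n);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess) *sum = buf[n];
    cudaFree(buf);
    return err == cudaSuccess ? 0 : -1;
}
```

The `__syncthreads()` call after each round is what keeps a thread from reading a partial sum that another thread has not finished writing.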
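Matrix multiplication
Take an $m \times n$ matrix, $A = (a_{ij})$, and an $n \times p$ matrix, $B = (b_{jk})$.
Compute $C = A \cdot B$:
$$
A = \begin{bmatrix} a_{1\cdot} \\ \vdots \\ a_{m\cdot} \end{bmatrix}
\quad \text{where} \quad
a_{i\cdot} = \begin{bmatrix} a_{i1} & \cdots & a_{in} \end{bmatrix}
$$
$$
B = \begin{bmatrix} b_{\cdot 1} & \cdots & b_{\cdot p} \end{bmatrix}
\quad \text{where} \quad
b_{\cdot k} = \begin{bmatrix} b_{1k} \\ \vdots \\ b_{nk} \end{bmatrix}
$$
$$
C = A \cdot B = \begin{bmatrix}
(a_{1\cdot} \cdot b_{\cdot 1}) & \cdots & (a_{1\cdot} \cdot b_{\cdot p}) \\
\vdots & \ddots & \vdots \\
(a_{m\cdot} \cdot b_{\cdot 1}) & \cdots & (a_{m\cdot} \cdot b_{\cdot p})
\end{bmatrix}
$$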
Parallelizing matrix multiplication
Entry $(i, k)$ of matrix $C$ is
$$
c_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{in} b_{nk}
= c_{i1k} + c_{i2k} + \cdots + c_{ink},
\quad \text{where } c_{ijk} = a_{ij} b_{jk}.
$$
Compute $\sum_{j=1}^n c_{ijk}$ as a pairwise sum.
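One way to sketch this scheme in CUDA C is to give every entry $c_{ik}$ its own thread block: thread $j$ of that block computes the product $c_{ijk} = a_{ij} b_{jk}$, and the block then adds its products with a pairwise sum in shared memory. This is an illustrative sketch rather than the implementation behind the slides. `MAX_N`, `matmul_kernel`, and `matmul` are hypothetical names, matrices are assumed to be stored row-major, $n$ is assumed to be at most `MAX_N`, and unified memory is assumed to be available.

```cuda
#include <string.h>
#include <cuda_runtime.h>

#define MAX_N 256  // threads per block; a power of two and an upper bound on n

// One block per entry c_ik; thread j computes c_ijk = a_ij * b_jk,
// then the block adds the products with a pairwise sum.
__global__ void matmul_kernel(const float *A, const float *B, float *C, int n, int p) {
    __shared__ float prod[MAX_N];
    int i = blockIdx.y;   // row of C
    int k = blockIdx.x;   // column of C
    int j = threadIdx.x;  // index of the term in the sum
    prod[j] = (j < n) ? A[i * n + j] * B[j * p + k] : 0.0f;
    __syncthreads();
    for (int stride = MAX_N / 2; stride > 0; stride /= 2) {
        if (j < stride) prod[j] += prod[j + stride];
        __syncthreads();
    }
    if (j == 0) C[i * p + k] = prod[0];
}

// Hypothetical host wrapper: C (m x p) = A (m x n) * B (n x p), row-major.
// Returns 0 on success and -1 on failure.
int matmul(const float *A, const float *B, float *C, int m, int n, int p) {
    if (m <= 0 || p <= 0 || n <= 0 || n > MAX_N) return -1;
    size_t na = (size_t)m * n, nb = (size_t)n * p, nc = (size_t)m * p;
    float *buf = NULL;  // managed buffer holding A, B, and C back to back
    if (cudaMallocManaged(&buf, (na + nb + nc) * sizeof(float)) != cudaSuccess) return -1;
    float *d_A = buf, *d_B = buf + na, *d_C = d_B + nb;
    memcpy(d_A, A, na * sizeof(float));
    memcpy(d_B, B, nb * sizeof(float));

    dim3 grid(p, m);  // one block per entry of C
    matmul_kernel<<<grid, MAX_N>>>(d_A, d_B, d_C, n, p);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess) memcpy(C, d_C, nc * sizeof(float));
    cudaFree(buf);
    return err == cudaSuccess ? 0 : -1;
}
```

This layout favors clarity over speed: it mirrors the per-entry pairwise sum directly, whereas tuned GPU matrix multiplication usually tiles the matrices so that each thread does more work.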
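Matrix multiplication example:
$$
\begin{bmatrix} 1 & 2 \\ 1 & 5 \\ 7 & 9 \end{bmatrix}
\cdot
\begin{bmatrix} 8 & 8 & 7 \\ 3 & 5 & 2 \end{bmatrix}
=
\begin{bmatrix}
\begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 8 \\ 3 \end{bmatrix} &
\begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 8 \\ 5 \end{bmatrix} &
\begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 7 \\ 2 \end{bmatrix} \\
\begin{bmatrix} 1 & 5 \end{bmatrix} \begin{bmatrix} 8 \\ 3 \end{bmatrix} &
\begin{bmatrix} 1 & 5 \end{bmatrix} \begin{bmatrix} 8 \\ 5 \end{bmatrix} &
\begin{bmatrix} 1 & 5 \end{bmatrix} \begin{bmatrix} 7 \\ 2 \end{bmatrix} \\
\begin{bmatrix} 7 & 9 \end{bmatrix} \begin{bmatrix} 8 \\ 3 \end{bmatrix} &
\begin{bmatrix} 7 & 9 \end{bmatrix} \begin{bmatrix} 8 \\ 5 \end{bmatrix} &
\begin{bmatrix} 7 & 9 \end{bmatrix} \begin{bmatrix} 7 \\ 2 \end{bmatrix}
\end{bmatrix}
$$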
K-means clustering
Lloyd's K-means algorithm
The circles are the cluster means, the squares are the
data points, and the color indicates the cluster.
Step 2: assign each data point (square) to its
closest center (circle).
Step 3: update the cluster centers to be the
within-cluster data means.
Repeat step 2: reassign points to their closest
cluster centers.
Synchronize threads.
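Step 2 treats every data point independently, so it is a natural fit for one thread per point. The following CUDA C sketch runs step 2 on the GPU and, purely to keep the example short, runs step 3 on the host. It is an illustration under assumptions, not the implementation behind the slides: `assign_kernel` and `kmeans_step` are hypothetical names, points and centers are assumed to be stored row-major with `d` coordinates each, ties go to the lowest-numbered center, and unified memory is assumed to be available.

```cuda
#include <float.h>
#include <string.h>
#include <cuda_runtime.h>

// Step 2: thread i assigns data point i to its closest center.
__global__ void assign_kernel(const float *points, const float *centers,
                              int *labels, int n, int k, int d) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int best = 0;
    float best_dist = FLT_MAX;
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;  // squared Euclidean distance to center c
        for (int j = 0; j < d; ++j) {
            float diff = points[i * d + j] - centers[c * d + j];
            dist += diff * diff;
        }
        if (dist < best_dist) { best_dist = dist; best = c; }
    }
    labels[i] = best;
}

// Hypothetical host function: one pass of steps 2 and 3.
// Writes the new labels and overwrites centers. Returns 0 on success, -1 on failure.
int kmeans_step(const float *points, float *centers, int *labels, int n, int k, int d) {
    if (n <= 0 || k <= 0 || d <= 0) return -1;
    size_t pts_bytes = (size_t)n * d * sizeof(float);
    size_t ctr_bytes = (size_t)k * d * sizeof(float);
    float *d_points = NULL, *d_centers = NULL;
    int *d_labels = NULL;
    cudaError_t err = cudaMallocManaged(&d_points, pts_bytes);
    if (err == cudaSuccess) err = cudaMallocManaged(&d_centers, ctr_bytes);
    if (err == cudaSuccess) err = cudaMallocManaged(&d_labels, (size_t)n * sizeof(int));
    if (err == cudaSuccess) {
        memcpy(d_points, points, pts_bytes);
        memcpy(d_centers, centers, ctr_bytes);
        int threads = 256;  // assumed block size
        assign_kernel<<<(n + threads - 1) / threads, threads>>>(
            d_points, d_centers, d_labels, n, k, d);
        err = cudaGetLastError();
    }
    // All threads must finish step 2 before step 3 reads the labels.
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess) {
        memcpy(labels, d_labels, (size_t)n * sizeof(int));
        // Step 3 (on the host here): each center becomes its within-cluster mean.
        for (int c = 0; c < k; ++c) {
            int count = 0;
            for (int i = 0; i < n; ++i) count += (labels[i] == c);
            if (count == 0) continue;  // leave the center of an empty cluster unchanged
            for (int j = 0; j < d; ++j) {
                float sum = 0.0f;
                for (int i = 0; i < n; ++i)
                    if (labels[i] == c) sum += points[i * d + j];
                centers[c * d + j] = sum / count;
            }
        }
    }
    cudaFree(d_points);
    cudaFree(d_centers);
    cudaFree(d_labels);
    return err == cudaSuccess ? 0 : -1;
}
```

Calling `kmeans_step` repeatedly until the labels stop changing gives the full iteration. The within-cluster sums in step 3 could also be computed on the GPU as pairwise sums.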
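Markov chain Monte Carlo
Let:
$y_k$ = number of observed deaths in county $k$.
$n_k$ = the number of person-years in county $k$ divided by 100,000.
$\theta_k$ = expected number of deaths per 100,000 person-years.
The model:
$$
\begin{aligned}
y_k &\stackrel{\text{ind}}{\sim} \text{Poisson}(n_k \theta_k) \\
\theta_k &\stackrel{\text{iid}}{\sim} \text{Gamma}(\alpha, \beta) \\
\alpha &\sim \text{Uniform}(0, a_0) \\
\beta &\sim \text{Uniform}(0, b_0)
\end{aligned}
$$
$$
\begin{aligned}
p(\theta, \alpha, \beta \mid y)
&\propto \prod_{k=1}^K \left[ p(y_k \mid \theta_k, n_k) \, p(\theta_k \mid \alpha, \beta) \right] p(\alpha) \, p(\beta) \\
&\propto \prod_{k=1}^K \left[ e^{-n_k \theta_k} \theta_k^{y_k} \frac{\beta^\alpha}{\Gamma(\alpha)} \theta_k^{\alpha - 1} e^{-\theta_k \beta} \right] I(0 < \alpha < a_0) \, I(0 < \beta < b_0)
\end{aligned}
$$
$\theta_k \sim p(\theta_k \mid y, \theta_{-k}, \alpha, \beta)$ IN PARALLEL!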
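Full conditional distributions
$$
\begin{aligned}
p(\theta_k \mid y, \theta_{-k}, \alpha, \beta)
&\propto p(\theta, \alpha, \beta \mid y) \\
&\propto e^{-n_k \theta_k} \theta_k^{y_k} \theta_k^{\alpha - 1} e^{-\theta_k \beta} \\
&= \theta_k^{y_k + \alpha - 1} e^{-\theta_k (n_k + \beta)} \\
&\propto \text{Gamma}(y_k + \alpha, \; n_k + \beta)
\end{aligned}
$$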
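Conditional distributions of $\alpha$ and $\beta$
$$
\begin{aligned}
p(\alpha \mid y, \theta, \beta)
&\propto p(\theta, \alpha, \beta \mid y) \\
&\propto \prod_{k=1}^K \left[ \theta_k^{\alpha - 1} \frac{\beta^\alpha}{\Gamma(\alpha)} \right] I(0 < \alpha < a_0) \\
&= \left( \prod_{k=1}^K \theta_k \right)^{\alpha - 1} \beta^{K\alpha} \, \Gamma(\alpha)^{-K} \, I(0 < \alpha < a_0)
\end{aligned}
$$
$$
\begin{aligned}
p(\beta \mid y, \theta, \alpha)
&\propto p(\theta, \alpha, \beta \mid y) \\
&\propto \prod_{k=1}^K \left[ e^{-\theta_k \beta} \beta^\alpha \right] I(0 < \beta < b_0) \\
&= \beta^{K\alpha} e^{-\beta \sum_{k=1}^K \theta_k} \, I(0 < \beta < b_0) \\
&\propto \text{Gamma}\left( K\alpha + 1, \; \sum_{k=1}^K \theta_k \right) I(0 < \beta < b_0)
\end{aligned}
$$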
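The conditional of $\alpha$ has no standard form, so it is usually worked with on the log scale: $\log p(\alpha \mid y, \theta, \beta) = (\alpha - 1) \sum_k \log \theta_k + K \alpha \log \beta - K \log \Gamma(\alpha)$ plus a constant, for $0 < \alpha < a_0$. The host-side sketch below evaluates that expression and performs one random walk Metropolis update. The function names are hypothetical, and the proposal increment `step` and the Uniform(0, 1) draw `u` are passed in as arguments (an assumption made here so the example stays deterministic).

```cuda
#include <math.h>

// Log of the unnormalized full conditional of alpha.
// sum_log_theta is the sum over k of log(theta_k).
double log_cond_alpha(double alpha, double sum_log_theta, int K, double beta, double a0) {
    if (alpha <= 0.0 || alpha >= a0) return -INFINITY;  // outside the Uniform(0, a0) prior
    return (alpha - 1.0) * sum_log_theta + K * alpha * log(beta) - K * lgamma(alpha);
}

// One random walk Metropolis step for alpha.
// step: proposal increment (for example a normal draw), supplied by the caller.
// u:    a Uniform(0, 1) draw, supplied by the caller.
// Returns the new value of alpha (the proposal if accepted, else the old value).
double metropolis_alpha(double alpha, double step, double u,
                        double sum_log_theta, int K, double beta, double a0) {
    double proposal = alpha + step;
    double log_ratio = log_cond_alpha(proposal, sum_log_theta, K, beta, a0)
                     - log_cond_alpha(alpha, sum_log_theta, K, beta, a0);
    return (log(u) < log_ratio) ? proposal : alpha;
}
```

A proposal outside $(0, a_0)$ has log density $-\infty$ and is therefore always rejected, which is how the indicator $I(0 < \alpha < a_0)$ enters the update.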
Summarizing the Gibbs sampler
1. Sample $\theta$ from its full conditional.
   Draw the $\theta_k$'s in parallel from independent Gamma($y_k + \alpha$, $n_k + \beta$) distributions. Each $\theta_k$ is drawn from its Gamma($y_k + \alpha$, $n_k + \beta$) distribution.
2. Sample $\alpha$ from its full conditional using a random walk Metropolis step.
3. Sample $\beta$ from its full conditional (truncated Gamma) using the inverse cdf method if $b_0$ is low or a non-truncated Gamma if $b_0$ is high.
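Step 1 is the part that parallelizes directly: given $\alpha$ and $\beta$, the $\theta_k$'s are independent, so one thread can draw one $\theta_k$. The CUDA C sketch below shows one possible way to do this with the cuRAND device API. It is an assumption-laden illustration, not the sampler behind the slides: cuRAND has no built-in Gamma generator, so the sketch swaps in the Marsaglia-Tsang rejection method, and `sample_gamma`, `draw_theta_kernel`, and `draw_theta` are hypothetical names. Unified memory is assumed to be available.

```cuda
#include <string.h>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Gamma(shape, rate = 1) draw using the Marsaglia-Tsang rejection method.
__device__ float sample_gamma(curandState *state, float shape) {
    float boost = 1.0f;
    if (shape < 1.0f) {  // draw with shape + 1, then rescale by U^(1/shape)
        boost = powf(curand_uniform(state), 1.0f / shape);
        shape += 1.0f;
    }
    float d = shape - 1.0f / 3.0f;
    float c = rsqrtf(9.0f * d);  // 1 / sqrt(9 d)
    while (true) {
        float x = curand_normal(state);
        float v = 1.0f + c * x;
        if (v <= 0.0f) continue;
        v = v * v * v;
        float u = curand_uniform(state);  // in (0, 1], so logf(u) is finite
        if (logf(u) < 0.5f * x * x + d - d * v + d * logf(v)) return boost * d * v;
    }
}

// Thread k draws theta_k from Gamma(y_k + alpha, n_k + beta) (shape, rate).
__global__ void draw_theta_kernel(const int *y, const float *n, float alpha, float beta,
                                  float *theta, int K, unsigned long long seed) {
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k >= K) return;
    curandState state;
    curand_init(seed, k, 0, &state);  // separate random subsequence per thread
    theta[k] = sample_gamma(&state, y[k] + alpha) / (n[k] + beta);
}

// Hypothetical host wrapper. Returns 0 on success and -1 on failure.
// Pass a different seed on every Gibbs iteration, or the draws will repeat.
int draw_theta(const int *y, const float *n, float alpha, float beta,
               float *theta, int K, unsigned long long seed) {
    if (K <= 0 || alpha <= 0.0f) return -1;
    int *d_y = NULL;
    float *d_n = NULL, *d_theta = NULL;
    cudaError_t err = cudaMallocManaged(&d_y, (size_t)K * sizeof(int));
    if (err == cudaSuccess) err = cudaMallocManaged(&d_n, (size_t)K * sizeof(float));
    if (err == cudaSuccess) err = cudaMallocManaged(&d_theta, (size_t)K * sizeof(float));
    if (err == cudaSuccess) {
        memcpy(d_y, y, (size_t)K * sizeof(int));
        memcpy(d_n, n, (size_t)K * sizeof(float));
        int threads = 256;  // assumed block size
        draw_theta_kernel<<<(K + threads - 1) / threads, threads>>>(
            d_y, d_n, alpha, beta, d_theta, K, seed);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess) memcpy(theta, d_theta, (size_t)K * sizeof(float));
    cudaFree(d_y);
    cudaFree(d_n);
    cudaFree(d_theta);
    return err == cudaSuccess ? 0 : -1;
}
```

Re-initializing the generator state inside the kernel keeps the sketch short but is wasteful. A longer-running sampler would more likely initialize one `curandState` per thread once and reuse it across iterations.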
Preview: a bare bones CUDA C workflow
#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <cuda_runtime.h>

__global__ void some_kernel(...){...}

int main(void){
  // Declare all variables.
  ...
  // Allocate host memory.
  ...
  // Dynamically allocate device memory for GPU results.
  ...
  // Write to host memory.
  ...
  // Copy host memory to device memory.
  ...
  // Execute kernel on the device.
  some_kernel<<<num_blocks, num_threads_per_block>>>(...);
  // Write GPU results in device memory back to host memory.
  ...
  // Free dynamically allocated host memory.
  ...
  // Free dynamically allocated device memory.
  ...
}
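To make the skeleton concrete, here is one possible filled-in instance that follows the same steps in the same order. Everything specific in it is an assumption chosen for illustration: the kernel `square_kernel`, the function `squares_on_gpu` (used in place of `main` so it can be called from other code), the block size of 256, and the choice of squaring the integers 0 to n - 1.

```cuda
#include <stdlib.h>
#include <string.h>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread squares one element in place.
__global__ void square_kernel(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = x[i] * x[i];
}

// Fills out[i] = i * i for i = 0, ..., n - 1 using the GPU.
// Returns 0 on success and -1 on failure.
int squares_on_gpu(int n, float *out) {
    // Declare all variables.
    float *h_x = NULL, *d_x = NULL;
    size_t bytes;
    int num_blocks, num_threads_per_block = 256;
    cudaError_t err;

    if (n <= 0) return -1;
    bytes = (size_t)n * sizeof(float);
    num_blocks = (n + num_threads_per_block - 1) / num_threads_per_block;

    // Allocate host memory.
    h_x = (float *)malloc(bytes);
    if (h_x == NULL) return -1;

    // Dynamically allocate device memory for GPU results.
    err = cudaMalloc((void **)&d_x, bytes);

    // Write to host memory.
    for (int i = 0; i < n; ++i) h_x[i] = (float)i;

    // Copy host memory to device memory.
    if (err == cudaSuccess) err = cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);

    // Execute kernel on the device.
    if (err == cudaSuccess) {
        square_kernel<<<num_blocks, num_threads_per_block>>>(d_x, n);
        err = cudaGetLastError();
    }

    // Write GPU results in device memory back to host memory.
    // This cudaMemcpy waits for the kernel to finish before copying.
    if (err == cudaSuccess) err = cudaMemcpy(h_x, d_x, bytes, cudaMemcpyDeviceToHost);
    if (err == cudaSuccess) memcpy(out, h_x, bytes);

    // Free dynamically allocated host memory.
    free(h_x);

    // Free dynamically allocated device memory.
    cudaFree(d_x);

    return err == cudaSuccess ? 0 : -1;
}
```

Checking the return value of each CUDA call, as done here, is optional in the skeleton but makes failures such as a missing device much easier to diagnose.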
Resources
1. J. Sanders and E. Kandrot. CUDA by Example. Addison-Wesley, 2010.
2. Prof. Jarad Niemi's STAT 544 lecture notes.
That's all for today.