GPU in DNN

By Abdul Karim

Principal Supervisor: Professor Abdul Sattar


Associate Supervisor: M. A. Hakim Newton
Why do we need specialized hardware for training Deep Neural Networks?

When you train a deep learning model, two main operations are performed:

• Forward pass
• Backward pass

Both involve matrix multiplication. This seems to be a very simple task, but real-world data normally has hundreds or thousands of dimensions/parameters.
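The point above can be sketched in NumPy with hypothetical sizes: one dense layer's forward pass is a matrix multiplication, and its backward pass is two more.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((12288, 32))   # 32 images, 12288 features each (hypothetical batch)
W = rng.standard_normal((10, 12288))   # one layer's weights

# Forward pass: a single matrix multiplication
Z = W @ X                              # shape (10, 32)

# Backward pass: the gradients are matrix multiplications too
dZ = rng.standard_normal(Z.shape)      # gradient flowing back (placeholder values)
dW = dZ @ X.T                          # gradient w.r.t. the weights, (10, 12288)
dX = W.T @ dZ                          # gradient w.r.t. the inputs, (12288, 32)
```

At these sizes the work is trivial; the slides that follow show why it stops being trivial at 10 million images.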
Motivation

What does it take to reach human-level performance with a machine-learning algorithm?

• A huge data set
• An appropriate ML algorithm

Problem: My camera should identify each and every scene it sees, like the human eye. To achieve this, I should train my deep Convolutional Neural Network with millions of images.


A real-world example: Places
http://places.csail.mit.edu/
A new scene-centric database called Places

• A repository of 10 million scene photographs, labeled with scene semantic categories.
• Each image is of shape (64, 64, 3), where 3 is for the 3 channels (RGB).

Size of our training data set

10 million × 12288 (each (64, 64, 3) image flattens to 64 × 64 × 3 = 12288 values)
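The arithmetic behind that size, checked in a few lines (the float32 memory estimate is my own addition, not from the slides):

```python
# Each (64, 64, 3) image flattens to 64 * 64 * 3 = 12288 values,
# so the full training matrix has 10 million columns of 12288 features.
features = 64 * 64 * 3
images = 10_000_000
print(features)                    # 12288
total_values = features * images   # ~1.23e11 entries

# At 4 bytes per float32 this is roughly 491.5 GB -- far more than
# fits in any single device's memory, so the data must be streamed in batches.
gigabytes = total_values * 4 / 1e9
print(round(gigabytes, 1))         # 491.5
```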
A Reasonable Deep Neural Network

Forward pass

[Figure: a 4-layer network computing the error E(W1, W2, W3, W4). Layer widths: input 12288, three hidden layers of 10 units each, output 2. Weight shapes: W1 is 10 × 12288, W2 is 10 × 10, W3 is 10 × 10, W4 is 2 × 10. Activation shapes over the full data set: input 12288 × 10 million, then 10 × 10 million after each hidden layer, and 2 × 10 million at the output.]
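The forward pass through this network can be sketched with the slide's weight shapes. This is a simplification: the layers are kept linear (no activation functions shown), and the batch is cut from 10 million to 1000 so it runs anywhere.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 1000                                # reduced batch size (slide uses 10 million)
X  = rng.standard_normal((12288, m))    # input: 12288 x m
W1 = rng.standard_normal((10, 12288))
W2 = rng.standard_normal((10, 10))
W3 = rng.standard_normal((10, 10))
W4 = rng.standard_normal((2, 10))

# Each layer is one matrix multiplication
A1 = W1 @ X     # 10 x m
A2 = W2 @ A1    # 10 x m
A3 = W3 @ A2    # 10 x m
A4 = W4 @ A3    # 2 x m  -> fed into the error E(W1, W2, W3, W4)
```

The first multiplication alone is 10 × 12288 against 12288 × m; at m = 10 million that is on the order of 10^12 multiply-adds per pass, which is where a GPU earns its keep.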
A Reasonable Deep Neural Network

Backward pass and weight update

w*_i = w_i − α · ∂E/∂w_i

[Figure: the same 4-layer network as in the forward pass, now showing the updated weights w*1, w*2, w*3, w*4 with the same shapes (10 × 12288, 10 × 10, 10 × 10, 2 × 10) and the same activation shapes over the 10 million examples.]
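The update rule above, applied to one weight matrix; the gradient here is a random placeholder standing in for what backpropagation would actually compute.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.01                                # learning rate (hypothetical value)
W1 = rng.standard_normal((10, 12288))
dE_dW1 = rng.standard_normal((10, 12288))   # placeholder for the backprop gradient dE/dW1

# w*_i = w_i - alpha * dE/dw_i, elementwise over the whole matrix
W1_new = W1 - alpha * dE_dW1
```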
Questions

• Is there an optimum error, or Bayes error, that is related to the number of iterations? How long do we repeat until convergence?
• Does the weight update carry any information, or imprint, of the previous weights?
Repeat until converge
{
 Forward propagation to calculate the error (the cost, or objective function).
 Update the weights through backward propagation.
}
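That loop can be sketched on a toy one-layer model with made-up data; the convergence test (stop when the cost improves by less than a small tolerance) is one common choice, not something the slides specify.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 100))    # toy inputs: 100 examples, 5 features
Y = rng.standard_normal((1, 100))    # toy targets
W = np.zeros((1, 5))
alpha, tol, prev_cost = 0.01, 1e-8, np.inf

# "Repeat until converge"
while True:
    # Forward propagation: calculate the error / cost
    E = W @ X - Y
    cost = float(np.mean(E ** 2))
    if prev_cost - cost < tol:       # converged: cost no longer improving
        break
    prev_cost = cost
    # Backward propagation: gradient of the mean squared error, then update
    dW = 2 * (E @ X.T) / X.shape[1]
    W = W - alpha * dW
```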
Will our algorithm reach nearly equal performance without using a GPU?
