
Gradient Descent vs Normal Equation

Gradient Descent
In the Gradient Descent algorithm, to minimize the cost function J(θ) we run an iterative procedure, repeatedly applying the update θ_j := θ_j − α · ∂J(θ)/∂θ_j and taking many steps over multiple iterations to converge to a local minimum (a short code sketch follows the list of trade-offs below).

[Figure: results graph of the theta calculation using Gradient Descent]


1. Disadvantage: Need to choose the learning rate α
This means you need to run the algorithm several times with different values of α, and choose a value of α small enough that the cost function J(θ) decreases after every iteration.
2. Disadvantage: Needs many iterations to reach convergence

3. Advantage: Works well even when the number of features n is very large.
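
As a rough illustration (not part of the original note), here is a minimal NumPy sketch of batch gradient descent for linear regression. The function name, the default values of α and the iteration count, and the assumption that X carries a leading column of ones for the intercept are all illustrative choices:

import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    # Hypothetical setup: X is an (m, n) design matrix whose first
    # column is all ones (intercept term); y is an (m,) target vector.
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        # Gradient of J(theta) = (1/(2m)) * sum((X @ theta - y)^2)
        grad = X.T @ (X @ theta - y) / m
        theta = theta - alpha * grad  # one step downhill, scaled by alpha
    return theta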


Normal Equation
In contrast, the normal equation gives us a method to solve for θ analytically: rather than running the iterative gradient descent algorithm, we solve for the optimal θ in one step, θ = (X^T X)^(-1) X^T y (a sketch follows the list below).

[Figure: results graph of the theta calculation using the Normal Equation]


1. Advantage: No need to choose the learning rate α
2. Advantage: Don't need to iterate to reach convergence
3. Disadvantage: Need to compute (X^T X)^(-1), which is slow (O(n^3)) if n is too large, e.g. n > 10,000
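
For comparison, a minimal sketch of the normal equation under the same assumptions about X and y as above; solving the linear system with np.linalg.solve instead of forming the inverse explicitly is a numerical-stability choice of this sketch, and it still costs O(n^3) in the number of features n:

import numpy as np

def normal_equation(X, y):
    # Solve (X^T X) theta = X^T y, i.e. theta = (X^T X)^(-1) X^T y,
    # without forming the matrix inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

On made-up data, both routines recover the same θ:

X = np.c_[np.ones(5), np.arange(5.0)]  # targets follow y = 1 + 2x
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
print(normal_equation(X, y))                               # ~ [1., 2.]
print(gradient_descent(X, y, alpha=0.1, num_iters=5000))   # ~ [1., 2.]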
