
Gradient Descent vs Normal Equation

Gradient Descent
In the Gradient Descent algorithm, to minimize the cost function J(θ) we run an iterative procedure, repeatedly applying the update θ_j := θ_j − α · ∂J(θ)/∂θ_j and taking many steps over multiple iterations to converge to a local minimum (a short code sketch follows the list of trade-offs below).

[Figure: results graph of the theta calculation using Gradient Descent]


1. Disadvantage: Need to choose the learning rate α
This means you need to run the algorithm several times with different values of α, and choose a value of α small enough that the cost function J(θ) decreases after every iteration.
2. Disadvantage: Needs many iterations to reach convergence

3. Advantage: Works well even when the number of features n is very large.
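
As a rough illustration (not part of the original note), here is a minimal NumPy sketch of batch gradient descent for linear regression. The function name, the default values of α and the iteration count, and the assumption that X carries a leading column of ones for the intercept are all illustrative choices:

import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    # Hypothetical setup: X is an (m, n) design matrix whose first
    # column is all ones (intercept term); y is an (m,) target vector.
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        # Gradient of J(theta) = (1/(2m)) * sum((X @ theta - y)^2)
        grad = X.T @ (X @ theta - y) / m
        theta = theta - alpha * grad  # one step downhill, scaled by alpha
    return theta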


Normal Equation
In contrast, the normal equation gives us a method to solve for θ analytically: rather than running the iterative gradient descent algorithm, we solve for the optimal θ in one step, θ = (X^T X)^(-1) X^T y (a sketch follows the list below).

[Figure: results graph of the theta calculation using the Normal Equation]


1. Advantage: No need to choose the learning rate α
2. Advantage: Don't need to iterate to reach convergence
3. Disadvantage: Need to compute (X^T X)^(-1), which is slow (O(n^3)) if n is too large, e.g. n > 10,000
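
For comparison, a minimal sketch of the normal equation under the same assumptions about X and y as above; solving the linear system with np.linalg.solve instead of forming the inverse explicitly is a numerical-stability choice of this sketch, and it still costs O(n^3) in the number of features n:

import numpy as np

def normal_equation(X, y):
    # Solve (X^T X) theta = X^T y, i.e. theta = (X^T X)^(-1) X^T y,
    # without forming the matrix inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

On made-up data, both routines recover the same θ:

X = np.c_[np.ones(5), np.arange(5.0)]  # targets follow y = 1 + 2x
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
print(normal_equation(X, y))                               # ~ [1., 2.]
print(gradient_descent(X, y, alpha=0.1, num_iters=5000))   # ~ [1., 2.]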
