Linear Regression
Roadmap
Credit limit problem: given the features x of a customer, predict the credit limit y, e.g.

  age: 23 years
  gender: female
  annual salary: 33,000 USD
  year in residence: 1 year
  year in job: 0.5 year
  current debt: 20,000

training examples D: (x_1, y_1), ..., (x_N, y_N)
learning algorithm A: takes D and the hypothesis set H (set of candidate formulas) and produces the final hypothesis g ≈ f (learned formula to be used)
Y = R: regression
linear regression hypothesis: weigh each feature (e.g. 23 years, $33,000 salary, 0.5 year in job, 20,000 debt) and sum:

  h(x) = w^T x = Σ_{i=0}^{d} w_i x_i
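The weighted sum above can be sketched directly; the feature values and weights below are made up for illustration (with x_0 = 1 as the usual constant feature):

```python
import numpy as np

# Linear regression hypothesis h(x) = w^T x = sum_i w_i * x_i,
# with x_0 = 1 as the constant (bias) feature.
def h(w, x):
    return float(np.dot(w, x))

# hypothetical feature vector: x_0 = 1, age, salary, year in job, debt
x = np.array([1.0, 23.0, 33000.0, 0.5, 20000.0])
# hypothetical weights (debt gets a negative weight)
w = np.array([0.5, 10.0, 0.01, 100.0, -0.02])
print(h(w, x))  # → 210.5
```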
[figure: data with one feature x_1 fitted by a line, and x = (x_1, x_2) ∈ R^2 fitted by a hyperplane, with residuals to the data points marked]

linear regression: find lines/hyperplanes with small residuals
squared error measure:

  in-sample:       E_in(w) = (1/N) Σ_{n=1}^{N} (h(x_n) − y_n)^2,  with h(x_n) = w^T x_n
  out-of-sample:   E_out(w) = E_{(x,y)~P} (w^T x − y)^2
Fun Time
Consider using the linear regression hypothesis h(x) = w^T x to predict the credit limit of customers x. Which feature below should have a positive weight in a good hypothesis for the task?
1. birth month
2. monthly income
3. current debt
Reference Answer: 2
Customers with higher monthly income should
naturally be given a higher credit limit, which is
captured by the positive weight on the monthly
income feature.
matrix form of E_in(w):

  E_in(w) = (1/N) Σ_{n=1}^{N} (w^T x_n − y_n)^2 = (1/N) Σ_{n=1}^{N} (x_n^T w − y_n)^2

          = (1/N) || [ x_1^T w − y_1 ;  x_2^T w − y_2 ;  ... ;  x_N^T w − y_N ] ||^2

          = (1/N) || [ x_1^T ; x_2^T ; ... ; x_N^T ] w − [ y_1 ; y_2 ; ... ; y_N ] ||^2

          = (1/N) || X w − y ||^2,   with  X: N × (d+1),  w: (d+1) × 1,  y: N × 1
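A quick numerical sanity check of the derivation above: the per-example sum of squared errors equals the matrix form (1/N)||Xw − y||^2. The data here is random, purely for illustration:

```python
import numpy as np

# Check that the sum form and the matrix form of E_in agree.
rng = np.random.default_rng(0)
N, d = 20, 3
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])  # x_0 = 1 column
y = rng.normal(size=N)
w = rng.normal(size=d + 1)

E_sum = np.mean([(w @ X[n] - y[n]) ** 2 for n in range(N)])  # (1/N) sum_n (w^T x_n - y_n)^2
E_mat = np.linalg.norm(X @ w - y) ** 2 / N                   # (1/N) ||Xw - y||^2
print(np.isclose(E_sum, E_mat))  # → True
```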
minimizing E_in:

  min_w  E_in(w) = (1/N) || X w − y ||^2

at the best w, the gradient is zero:

  ∇E_in(w) = ( ∂E_in/∂w_0 (w),  ∂E_in/∂w_1 (w),  ...,  ∂E_in/∂w_d (w) ) = ( 0, 0, ..., 0 )
the gradient of E_in:

  E_in(w) = (1/N) ||Xw − y||^2 = (1/N) ( w^T X^T X w − 2 w^T X^T y + y^T y ),
  with A = X^T X, b = X^T y, c = y^T y:

  one w only:  E_in(w) = (1/N)( a w^2 − 2 b w + c ),      dE_in/dw (w) = (2/N)( a w − b )
  vector w:    E_in(w) = (1/N)( w^T A w − 2 w^T b + c ),  ∇E_in(w) = (2/N)( A w − b )

simple! :-)

  ∇E_in(w) = (2/N)( X^T X w − X^T y )
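The analytic gradient (2/N)(X^T X w − X^T y) can be verified against a finite-difference estimate of E_in; a minimal sketch on random data:

```python
import numpy as np

# Verify the analytic gradient of E_in against central differences.
rng = np.random.default_rng(1)
N, d = 30, 4
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
y = rng.normal(size=N)
w = rng.normal(size=d + 1)

def E_in(w):
    return np.linalg.norm(X @ w - y) ** 2 / N

grad = (2.0 / N) * (X.T @ X @ w - X.T @ y)   # analytic gradient

eps = 1e-6
num = np.array([(E_in(w + eps * np.eye(d + 1)[i]) - E_in(w - eps * np.eye(d + 1)[i]))
                / (2 * eps) for i in range(d + 1)])  # numeric gradient, coordinate by coordinate
print(np.allclose(grad, num, atol=1e-5))  # → True
```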
optimal w: solve  ∇E_in(w) = (2/N)( X^T X w − X^T y ) = 0

invertible X^T X: easy! unique solution
  w_LIN = ( X^T X )^{−1} X^T y = X† y,
  where X† = ( X^T X )^{−1} X^T is the pseudo-inverse of X (often the case, because N ≫ d+1)

singular X^T X: many optimal solutions
  one of the solutions is still w_LIN = X† y, by defining X† in other ways

practical suggestion: use a well-implemented routine for X† instead of computing ( X^T X )^{−1} X^T directly, for numerical stability when X^T X is almost singular
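As a sketch of that suggestion, NumPy's SVD-based `np.linalg.pinv` is one such well-implemented routine; on well-conditioned random data it matches the normal-equation formula:

```python
import numpy as np

# Solve for w_LIN with a library pseudo-inverse routine and compare it
# with the normal-equation formula (X^T X)^{-1} X^T y, valid here
# because X^T X is invertible. Random data for illustration.
rng = np.random.default_rng(2)
N, d = 50, 3
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
y = rng.normal(size=N)

w_pinv = np.linalg.pinv(X) @ y                 # X† y (SVD-based, stable)
w_normal = np.linalg.solve(X.T @ X, X.T @ y)   # (X^T X)^{-1} X^T y
print(np.allclose(w_pinv, w_normal))  # → True
```

`np.linalg.lstsq(X, y, rcond=None)` is another standard routine for the same least-squares problem.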
linear regression algorithm:

1. from D, construct the input matrix X and the output vector y:
     X = [ x_1^T ; x_2^T ; ... ; x_N^T ]   ( N × (d+1) )
     y = [ y_1 ; y_2 ; ... ; y_N ]         ( N × 1 )
2. calculate the pseudo-inverse X†         ( (d+1) × N )
3. return w_LIN = X† y                     ( (d+1) × 1 )
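The three steps above fit in a few lines; a minimal sketch (the function name and the tiny data set are made up for illustration):

```python
import numpy as np

# Steps 1-3 of the linear regression algorithm.
def linear_regression(inputs, targets):
    inputs = np.asarray(inputs, dtype=float)
    X = np.hstack([np.ones((inputs.shape[0], 1)), inputs])  # step 1: X with x_0 = 1
    y = np.asarray(targets, dtype=float)                    # step 1: y
    X_dagger = np.linalg.pinv(X)                            # step 2: pseudo-inverse X†
    return X_dagger @ y                                     # step 3: w_LIN = X† y

# usage: three points lying exactly on y = 1 + 2x recover w ≈ (1, 2)
w = linear_regression([[0.0], [1.0], [2.0]], [1.0, 3.0, 5.0])
print(np.round(w, 6))  # ≈ [1. 2.]
```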
Fun Time
After getting w_LIN, we can calculate the predictions ŷ_n = w_LIN^T x_n. If all ŷ_n are collected in a vector ŷ, similar to how we form y, what is the matrix formula of ŷ?

1. X† X y
2. X X^T y
3. X X† y
Reference Answer: 3

Note that ŷ = X w_LIN. Then, a simple substitution of w_LIN = X† y reveals the answer ŷ = X X† y.
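The substitution can be checked numerically; the matrix X X† (often called the hat matrix, a name not used in the text above) maps y straight to the predictions ŷ. Random data for illustration:

```python
import numpy as np

# Check that y_hat = X w_LIN equals (X X†) y.
rng = np.random.default_rng(3)
N, d = 40, 2
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
y = rng.normal(size=N)

w_lin = np.linalg.pinv(X) @ y
y_hat = X @ w_lin
H = X @ np.linalg.pinv(X)          # X X†, maps y to predictions
print(np.allclose(y_hat, H @ y))   # → True
```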
Linear Classification vs. Linear Regression:

  linear classification:  Y = {−1, +1};  h(x) = sign(w^T x);  err(ŷ, y) = [[ ŷ ≠ y ]]
  linear regression:      Y = R;         h(x) = w^T x;        err(ŷ, y) = (ŷ − y)^2

  err_sqr = ( w^T x − y )^2

[figure: err versus w^T x for desired y = +1 and desired y = −1, comparing the squared error curve with the 0/1 error step]

  err_0/1 ≤ err_sqr
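The bound err_0/1 ≤ err_sqr can be checked pointwise over a grid of scores s = w^T x; a minimal sketch:

```python
import numpy as np

# For y in {-1, +1}, the 0/1 error [[sign(s) != y]] is upper-bounded by
# the squared error (s - y)^2 at every score s = w^T x.
s = np.linspace(-3.0, 3.0, 601)  # grid of candidate scores w^T x
ok = True
for y in (-1.0, 1.0):
    err01 = (np.sign(s) != y).astype(float)  # 0/1 error at each score
    errsqr = (s - y) ** 2                    # squared error at each score
    ok = ok and bool(np.all(err01 <= errsqr))
print(ok)  # → True
```

Intuition: whenever sign(s) disagrees with y ∈ {−1, +1}, the score s is at least distance 1 from y, so (s − y)^2 ≥ 1 = err_0/1.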
Summary