LFD Book Forum  

LFD Book Forum > Course Discussions > Online LFD course > Homework 7

  #1  
05-18-2013, 01:20 PM
Kekeli
Junior Member
Join Date: Apr 2013
Posts: 6
Questions 1-4: Clarification

Sorry to ask about what may be obvious to all:

No regularization, just linear regression with transformed input variables...
So the models k=3..7 correspond to

\phi_3 = 1 + x_1 + x_2 + x_1^2

\phi_4 = 1 + x_1 + x_2 + x_1^2 + x_2^2
etc.?
  #2  
05-18-2013, 01:42 PM
yaser
Caltech
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,476
Re: Questions 1-4: Clarification

Quote:
Originally Posted by Kekeli View Post
So the models k=3..7 correspond to

\phi_3 = 1 + x_1 + x_2 + x_1^2

\phi_4 = 1 + x_1 + x_2 + x_1^2 + x_2^2
etc.?
The transformed space in the case of k=3 would be

{\bf z}=(\phi_0({\bf x}),\phi_1({\bf x}),\phi_2({\bf x}),\phi_3({\bf x}))=(1,x_1,x_2,x_1^2)

The model (hypothesis set) would be a linear combination of these coordinates,

h({\bf x})=\tilde{w}_0 \phi_0({\bf x})+\tilde{w}_1 \phi_1({\bf x})+\tilde{w}_2\phi_2({\bf x})+\tilde{w}_3\phi_3({\bf x})
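
As a rough illustration of the k=3 case above, the transform and the least-squares fit might be sketched as follows. The function names and the five data points are made up for this example (they are not the homework data), and NumPy's `lstsq` is just one way to obtain the weights.

```python
import numpy as np

def transform_k3(X):
    """Map rows (x1, x2) to z = (1, x1, x2, x1^2), the k=3 case above."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 ** 2])

def fit_linear_regression(Z, y):
    """Least-squares weights w_tilde minimizing ||Z w - y||^2."""
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w

# Made-up points, for illustration only (not the homework data).
X = np.array([[0.0, 1.0], [1.0, 0.0], [-1.0, 0.5], [0.5, -0.5], [2.0, 1.0]])
y = np.array([1.0, -1.0, 1.0, -1.0, 1.0])

Z = transform_k3(X)                    # one row of (phi_0, ..., phi_3) per point
w_tilde = fit_linear_regression(Z, y)  # four weights, one per coordinate
h = Z @ w_tilde                        # h(x) = sum_i w_tilde_i * phi_i(x)
```

For larger k, the same pattern applies with more columns appended to the transformed matrix.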
__________________
Where everyone thinks alike, no one thinks very much
  #3  
05-18-2013, 02:09 PM
Kekeli
Junior Member
Join Date: Apr 2013
Posts: 6
Re: Questions 1-4: Clarification

Thank you!
To beat a dead horse:
For Q1 and Q2, for each model, train with the first 25 points and, using the resulting weights, evaluate Ein with the last 10 points and Eout with the out.dta points.
For Q3 and Q4, repeat with a different split between training and validation data.

[p.s., really enjoying the course, and appreciate your time and consideration in the forum!]
  #4  
05-18-2013, 06:53 PM
yaser
Caltech
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,476
Re: Questions 1-4: Clarification

Quote:
Originally Posted by Kekeli View Post
For Q1 and Q2, for each model, train with the first 25 points and, using the resulting weights, evaluate Ein with the last 10 points and Eout with the out.dta points.
Correct, except that it is E_{\rm val} with the last 10 points.
  #5  
05-20-2013, 07:01 AM
jlaurentum
Member
Join Date: Apr 2013
Location: Venezuela
Posts: 41
Re: Questions 1-4: Clarification

Hello:

I have some questions about these problems as well. For questions 1 and 2, I trained on the first 25 samples of in.dta, validated on the last 10 samples of the same data file, and finally evaluated Eout for each of the five models with the 250 samples in out.dta.

The five models are linear regression models with 4 to 8 weights, using \phi_0 through \phi_k of the nonlinear transformation given in the question, for k = 3, ..., 7. As before (with homework 6), once a linear model produces the value Y=\sum_{i=0}^k w_i\cdot \phi_i(x_n), the sign of that sum is taken to classify each point. The classification error is simply the fraction of misclassified points in the entire test/validation data set.
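
The error measure described here could be computed along these lines. This is a minimal sketch: the helper name `classification_error` and the toy weights and points are hypothetical values chosen so the arithmetic is easy to follow.

```python
import numpy as np

def classification_error(w, Z, y):
    """Fraction of points where sign(Z @ w) disagrees with the label y."""
    predictions = np.sign(Z @ w)
    return np.mean(predictions != y)

# Toy example with made-up weights and points (hypothetical values).
w = np.array([0.0, 1.0, -1.0])           # signal = x1 - x2
Z = np.array([[1.0, 2.0, 1.0],           # signal  1.0 -> +1
              [1.0, 0.0, 3.0],           # signal -3.0 -> -1
              [1.0, 1.0, 0.5],           # signal  0.5 -> +1
              [1.0, -1.0, 1.0]])         # signal -2.0 -> -1
y = np.array([1.0, -1.0, -1.0, -1.0])    # the third label disagrees
err = classification_error(w, Z, y)      # 1 of 4 points misclassified
```

Note that `np.sign` returns 0 for a signal of exactly zero, which this sketch counts as a misclassification because 0 matches neither label.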

I know I must be doing something wrong because when I look at question 5, my Eout values are much higher than any of the options given. After all, when you have a 4 to 8 parameter model and you're only training with 10 or 25 samples, you'd expect an outrageous Eout. ... What is the mistake in my procedure?
  #6  
05-20-2013, 01:00 PM
jlaurentum
Member
Join Date: Apr 2013
Location: Venezuela
Posts: 41
Re: Questions 1-4: Clarification

Please disregard my previous post.

I made a silly programming error. As I was generating the predictions for each model on the test/validation data sets, I was comparing that output to the wrong "Y" variable. Once corrected, the out-of-sample errors of questions 2 and 4 are within the options given in question 5.

To those who may be struggling with these questions,
  • Remember that in questions 1 and 2 you train with the first 25 samples of "in.dta" and validate with the last 10 samples. For questions 3 and 4 you do the opposite: train with the last 10 samples and validate with the first 25.
  • When you calculate the validation and test set errors, remember to compare each model's output with the correct Y's.
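
The two reminders above might be put together as in the sketch below. Several things here are assumptions: \phi_0 through \phi_4 follow this thread, but \phi_5 through \phi_7 are taken from the homework statement and are not spelled out here; the data are synthetic stand-ins for in.dta (35 rows) and out.dta (250 rows); and the names `transform`, `error`, `run`, and `make` are invented for the example.

```python
import numpy as np

# phi_0..phi_4 follow the thread; phi_5..phi_7 are ASSUMED from the homework
# statement and are not spelled out in this thread.
FEATURES = [
    lambda x1, x2: np.ones_like(x1),
    lambda x1, x2: x1,
    lambda x1, x2: x2,
    lambda x1, x2: x1 ** 2,
    lambda x1, x2: x2 ** 2,
    lambda x1, x2: x1 * x2,
    lambda x1, x2: np.abs(x1 - x2),
    lambda x1, x2: np.abs(x1 + x2),
]

def transform(X, k):
    """Columns phi_0..phi_k evaluated on the rows (x1, x2) of X."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([phi(x1, x2) for phi in FEATURES[: k + 1]])

def error(w, X, y, k):
    """Classification error of sign(w . z) against the labels y of the same set."""
    return np.mean(np.sign(transform(X, k) @ w) != y)

def run(X_train, y_train, X_val, y_val, X_out, y_out):
    """For k = 3..7, fit on the training set and return (E_val, E_out) per k."""
    results = {}
    for k in range(3, 8):
        w, *_ = np.linalg.lstsq(transform(X_train, k), y_train, rcond=None)
        # Each prediction is compared with the labels of the SAME data set.
        results[k] = (error(w, X_val, y_val, k), error(w, X_out, y_out, k))
    return results

# Synthetic stand-ins for in.dta and out.dta (made-up target, not the real data).
rng = np.random.default_rng(0)

def make(n):
    X = rng.uniform(-1, 1, size=(n, 2))
    return X, np.sign(X[:, 0] ** 2 + X[:, 1] ** 2 - 0.6)

X_in, y_in = make(35)
X_out, y_out = make(250)

# Questions 1 and 2: train on the first 25, validate on the last 10.
q12 = run(X_in[:25], y_in[:25], X_in[25:], y_in[25:], X_out, y_out)
# Questions 3 and 4: the opposite split.
q34 = run(X_in[25:], y_in[25:], X_in[:25], y_in[:25], X_out, y_out)
```

Passing each label vector alongside its own inputs, as `run` does, is one way to make the "wrong Y" mix-up harder to commit.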
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.