LFD Book Forum  

Go Back   LFD Book Forum > Course Discussions > Online LFD course > Homework 5

#1
05-07-2013, 09:23 AM
Humble
Junior Member

Join Date: Apr 2013
Posts: 4
Question 1

What does the outside expected value in E[E(Wlin)] mean in words?
#2
05-07-2013, 10:29 AM
yaser
Caltech

Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Re: Question 1

Quote:
Originally Posted by Humble
What does the outside expected value in E[E(Wlin)] mean in words?

First, just to make sure, the inside 'E' is not an expectation, but the value of the in-sample error that corresponds to the weight vector ${\bf w}_{\rm lin}$. The (outside) expected value is with respect to the training data set, and it means the average value (of the in-sample error) as you train with different data sets.
__________________
Where everyone thinks alike, no one thinks very much
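
A small simulation can make the outer average concrete. The sketch below is only an illustration and rests on assumptions that are not in the thread: a linear target with Gaussian noise, Gaussian inputs, and the hypothetical names `expected_in_sample_error`, `w_true`, and `trials`. For each freshly drawn training set it computes the weights by least squares and records the in-sample error (the inside 'E'); the mean of those values estimates the outside expectation.

```python
import numpy as np

def expected_in_sample_error(N, d, sigma, trials=2000, seed=0):
    """Average E_in(w_lin) over many independently drawn training sets."""
    rng = np.random.default_rng(seed)
    w_true = rng.standard_normal(d + 1)  # assumed fixed target weights
    errors = []
    for _ in range(trials):
        # Draw a fresh training set: N points, d inputs plus a constant x0 = 1.
        X = np.hstack([np.ones((N, 1)), rng.standard_normal((N, d))])
        y = X @ w_true + sigma * rng.standard_normal(N)
        # w_lin is the least-squares solution for this particular data set.
        w_lin, *_ = np.linalg.lstsq(X, y, rcond=None)
        # Inside quantity: the in-sample mean squared error of w_lin.
        errors.append(np.mean((X @ w_lin - y) ** 2))
    # Outside expectation: the average over the training sets.
    return float(np.mean(errors))

avg = expected_in_sample_error(N=30, d=8, sigma=0.1)
print(avg)
```

Each pass through the loop plays the role of "training with a different data set"; the number of input variables stays fixed while only the sample changes.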
#3
07-15-2013, 06:30 AM
hsolo
Member

Join Date: Jul 2013
Posts: 12
Re: Question 1

Quote:
Originally Posted by yaser
First, just to make sure, the inside 'E' is not an expectation, but the value of the in-sample error that corresponds to the weight vector ${\bf w}_{\rm lin}$. The (outside) expected value is with respect to the training data set, and it means the average value (of the in-sample error) as you train with different data sets.

The training data has d dimensions in the x's. If one ignored some of the dimensions and did linear regression with a reduced number d' of dimensions, one would presumably get larger in-sample errors than with all d dimensions?

Why, then, does the expected in-sample error, averaged over all data sets, increase with the number of dimensions?
#4
07-15-2013, 01:38 PM
yaser
Caltech

Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Re: Question 1

Quote:
Originally Posted by hsolo
The training data has d dimensions in the x's. If one ignored some of the dimensions and did linear regression with a reduced number d' of dimensions, one would presumably get larger in-sample errors than with all d dimensions?

Why, then, does the expected in-sample error, averaged over all data sets, increase with the number of dimensions?

To answer the first question, if you choose to omit some of the input variables, you will indeed get a larger (or at least not smaller) in-sample error. I am not sure I understand the second question, but having different training sets does not change the number of input variables. It is a hypothetical situation where you assume the availability of different data sets on the same variables.
__________________
Where everyone thinks alike, no one thinks very much
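
The "larger, or at least not smaller" claim can be checked numerically on a single data set. The following is a minimal sketch under assumed conditions (synthetic Gaussian inputs, a noisy linear target, and illustrative sizes `N`, `d`, `d_reduced` that do not come from the thread). It fits least squares once with all inputs and once with only the first few, and compares the two in-sample errors.

```python
import numpy as np

def in_sample_error(X, y):
    """Mean squared in-sample error of the least-squares fit on X."""
    w_lin, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean((X @ w_lin - y) ** 2))

rng = np.random.default_rng(1)
N, d, d_reduced = 50, 6, 3  # illustrative sizes, chosen arbitrarily

# One training set: constant coordinate x0 = 1 followed by d inputs.
X = np.hstack([np.ones((N, 1)), rng.standard_normal((N, d))])
y = X @ rng.standard_normal(d + 1) + 0.1 * rng.standard_normal(N)

e_full = in_sample_error(X, y)                         # all d inputs
e_reduced = in_sample_error(X[:, : d_reduced + 1], y)  # x0 and first d' inputs
print(e_full, e_reduced)
```

Any weight vector available to the reduced model is also available to the full model (set the extra weights to zero), so the full least-squares fit cannot do worse in sample.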
#5
07-15-2013, 10:35 PM
hsolo
Member

Join Date: Jul 2013
Posts: 12
Re: Question 1

Quote:
Originally Posted by yaser
To answer the first question, if you choose to omit some of the input variables, you will indeed get a larger (or at least not smaller) in-sample error. I am not sure I understand the second question, but having different training sets does not change the number of input variables. It is a hypothetical situation where you assume the availability of different data sets on the same variables.

My bad for the second question: I had a typo in my handwritten expression for the expectation. The correct expression does have the expected in-sample error decreasing as d increases.
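
For reference, the thread never writes the expression out. Assuming the usual setting for this result (a noisy linear target with noise variance $\sigma^2$, $N$ training points, and $d$ input variables plus a constant coordinate), it is commonly stated as:

```latex
\mathbb{E}_{\mathcal{D}}\left[ E_{\rm in}({\bf w}_{\rm lin}) \right]
  = \sigma^2 \left( 1 - \frac{d+1}{N} \right)
```

For fixed $N$ and $\sigma$, the right-hand side gets smaller as $d$ grows, which agrees with the corrected statement that the expected in-sample error decreases as d increases.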
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.