LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 5 (http://book.caltech.edu/bookforum/forumdisplay.php?f=134)

 Humble 05-07-2013 09:23 AM

Question 1

What does the outside expected value in E[E(Wlin)] mean in words?

 yaser 05-07-2013 10:29 AM

Re: Question 1

Quote:
 Originally Posted by Humble (Post 10741) What does the outside expected value in E[E(Wlin)] mean in words?
First, just to make sure: the inside 'E' is not an expectation but the value of the in-sample error that corresponds to the weight vector w_lin. The (outside) expected value is taken with respect to the training data set; it is the average in-sample error you would get as you train with different data sets.
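To make the "average over data sets" concrete, here is a minimal simulation sketch. Everything specific in it is an assumption for illustration and not from the thread: a linear target with Gaussian noise, Gaussian inputs, and the values of N, d, sigma and the number of data sets. The inner function is the plain in-sample error for one data set; the outer loop approximates the expectation by drawing many data sets.

```python
import numpy as np

def in_sample_error(X, y):
    """Fit least squares on one data set and return E_in(w_lin)."""
    w_lin, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = X @ w_lin - y
    return float(np.mean(residual ** 2))

def expected_in_sample_error(N=20, d=5, sigma=0.5, n_datasets=5000, seed=0):
    """Approximate the outer expectation by averaging E_in over many data sets."""
    rng = np.random.default_rng(seed)
    w_true = rng.standard_normal(d + 1)  # hypothetical target weights
    errors = []
    for _ in range(n_datasets):
        # Fresh data set each time: bias column plus d Gaussian inputs.
        X = np.hstack([np.ones((N, 1)), rng.standard_normal((N, d))])
        y = X @ w_true + sigma * rng.standard_normal(N)
        errors.append(in_sample_error(X, y))
    return float(np.mean(errors))

estimate = expected_in_sample_error()
print(estimate)
```

Each pass through the loop plays the role of "training with a different data set"; the printed number is the Monte Carlo stand-in for the outside expected value.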

 hsolo 07-15-2013 06:30 AM

Re: Question 1

Quote:
 Originally Posted by yaser (Post 10743) First, just to make sure, the inside 'E' is not an expectation, but the value of the in-sample error that corresponds to the weight vector w_lin. The (outside) expected value is with respect to the training data set, and it means the average value (of the in-sample error) as you train with different data sets.

The training data has d dimensions in the x's. If one ignored some of the dimensions and did linear regression with a reduced number d' of dimensions, one would presumably get larger in-sample errors than when considering all d dimensions?

Why then is the expected in-sample error averaged over all data sets increasing with the number of dimensions?

 yaser 07-15-2013 01:38 PM

Re: Question 1

Quote:
 Originally Posted by hsolo (Post 11263) Training data has d dimensions in the x's. If one ignored some of the dimensions and did linear regression with reduced number d' of dimensions one would have larger in-sample errors presumably, compared to considering all d dimensions? Why then is the expected in-sample error averaged over all data sets increasing with the number of dimensions?
To answer the first question, if you choose to omit some of the input variables, you will indeed get a larger (at least not smaller) in-sample error. Not sure I understand the second question, but having different training sets does not change the number of input variables. It is a hypothetical situation where you assume the availability of different data sets on the same variables.
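The "larger (at least not smaller)" claim can be checked numerically on a single data set. The sketch below is only an illustration: the data, the sizes N and d, and the helper name in_sample_error are made up. It fits least squares using the bias term plus the first k inputs, for k = 0 up to d, and records the in-sample error each time.

```python
import numpy as np

def in_sample_error(X, y):
    """Least-squares fit on (X, y); returns the mean squared in-sample error."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean((X @ w - y) ** 2))

rng = np.random.default_rng(1)
N, d = 30, 6  # hypothetical sizes
X = np.hstack([np.ones((N, 1)), rng.standard_normal((N, d))])
y = X @ rng.standard_normal(d + 1) + 0.5 * rng.standard_normal(N)

# errors[k] uses the bias column plus the first k input variables.
errors = [in_sample_error(X[:, : k + 1], y) for k in range(d + 1)]
for k, err in enumerate(errors):
    print(f"inputs used: {k}  E_in: {err:.4f}")
```

Because the model with fewer inputs is a special case of the model with more inputs (set the extra weights to zero), the least-squares error cannot go up when an input is added, so the printed values should be non-increasing in k.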

 hsolo 07-15-2013 10:35 PM

Re: Question 1

Quote:
 Originally Posted by yaser (Post 11266) To answer the first question, if you choose to omit some of the input variables, you will indeed get a larger (at least not smaller) in-sample error. Not sure I understand the second question, but having different training sets does not change the number of input variables. It is a hypothetical situation where you assume the availability of different data sets on the same variables.
My bad for the second question -- I had a typo in my handwritten expression for the expectation. The correct expression does have expected in-sample error decreasing as d is increasing.
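The thread does not write the expression out. Assuming it is the standard result for linear regression on a noisy linear target, with noise variance \sigma^2 and N training points (these symbols are assumptions here, not quoted from the posts), it would read:

```latex
\mathbb{E}_{\mathcal{D}}\!\left[ E_{\text{in}}(\mathbf{w}_{\text{lin}}) \right]
  = \sigma^2 \left( 1 - \frac{d+1}{N} \right)
```

For fixed \sigma and N, each additional input dimension subtracts \sigma^2 / N, so the expected in-sample error decreases as d increases, which agrees with the point that omitting input variables cannot make the in-sample error smaller.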
