LFD Book Forum Questions on Problem 2.24

#1
09-30-2012, 11:17 PM
 mileschen Member Join Date: Sep 2012 Posts: 11
Questions on Problem 2.24

Though I have solved this problem, I am still a little confused.
(a) Eout: is it the test error Etest, computed on a test data set T of size N, of a particular hypothesis g that was learned from a particular training data set D (two points)?
(b) Should the bias be computed on the same test data set T? That is, bias = Ex[bias(x)] = 1/N * sum(bias(xi)) = 1/N * sum((g_(xi) - f(xi))^2) for each xi in T, where g_() is the average function.
(c) Should the var be computed using the K data sets from which the average function g_() was learned, together with the test data set T? That is, var = Ex[var(x)] = 1/N * sum[1/K * sum((gk(xi) - g_(xi))^2)].

In short, should Eout, bias, and var all be computed on the same test data set?
#2
10-01-2012, 05:10 AM
 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595
Re: Questions on Problem 2.24

(a) For this problem, if you are given a linear hypothesis, it should be possible to compute Eout analytically. However, if you computed it on a test set T, that is fine.

(b) Yes. It is also true that Etest = bias + var once Etest is averaged over the data sets. Why? Because we showed this for every x.

(c) The var is computed using the same data sets on which you learned and computed the average function. The average variance is computed over the distribution of the inputs. If you have a test set, the average is taken over the test set. Just like bias(x), var(x) is also a function of x; it captures how variable your prediction is at a point x. You take all your predictions at x, learned from the different data sets, and compute the variance of those (just as you take the average of those to get the average function).
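
The pointwise identity behind (b) can be written out as follows. This is only a sketch: the notation \bar{g}(x) for the average function and \mathbb{E}_D for the expectation over data sets is an assumption here (the thread writes g_(x) and Ed), and g^{(D)} denotes the hypothesis learned from data set D.

```latex
\begin{align*}
\mathbb{E}_D\!\left[\big(g^{(D)}(x) - f(x)\big)^2\right]
  &= \mathbb{E}_D\!\left[\big(g^{(D)}(x) - \bar{g}(x) + \bar{g}(x) - f(x)\big)^2\right] \\
  &= \underbrace{\mathbb{E}_D\!\left[\big(g^{(D)}(x) - \bar{g}(x)\big)^2\right]}_{\mathrm{var}(x)}
   + \underbrace{\big(\bar{g}(x) - f(x)\big)^2}_{\mathrm{bias}(x)}
   + 2\big(\bar{g}(x) - f(x)\big)\,
     \underbrace{\mathbb{E}_D\!\left[g^{(D)}(x) - \bar{g}(x)\right]}_{=\,0}
\end{align*}
```

The cross term vanishes because \bar{g}(x) is by definition \mathbb{E}_D[g^{(D)}(x)]. Averaging both sides over x (over P(x), or over the points of a test set) gives bias + var on the right-hand side.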

Remember that the only purpose of the test set or the input distribution P(x) is to compute an average over x of all these quantities. If you had a single test point, as discussed in class, everything works there too.
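
A minimal numerical sketch of the recipe described above, in case it helps. Several things here are assumptions and not stated in this thread: the target f(x) = x^2, inputs drawn uniformly from [-1, 1], and the function names `fit_line` and `bias_var_on_test_set`, which are made up for illustration. The learned hypothesis is taken to be the line through the two training points.

```python
import random

def f(x):
    # Assumed target function (not stated in the thread).
    return x * x

def fit_line(x1, x2):
    # Line through (x1, f(x1)) and (x2, f(x2)) for f(x) = x^2:
    # slope is x1 + x2 and intercept is -x1 * x2.
    return x1 + x2, -x1 * x2

def bias_var_on_test_set(K=1000, N=200, seed=0):
    rng = random.Random(seed)
    # K learned hypotheses g_k, one per two-point data set.
    lines = [fit_line(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(K)]
    # One test set T of N points, reused for bias, var, and the test error.
    test_x = [rng.uniform(-1, 1) for _ in range(N)]
    bias = var = e_test = 0.0
    for x in test_x:
        preds = [a * x + b for a, b in lines]
        g_bar = sum(preds) / K                           # average function at x
        bias += (g_bar - f(x)) ** 2                      # bias(x)
        var += sum((p - g_bar) ** 2 for p in preds) / K  # var(x)
        e_test += sum((p - f(x)) ** 2 for p in preds) / K
    return bias / N, var / N, e_test / N

bias, var, e_test = bias_var_on_test_set()
print(bias, var, e_test, bias + var)
```

Because the average function is the empirical mean over the same K hypotheses, the test error averaged over those K hypotheses matches bias + var up to floating-point rounding, on any test set.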

__________________
Have faith in probability
#3
10-01-2012, 06:34 AM
 mileschen Member Join Date: Sep 2012 Posts: 11
Re: Questions on Problem 2.24

I still have some questions.
var = Ex[var(x)], and var(x) = Ed[(gk(x) - g_(x))^2], where var(x) is computed from the K data sets used to learn the average function g_(x). How, then, do I compute var, which is the expected value of var(x)?

If var is computed on the same data sets that were used to learn the average function, how do I compute bias = Ex[bias(x)]? Is it also computed on the data sets that were used to learn the average function?
#4
10-01-2012, 07:40 AM
 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595
Re: Questions on Problem 2.24

The point x has nothing to do with the data sets on which you learn. Fix any point x.

You can now compute M1=Ed[gk(x)].

You can also compute M2=Ed[gk(x)^2].

M1 and M2 are just two numbers that apply to the point x. Clearly M1 and M2 will change if you change x, so M1 and M2 are functions of x.

Now, for example, if you have many x's (e.g. a test set), you can average these quantities over those x's. This means you have to compute M1 and M2 for each of those x's. You can use the same learning data sets to do so.
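
The M1/M2 recipe can be sketched in a few lines. The step from the two moments to var(x) = M2 - M1^2 and bias(x) = (M1 - f(x))^2 is the standard variance identity and is not spelled out in the post. The target f(x) = x^2, the uniform inputs on [-1, 1], the five test points, and the helper names `fit_line` and `moments_at` are all assumptions made for illustration.

```python
import random

def f(x):
    # Assumed target function (not stated in the thread).
    return x * x

def fit_line(x1, x2):
    # Line through (x1, f(x1)) and (x2, f(x2)): slope x1 + x2, intercept -x1 * x2.
    return x1 + x2, -x1 * x2

def moments_at(x, lines):
    # M1 = Ed[gk(x)] and M2 = Ed[gk(x)^2], estimated over the learned lines.
    preds = [a * x + b for a, b in lines]
    m1 = sum(preds) / len(preds)
    m2 = sum(p * p for p in preds) / len(preds)
    return m1, m2

rng = random.Random(1)
# The same learning data sets are reused for every point x.
lines = [fit_line(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(5000)]

test_x = [-0.8, -0.3, 0.0, 0.4, 0.9]  # hypothetical test points
bias_x, var_x = [], []
for x in test_x:
    m1, m2 = moments_at(x, lines)
    bias_x.append((m1 - f(x)) ** 2)  # bias(x) from M1
    var_x.append(m2 - m1 ** 2)       # var(x) from M1 and M2

# Averaging over the test points gives the overall bias and var.
bias = sum(bias_x) / len(test_x)
var = sum(var_x) / len(test_x)
print(bias, var)
```

Note that nothing about the test points enters `lines`: the data sets are fixed first, and each x is then evaluated against all of them.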

