LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 2 (http://book.caltech.edu/bookforum/forumdisplay.php?f=131)
-   -   Hard to understand what is wanted in HW2, Q6 (http://book.caltech.edu/bookforum/showthread.php?t=4184)

 Katie C. 04-10-2013 12:14 PM

Hard to understand what is wanted in HW2, Q6

I am confused about what to do in Question 6. Reading the previous posts on the forum only made me more confused.

To clarify, I am using the following variable names:
N = # of points in the training set
Ntest = # of points in the testing set
Nexp = # of runs of the experiment

In Question 6, Ntest = 1000 and Nexp = 1000, and we use N = 100 from Question 5.

Am I correct in assuming that for each of the Nexp runs we:
1. Generate f(x)
2. Generate N training points
3. Estimate g(x)
4. Generate Ntest testing points
5. Evaluate g(x) on the Ntest testing points and record the Eout for that run.

After completing Nexp runs, average the Eout values to get a final estimate for Eout.

*OR* do we only perform steps 1-3 one time and then repeat steps 4 and 5 Nexp times?

 yaser 04-10-2013 12:53 PM

Re: Hard to understand what is wanted in HW2, Q6

Quote:
 Originally Posted by Katie C. (Post 10320)
 N = # of points in the training set
 Ntest = # of points in the testing set
 Nexp = # of runs of the experiment

 In Question 6, Ntest = 1000 and Nexp = 1000, and we use N = 100 from Question 5.

 Am I correct in assuming that for each of the Nexp runs we:
 1. Generate f(x)
 2. Generate N training points
 3. Estimate g(x)
 4. Generate Ntest testing points
 5. Evaluate g(x) on the Ntest testing points and record the Eout for that run.

 After completing Nexp runs, average the Eout values to get a final estimate for Eout.

Hi,

You got it.
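
For readers following along, the five steps confirmed above can be sketched in Python with NumPy. This is only an illustration under assumptions that the thread does not spell out: the target f is taken to be the sign of a line through two random points in [-1, 1]^2, inputs are uniform on that square, and g comes from least-squares linear regression used as a classifier. The helper names (`random_target`, `sample`, `one_run`, `estimate_eout`) are made up for this sketch.

```python
import numpy as np

def random_target(rng):
    # Assumption: f is the sign of a line through two random points in [-1, 1]^2.
    p, q = rng.uniform(-1, 1, size=(2, 2))
    # Weights w with w . (1, x1, x2) = 0 exactly on the line through p and q.
    return np.array([p[0] * q[1] - p[1] * q[0], p[1] - q[1], q[0] - p[0]])

def sample(rng, n, w_f):
    # n uniform points in [-1, 1]^2 with a leading bias coordinate, labeled by f.
    X = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=(n, 2))])
    return X, np.sign(X @ w_f)

def one_run(rng, n_train=100, n_test=1000):
    w_f = random_target(rng)                        # step 1: generate f
    X, y = sample(rng, n_train, w_f)                # step 2: N training points
    w_g, *_ = np.linalg.lstsq(X, y, rcond=None)     # step 3: estimate g
    X_test, y_test = sample(rng, n_test, w_f)       # step 4: Ntest fresh points
    return np.mean(np.sign(X_test @ w_g) != y_test) # step 5: Eout for this run

def estimate_eout(n_exp=1000, seed=0):
    # Average the per-run Eout values over Nexp independent runs.
    rng = np.random.default_rng(seed)
    return np.mean([one_run(rng) for _ in range(n_exp)])

if __name__ == "__main__":
    print(estimate_eout())
```

Note that a new target is drawn inside `one_run`, so every one of the Nexp runs repeats steps 1 through 5 rather than reusing a single f or a single g.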

 jtsengcr 04-14-2013 02:04 PM

Re: Hard to understand what is wanted in HW2, Q6

Are you sure? My thought for Questions 5 and 6 was:

1. Generate target function f(x1, x2) only once.
2. Generate a large data set D (x1, x2, y) where y = f(x1, x2).

Loop 1000 times:
1. Take 100 points from the part of D set aside for training.
2. Run linear regression on the 100 training points to get g(x1, x2).
3. Evaluate whether g(x1, x2) = y for the 100 training points, get Ein.
4. Take 1000 points from the part of D reserved for testing.
5. Evaluate whether g(x1, x2) = y for the 1000 testing points, get Eout.

Average Ein.
Average Eout.

I did not get the right answer, so please correct my logic.

 yaser 04-14-2013 03:02 PM

Re: Hard to understand what is wanted in HW2, Q6

Quote:
 Originally Posted by jtsengcr (Post 10395)
 Are you sure? My thought for Questions 5 and 6 was:

 1. Generate target function f(x1, x2) only once.
 2. Generate a large data set D (x1, x2, y) where y = f(x1, x2).

When you do this, the results will depend on which target function you have. By generating a random target function every time, you take out that dependency. Of course you can be lucky in the above scenario and get a target function that is typical rather than odd, so you end up with the same answer. :)
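
A rough way to see this dependency is to compare the two schemes side by side. The sketch below is hypothetical and rests on assumptions not stated in this thread: targets are lines through two random points in [-1, 1]^2, inputs are uniform on that square, and g is fit by least squares. The function names are invented for the illustration.

```python
import numpy as np

def random_target(rng):
    # Assumption: f is the sign of a line through two random points in [-1, 1]^2.
    p, q = rng.uniform(-1, 1, size=(2, 2))
    return np.array([p[0] * q[1] - p[1] * q[0], p[1] - q[1], q[0] - p[0]])

def eout_for_target(rng, w_f, n_train=100, n_test=1000):
    # Fit linear regression on fresh training data for the given target,
    # then measure the misclassification rate on fresh test data.
    def sample(n):
        X = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=(n, 2))])
        return X, np.sign(X @ w_f)
    X, y = sample(n_train)
    w_g, *_ = np.linalg.lstsq(X, y, rcond=None)
    X_test, y_test = sample(n_test)
    return np.mean(np.sign(X_test @ w_g) != y_test)

def avg_eout_fixed_target(rng, n_exp=200):
    # One target for the whole experiment: the average depends on that target.
    w_f = random_target(rng)
    return np.mean([eout_for_target(rng, w_f) for _ in range(n_exp)])

def avg_eout_fresh_target(rng, n_exp=1000):
    # New target every run: the average is also taken over targets.
    return np.mean([eout_for_target(rng, random_target(rng)) for _ in range(n_exp)])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    print("fixed target, 5 separate experiments:",
          [round(float(avg_eout_fixed_target(rng)), 4) for _ in range(5)])
    print("fresh target every run:",
          round(float(avg_eout_fresh_target(rng)), 4))
```

Under these assumptions, the fixed-target averages can differ from one experiment to the next because each reflects one particular f, while the fresh-target average does not single out any one target.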

 All times are GMT -7.