04-16-2012, 07:41 PM
jsarrett
Join Date: Apr 2012
Location: Sunland, CA
Posts: 13
Re: HW 2.8: Seeking clarification on simulated noise

Here's how I think about E_{in} and E_{out}.

The in-sample performance E_{in} is how well you have converged on a solution for your given data (X in our notation). The given data is a *sample* of the real-world domain on which f is defined. Therefore the in-sample performance of h is how well it works on that data set (how closely h(x) \approx y).

The out-of-sample performance E_{out} is how well h works on the rest of the world (not in our tiny sample). We have seen in several of the problems that we can estimate it by generating a whole new data set (often with many more sample points) and comparing the output of our h with the output of the made-up f on those points.
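A minimal sketch of that estimate, assuming a made-up linear target f and a fixed hypothesis h in two dimensions. The weight vectors, sample sizes, and helper names (`classify`, `error_rate`) are all hypothetical choices for illustration, not the homework's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def classify(w, X):
    """Label points with sign(w0 + w1*x1 + w2*x2); X has shape (n, 2)."""
    return np.sign(w[0] + X @ w[1:])

def error_rate(w_h, w_f, X):
    """Fraction of points in X where hypothesis h disagrees with target f."""
    return float(np.mean(classify(w_h, X) != classify(w_f, X)))

# Made-up target f and hypothesis h (assumed values, for illustration only).
w_f = np.array([0.1, 1.0, -1.0])
w_h = np.array([0.0, 1.0, -0.8])

X_train = rng.uniform(-1, 1, size=(100, 2))     # the small sample we were "given"
X_fresh = rng.uniform(-1, 1, size=(10_000, 2))  # new points from the same distribution

e_in = error_rate(w_h, w_f, X_train)            # in-sample error
e_out_estimate = error_rate(w_h, w_f, X_fresh)  # estimate of out-of-sample error
print(f"E_in = {e_in:.3f}, estimated E_out = {e_out_estimate:.3f}")
```

The fresh set is much larger than the training sample so that its error rate is a tighter estimate of E_{out}; this only works here because we invented f ourselves and can label as many new points as we like.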

Of course, in a real situation we won't have f, only the knowledge (belief!?) that it exists. That's why we need the Hoeffding inequality, so we can at least bound, with high probability, how far E_{out} can be from E_{in}.
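For reference, here is the single-hypothesis form of the Hoeffding inequality as I understand it, where N is the number of sample points and \epsilon is a tolerance we pick (both symbols are my additions for this sketch):

```latex
P\left[\,\left|E_{in}(h) - E_{out}(h)\right| > \epsilon\,\right] \le 2e^{-2\epsilon^{2}N}
```

This form assumes h is fixed before the data is seen, and it is a probabilistic statement: E_{in} is probably close to E_{out}, not guaranteed to be. For example, N = 1000 and \epsilon = 0.05 give a bound of 2e^{-5} \approx 0.013.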