LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 4 - Overfitting (http://book.caltech.edu/bookforum/forumdisplay.php?f=111)
-   -   What happens when error cannot be computed (is infinite) with leave-one-out CV? (http://book.caltech.edu/bookforum/showthread.php?t=1109)

What happens when error cannot be computed (is infinite) with leave-one-out CV?

I thought of this while working on the homework for the class. Let's say I have three points: (-1,0), (1,0), and (1,1). I want to use a linear model (h(x) = mx + b) to do the fitting, and I use LOO to check my cross validation error. The problem becomes apparent right away:

Code:

Leave out (-1,0), and fit (1,0), (1,1).  Fitting gives a vertical line, x = 1.
Of course, I am now unable to compute the squared error for the point (-1,0) that was left out - the error will be infinite.

Is the solution that I can't choose a vertical line (x = k, for some k) when fitting the data?
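For what it's worth, here is a minimal numerical sketch (assuming NumPy and squared-error least squares; not code from the course) of what actually happens if you run the fit on the two points (1,0) and (1,1): since both training points share the same x, the design matrix for h(x) = mx + b is rank-deficient, and a least-squares solver returns a degenerate minimum-norm solution rather than a vertical line.

```python
import numpy as np

# Two training points with the same x-coordinate: (1, 0) and (1, 1).
# The vertical line x = 1 is not in the hypothesis set h(x) = mx + b,
# so least squares gives a rank-deficient system instead.
X = np.array([[1.0, 1.0],   # rows are [x, 1] for h(x) = m*x + b
              [1.0, 1.0]])
y = np.array([0.0, 1.0])

coef, _residuals, rank, _sv = np.linalg.lstsq(X, y, rcond=None)
m, b = coef
print(rank)    # rank 1, not 2: the two columns are identical
print(m + b)   # any line with m + b = 0.5 minimizes squared error here;
               # lstsq picks the minimum-norm one among them
```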

Re: What happens when error cannot be computed (is infinite) with leave-one-out CV?

I think my question didn't make sense. Of course I can't get a vertical line when producing a hypothesis of the form h(x) = mx + b.

Too bad there is no delete on the forum :o

 magdon 08-22-2012 04:57 PM

Re: What happens when error cannot be computed (is infinite) with leave-one-out CV?

As posed, the LOO error is indeed not defined (infinite); however, your question becomes interesting when your last data point is (say) (1+ε, 1).

By choosing ε appropriately small, you can make the LOO error arbitrarily large.

However, there is no problem with that; remember that your LOO error is an estimate of your out-of-sample error when learning with N-1 points. If your distribution can generate the two points (1,0) and (1+ε,1) with high probability (which is verified by the very existence of this data set), then indeed the out-of-sample error you should expect when learning from 2 data points is very large.
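This effect is easy to check numerically. The sketch below (a quick illustration assuming NumPy and squared error, with the third point perturbed to (1+ε, 1)) computes the LOO error of a degree-1 least-squares fit; the fold that leaves out (-1,0) contributes an error of 4/ε², so the average blows up as ε shrinks.

```python
import numpy as np

def loo_error(points):
    """Leave-one-out mean squared error for a degree-1 least-squares fit."""
    pts = np.asarray(points, dtype=float)
    errs = []
    for i in range(len(pts)):
        train = np.delete(pts, i, axis=0)           # drop the i-th point
        m, b = np.polyfit(train[:, 0], train[:, 1], 1)  # fit h(x) = m*x + b
        x, y = pts[i]
        errs.append((m * x + b - y) ** 2)           # squared error on left-out point
    return float(np.mean(errs))

# Shrinking eps makes the line through (1, 0) and (1 + eps, 1) ever steeper,
# so the LOO error grows without bound (dominated by the 4 / eps**2 fold).
for eps in (1.0, 0.1, 0.01):
    pts = [(-1, 0), (1, 0), (1 + eps, 1)]
    print(eps, loo_error(pts))
```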

Quote:

Originally Posted by tadworthington (Post 4284)

I thought of this while working on the homework for the class. Let's say I have three points: (-1,0), (1,0), and (1,1). I want to use a linear model (h(x) = mx + b) to do the fitting, and I use LOO to check my cross validation error. The problem becomes apparent right away:

Code:

Leave out (-1,0), and fit (1,0), (1,1).  Fitting gives a vertical line, x = 1.

Of course, I am now unable to compute the squared error for the point (-1,0) that was left out - the error will be infinite.

Is the solution that I can't choose a vertical line (x = k, for some k) when fitting the data?
