LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 4 - Overfitting (http://book.caltech.edu/bookforum/forumdisplay.php?f=111)
-   -   What happens when error cannot be computed (is infinite) with leave-one-out CV? (http://book.caltech.edu/bookforum/showthread.php?t=1109)

tadworthington 08-22-2012 01:16 PM

What happens when error cannot be computed (is infinite) with leave-one-out CV?
 
I thought of this while working on the homework for the class. Let's say I have three points: (-1,0), (1,0), and (1,1). I want to use a linear model (h(x) = mx + b) to do the fitting, and I use LOO to check my cross validation error. The problem becomes apparent right away:

Code:

Leave out (-1,0), and fit (1,0), (1,1).  Fitting gives a vertical line, x = 1.
Of course, I am now unable to compute the squared error for the point (-1,0) that was left out - the error will be infinite.

Is the solution that I can't choose a vertical line (x = k, for some k) when fitting the data?
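A quick way to see the degeneracy numerically — this is a sketch, not code from the book, and the `ls_fit` helper is a hypothetical name. When the fold leaves out (-1,0), both remaining points share x = 1, so the least-squares normal equations for h(x) = mx + b are singular:

```python
def ls_fit(points):
    """Least-squares fit of h(x) = m*x + b; returns (m, b), or None if degenerate."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx          # zero exactly when all x's coincide
    if denom == 0:
        return None                    # no (non-vertical) line is defined
    m = (n * sxy - sx * sy) / denom
    b = (sy - m * sx) / n
    return (m, b)

# LOO fold that leaves out (-1, 0): both training points have x = 1,
# so no hypothesis of the form h(x) = m*x + b can fit them.
print(ls_fit([(1, 0), (1, 1)]))    # None
# The other folds are well-posed, e.g. leaving out (1, 0):
print(ls_fit([(-1, 0), (1, 1)]))   # (0.5, 0.5), i.e. y = 0.5*x + 0.5
```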

tadworthington 08-22-2012 01:18 PM

Re: What happens when error cannot be computed (is infinite) with leave-one-out CV?
 
I think my question didn't make sense. Of course I can't get a vertical line when producing a hypothesis of the form h(x) = mx + b.

Too bad there is no delete on the forum :o

magdon 08-22-2012 04:57 PM

Re: What happens when error cannot be computed (is infinite) with leave-one-out CV?
 
As posed, the LOO error is indeed not defined (infinite); however, your question becomes interesting when your last data point is, say, (1+\epsilon,1).

By choosing \epsilon appropriately small, you can make the LOO error arbitrarily large.

However, there is no problem with that; remember that your LOO error is an estimate of your E_{out} when learning with N-1 points. If your distribution can generate the two points (1+\epsilon,1) and (1,0) with high probability (which is verified by the very existence of this data set), then indeed the out-of-sample error you should expect when learning from 2 data points is very large.

Quote:

Originally Posted by tadworthington (Post 4284)
I thought of this while working on the homework for the class. Let's say I have three points: (-1,0), (1,0), and (1,1). I want to use a linear model (h(x) = mx + b) to do the fitting, and I use LOO to check my cross validation error. The problem becomes apparent right away:

Code:

Leave out (-1,0), and fit (1,0), (1,1).  Fitting gives a vertical line, x = 1.
Of course, I am now unable to compute the squared error for the point (-1,0) that was left out - the error will be infinite.

Is the solution that I can't choose a vertical line (x = k, for some k) when fitting the data?




The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.