Default Discussion of Lecture 13 "Validation"

Question: (Slide 7/22) The rule of thumb is N/5. Why do we have N/10 on the last slide?

Answer: On the last slide we use cross-validation, whereas on slide 7/22 we validate just once, so it is a different game. In cross-validation we should clearly use fewer points for validation in each run: we repeat the process anyway and end up using all the points for validation, so we are better off increasing N-K, the number of points left for training. The question is how much smaller K should be. We might take K=1 (as in leave-one-out). Of course, if the dataset is large, that would take too much time, since we would have to train N times. But this is not the only reason. It looks counterintuitive, but in some situations the leave-one-out cross-validation error fluctuates more strongly, so you would use 10-fold cross-validation even if you don't worry about computational resources. For example, if you have just 100 examples, 10-fold cross-validation may give you better stability than leave-one-out.
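
If you want to check the stability point yourself, here is a minimal sketch. Everything in it is my own assumption and not from the lecture: the linear target with Gaussian noise, the least-squares model, and the names k_fold_cv_error and compare_cv_stability. It draws many datasets of N=100 points and reports the mean and the spread of the 10-fold and the leave-one-out estimates. Which of the two fluctuates more depends on the model and the data, so treat this as an experiment to run, not a proof.

```python
import numpy as np


def k_fold_cv_error(X, y, n_folds, rng):
    """Cross-validation estimate of the squared error of least-squares regression.

    n_folds = len(y) gives leave-one-out (K = 1 validation point per run).
    """
    N = len(y)
    perm = rng.permutation(N)
    # Each point lands in exactly one validation fold.
    folds = np.array_split(perm, n_folds)
    errors = []
    for val_idx in folds:
        train_mask = np.ones(N, dtype=bool)
        train_mask[val_idx] = False
        # Train on the N-K remaining points.
        w, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
        # Validate on the K held-out points.
        errors.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(errors))


def compare_cv_stability(N=100, d=5, noise=1.0, n_datasets=200, seed=0):
    """Return {method: (mean, std)} of the CV estimate over many random datasets."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)  # hypothetical target, chosen only for illustration
    results = {"10-fold": [], "leave-one-out": []}
    for _ in range(n_datasets):
        X = rng.normal(size=(N, d))
        y = X @ w_true + noise * rng.normal(size=N)
        results["10-fold"].append(k_fold_cv_error(X, y, 10, rng))
        results["leave-one-out"].append(k_fold_cv_error(X, y, N, rng))
    return {name: (float(np.mean(v)), float(np.std(v))) for name, v in results.items()}


if __name__ == "__main__":
    for name, (mean, std) in compare_cv_stability().items():
        print(f"{name}: mean = {mean:.3f}, std = {std:.3f}")
```

The std column is the fluctuation of the estimate from dataset to dataset. You can swap in a different model or change N to see how the comparison moves. Note also the cost: leave-one-out trains N times per dataset, 10-fold only 10 times.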
