Chosen #features for CV
On page 152 (4.16b) E_cv is minimized at #features = 5 and 7. You say "the cross validation error is minimized between 5--7 feature dimensions; we take 6 feature dimensions..."
Why 6 and not 5, especially given the discussion about Occam's Razor that follows? 5 features would have the same E_in, a lower E_cv (and a lower E_out, though you could not know that in practice), and would be the simpler model. I know it would make little practical difference, but as a general principle, shouldn't Occam's Razor be favored?
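
To make the rule I have in mind concrete, here is a minimal sketch in Python. The E_cv values are made up for illustration (they are not read off the plot on page 152), and the function name `simplest_minimizer` and the `tol` parameter are my own hypothetical choices, not anything from the book. The idea is simply: among all feature counts whose E_cv is within `tol` of the minimum, pick the smallest.

```python
def simplest_minimizer(e_cv, tol=0.0):
    """Return the smallest feature count whose CV error is within tol of the minimum.

    e_cv maps number of features -> cross validation error.
    tol = 0.0 means only exact ties with the minimum are considered.
    """
    best = min(e_cv.values())
    candidates = [d for d, err in e_cv.items() if err <= best + tol]
    return min(candidates)  # Occam's Razor as a tie-break: fewest features wins


# Hypothetical CV errors, chosen only so that 5 and 7 tie for the minimum.
e_cv = {3: 0.040, 4: 0.034, 5: 0.030, 6: 0.031, 7: 0.030, 8: 0.033}
chosen = simplest_minimizer(e_cv)
print(chosen)
```

With a tie between 5 and 7, this rule returns 5 rather than the midpoint 6, which is the choice I would have expected.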