06-05-2012, 04:28 AM
magdon
Join Date: Aug 2009
Location: Troy, NY, USA.
Posts: 597
Re: Cross validation and data snooping

Yes. Otherwise you have data snooped.

Originally Posted by marcello
To avoid data snooping, should we leave out the cross validation subset when we normalize the data?
I guess the CV estimate would be affected the same way the test data is, right?

So for 10-fold CV, would it be better to compute the scaling from the 9/10 used for training, and then apply that same scaling to the 1/10 left out for CV?
Would the results be comparable in that case, with 10 different scalings for the 10 different splits?
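
The per-fold scaling described in the question can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names (fit_scaler, apply_scaler, kfold_indices, scaled_folds) are hypothetical, and standardizing to zero mean and unit variance is assumed as the normalization. The point is that the mean and standard deviation are computed from the training folds alone and then reused unchanged on the held-out fold.

```python
import random
import statistics


def fit_scaler(rows):
    """Return per-feature (means, stds) computed from the given rows only."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize rows with statistics that were fit elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def scaled_folds(X, k=10, seed=0):
    """Yield (train_scaled, val_scaled, val_idx); scaling is fit on train only."""
    folds = kfold_indices(len(X), k, seed)
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        train = [X[j] for j in train_idx]
        val = [X[j] for j in val_idx]
        means, stds = fit_scaler(train)  # the held-out fold is never seen here
        yield apply_scaler(train, means, stds), apply_scaler(val, means, stds), val_idx


if __name__ == "__main__":
    # Toy data (an assumption for the demo): 50 points with 2 features.
    X = [[float(i), float(i * i)] for i in range(50)]
    for train_s, val_s, val_idx in scaled_folds(X, k=10):
        print(len(train_s), len(val_s))
```

One way to read the comparability question under this setup: the scaling step is treated as part of the learning procedure, so every fold evaluates the same procedure (scale, then fit) even though the fitted scaling constants differ from fold to fold.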
Have faith in probability