  #1  
06-05-2012, 03:47 AM
marcello
Cross validation and data snooping

To avoid data snooping, should we leave out the cross-validation subset when we normalize the data?
I guess the CV subset would be affected the same way test data is, right?

So for 10-fold CV, would it be better to compute the scaling from the 9/10 used for training, and then apply that same scaling to the 1/10 left out for validation?
Would the results still be comparable in that case, given 10 different scalings for the 10 different splits?
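
To make the proposal concrete, here is a minimal sketch of the per-fold scaling I have in mind. It assumes a single feature, z-score normalization (mean and standard deviation), and made-up toy data; the helper names (`kfold_indices`, `fit_scaler`, `scaled_folds`) are hypothetical and only for illustration.

```python
import random
import statistics


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_scaler(values):
    """Compute mean and standard deviation from the given values only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against zero spread
    return mean, std


def scale(values, mean, std):
    """Apply a previously fitted scaling to any list of values."""
    return [(v - mean) / std for v in values]


def scaled_folds(x, k=10, seed=0):
    """For each fold, fit the scaling on the k-1 training folds only,
    then apply that same scaling to the held-out fold."""
    folds = kfold_indices(len(x), k, seed)
    out = []
    for i, val_idx in enumerate(folds):
        train = [x[j] for f, fold in enumerate(folds) if f != i for j in fold]
        val = [x[j] for j in val_idx]
        mean, std = fit_scaler(train)  # held-out fold is never examined here
        out.append((scale(train, mean, std), scale(val, mean, std), (mean, std)))
    return out


# Toy single-feature data, made up for illustration.
rng = random.Random(1)
data = [rng.gauss(5.0, 2.0) for _ in range(100)]
results = scaled_folds(data, k=10)
```

Each of the 10 entries in `results` carries its own `(mean, std)` pair, which is exactly the "10 different scalings" situation I am asking about.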
Reply With Quote