
06-05-2012, 04:28 AM
RPI
Re: Cross validation and data snooping
Yes. Otherwise you have data snooped, because the held-out fold would have influenced the scaling.
Quote:
Originally Posted by marcello
To avoid data snooping, should we leave the cross-validation subset out when we normalize the data?
I guess CV would be affected the same way test data is, right?
So for 10-fold CV, would it be better to compute the scaling from the 9/10 used for training, and then use the same scaling for the 1/10 left out for validation?
Would the results be comparable in that case, with 10 different scalings for the 10 different splits?
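For what it's worth, here is a minimal sketch of the per-fold scaling described in the question, in plain Python. The function names (fit_scaler, apply_scaler, kfold_splits, scaled_folds) are made up for illustration, and standardizing to zero mean and unit variance is only one assumed choice of normalization. The point is that the mean and standard deviation are computed from the training part of each split and then reused, unchanged, on the part left out.

```python
import statistics


def fit_scaler(rows):
    """Compute per-column mean and standard deviation from the given rows only."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; use 1.0 for constant columns to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_scaler(rows, means, stds):
    """Scale rows with statistics that were fitted elsewhere (no refitting here)."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kfold_splits(n, k):
    """Yield (train_idx, val_idx) for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size


def scaled_folds(X, k):
    """For each fold, fit the scaling on the training rows and apply it to both parts."""
    for train_idx, val_idx in kfold_splits(len(X), k):
        train = [X[i] for i in train_idx]
        val = [X[i] for i in val_idx]
        means, stds = fit_scaler(train)  # the validation rows are never seen here
        yield apply_scaler(train, means, stds), apply_scaler(val, means, stds), (means, stds)


if __name__ == "__main__":
    # Toy data, purely illustrative.
    X = [[float(i), 2.0 * i] for i in range(10)]
    for train, val, (means, stds) in scaled_folds(X, 5):
        print(len(train), len(val), means)
```

Each split ends up with its own scaling. That is expected: the scaling is treated as part of the learning procedure, so what gets validated on each fold is the whole pipeline (scaling plus training), not just the final fit.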
__________________
Have faith in probability