06-05-2012, 08:51 AM
marcello
Join Date: Apr 2012
Posts: 35
Re: Cross validation and data snooping

Thanks for the answers.

So suppose you are dealing with a classification problem and you're planning to use an SVM with RBF kernels: would your best option be not to normalize the data at all?

If I got it right, when you do normalize the data (just the X, of course), you'd better scale every "fold" separately, computing the scaling without the CV set, but then you are somehow changing the problem from fold to fold.
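To make sure I understand the per-fold scaling, here is a minimal sketch of what I think it means: the mean and standard deviation come from the training part of each fold only, and the held-out part is transformed with those same numbers. The function names (`fit_scaler`, `apply_scaler`, `scaled_folds`) and the round-robin fold assignment are my own assumptions for illustration, not something from the thread.

```python
from statistics import mean, pstdev


def fit_scaler(rows):
    """Compute per-feature mean and std from the given (training) rows only."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # A constant feature has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize rows with statistics that were computed elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def scaled_folds(X, k):
    """Yield (train_idx, val_idx, X_train_scaled, X_val_scaled) for k folds.

    Folds are assigned round-robin (index i goes to fold i % k); this is an
    assumption made to keep the sketch short and deterministic.
    """
    n = len(X)
    for fold in range(k):
        val_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # The held-out rows never contribute to the scaling statistics.
        means, stds = fit_scaler([X[i] for i in train_idx])
        X_train = apply_scaler([X[i] for i in train_idx], means, stds)
        X_val = apply_scaler([X[i] for i in val_idx], means, stds)
        yield train_idx, val_idx, X_train, X_val
```

Each fold therefore sees a slightly different transformation, which is what I mean by "changing the problem".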

Another question comes to mind: could aggregation be used to overcome this problem? Maybe, to avoid being excessively misled, you could use an SVM with scaled data, an SVM with the original input, and another classifier, and choose the answer that gets the most votes?
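The voting I have in mind would look roughly like the sketch below. It only combines label lists that the three classifiers have already produced; `majority_vote` is a hypothetical helper name, and I am assuming an odd number of voters with two classes so that ties cannot occur. Whether this actually helps with the snooping issue is exactly my question.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    predictions: a list with one label list per classifier, all of the same
    length. With an odd number of classifiers and two classes there are no
    ties; otherwise the label seen first among the tied ones wins.
    """
    voted = []
    for labels in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted
```

For example, if the SVM on scaled data says +1, the SVM on raw data says -1 and the third classifier says -1 for some point, the aggregated answer for that point is -1.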

Thanks again