03-04-2013, 07:42 AM
alternate
Join Date: Jan 2013
Posts: 14
Re: How many support vectors is too many?

Mentioning the VC dimension brings up something I considered briefly.

It's been said that when we make decisions after seeing the data, we should account for every option we considered when we think about generalization. The extreme case is data snooping; a lesser case is cross-validation, which adds a little contamination that we should also account for.

But what about, say, a "failed" SVM? For example, we try the SVM hypothesis and get back 500 support vectors out of 1000, then decide to change the model because the first one won't generalize.
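To put a number on why 500 out of 1000 looks bad, here is a minimal sketch. It assumes the usual leave-one-out style bound for SVMs, where the expected out-of-sample error is at most the expected number of support vectors divided by N - 1. The function name `sv_bound` is made up for illustration, and plugging in a single observed count is only a rough stand-in for the expectation.

```python
def sv_bound(num_support_vectors: int, num_examples: int) -> float:
    """Rough bound on expected out-of-sample error: #SV / (N - 1).

    Assumption: the leave-one-out style SVM bound applies, and the
    observed support vector count stands in for its expected value.
    """
    if num_examples < 2:
        raise ValueError("need at least 2 training examples")
    if not 0 <= num_support_vectors <= num_examples:
        raise ValueError("support vector count must be between 0 and N")
    # An error rate cannot exceed 1, so cap the bound there.
    return min(1.0, num_support_vectors / (num_examples - 1))


# The case from this post: 500 support vectors out of 1000 points.
print(sv_bound(500, 1000))
```

For the 500-of-1000 case the bound comes out at roughly one half, which for binary classification is no better than what a coin flip would guarantee.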

Realistically, if I then switch to a different kernel, a neural network, or something else, that model doesn't care whether it was run before or after another one; it will produce the same result. But I could also see the interpretation where the SVM counts as a space I explored, in much the same way as tweaking parameters based on the data. To what degree is that the case? Presumably there is a tradeoff between accepting the weak model and accepting weaker generalization (if any) from the switch, and I guess that tradeoff could be automated too.
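One way to get a feel for the size of that cost is a sketch under assumptions that go beyond my actual scenario: suppose the final pick among M candidate models were made on a held-out validation set of K points, and we use Hoeffding's inequality with a union bound over the M candidates. The name `hoeffding_margin` and the values of M, K, and delta below are hypothetical, chosen only for illustration.

```python
import math


def hoeffding_margin(num_models: int, num_validation: int,
                     delta: float = 0.05) -> float:
    """Error-bar width eps such that, with probability >= 1 - delta,
    |E_val - E_out| <= eps holds for all candidate models at once.

    Assumption: a Hoeffding bound with a union bound over the models,
    P[|E_val - E_out| > eps] <= 2 * M * exp(-2 * eps**2 * K).
    """
    if num_models < 1 or num_validation < 1:
        raise ValueError("need at least one model and one validation point")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(math.log(2 * num_models / delta) / (2 * num_validation))


# Compare counting only the final model (M=1) against also counting
# the discarded SVM (M=2), and against a longer search (M=10).
for m in (1, 2, 10):
    print(m, round(hoeffding_margin(m, num_validation=1000), 4))
```

Under these assumptions the margin grows only with the square root of log M, so counting one discarded model widens the error bar slightly. Whether rejecting a model based on its support vector count should be charged this way at all is exactly the part I'm unsure about.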