Re: Support Vector Machines, Kernel Functions, Data Snooping
If you're sticking with the same learning algorithm, I think you can account for the amount of snooping you're doing by expanding your hypothesis set. For example, take the case from class where the data was not linearly separable because the target function was a circle. We could start with H1, so we have weights on x0 (the constant 1), x1, and x2. If we see poor performance (high Ein or high Eval), we can then go to H2, so now we have:
phi = { 1, x1, x2, x1^2, x1*x2, x2^2 }
Since H2 includes H1, the degrees of freedom we already tried are counted in the VC dimension of H2, so we should be OK. Where we would get into trouble with snooping is realizing H1 didn't work and then going to this:
phi = {1, x1^2, x1*x2, x2^2}
and not charging ourselves for the x1 and x2 terms, even though we already tested them in a previous run.
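
In case it helps to see the distinction concretely, here is a minimal sketch (my own illustration, not from the lecture; it assumes numpy and scikit-learn's Perceptron, and uses a made-up circular target) of the three feature transforms. The point is that phi_H2 contains every feature of phi_H1, while phi_snooped drops x1 and x2 only after we have already searched over them, so the hypothesis set we actually explored is bigger than phi_snooped alone suggests.

import numpy as np
from sklearn.linear_model import Perceptron

def phi_H1(X):
    # H1: the constant 1, x1, x2
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2])

def phi_H2(X):
    # H2: full second-order transform; every feature of H1 is still here,
    # so the "charge" for having tried H1 is covered by H2's VC dimension
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1*x2, x2**2])

def phi_snooped(X):
    # The trap: dropping x1 and x2 after already having tested them in H1.
    # The set we effectively searched is larger than this transform implies,
    # so a bound based on this smaller set alone is too optimistic.
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1**2, x1*x2, x2**2])

# toy data with a circular target: f(x) = sign(x1^2 + x2^2 - 0.6)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sign(X[:, 0]**2 + X[:, 1]**2 - 0.6)

for name, phi in [("H1", phi_H1), ("H2", phi_H2)]:
    # fit_intercept=False because the constant 1 is already in the transform
    clf = Perceptron(max_iter=1000, fit_intercept=False).fit(phi(X), y)
    print(name, "Ein =", 1 - clf.score(phi(X), y))

Running something like this, H1 should fit the circular target poorly and H2 should fit it well, which is exactly the kind of comparison that tempts us into the snooped transform above.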
