Re: Which kernel to use?

The caveat is that considering additional kernels increases the complexity of \mathcal H and thus requires larger data sets to mitigate the risk of overfitting.
I suspect that selecting a kernel without snooping in the data is more art than science, but it may be guided by one's understanding (read: intuition) of the expected characteristics of the data.
So, one strategy could be to think in terms of a suitable nonlinear transformation (one that would match the data) and then find a kernel corresponding to that transformation. One of the great benefits of SVMs is that you never visit the feature space; you just exploit it via the kernel (the kernel trick).
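
To make the kernel trick concrete, here is a minimal sketch in Python with NumPy. It assumes a 2-D input and a second-order polynomial kernel, chosen purely for illustration. The names phi and poly_kernel are hypothetical helpers, not from any library. The point is that the kernel value equals the inner product in the transformed space, even though the kernel never builds the transformed vectors.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the 2nd-order polynomial kernel on 2-D inputs
    (illustrative assumption: this is the transform we 'had in mind')."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1**2, x2**2, s * x1 * x2])

def poly_kernel(x, y):
    """Kernel matching phi: K(x, y) = (1 + x.y)^2, computed in the input space."""
    return (1.0 + np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(y))  # inner product in the 6-D feature space
implicit = poly_kernel(x, y)       # same quantity without ever forming phi

print(explicit, implicit)
```

The two numbers agree up to floating-point rounding. For higher orders or higher input dimensions the explicit map grows quickly, while the kernel evaluation stays a single dot product plus a power.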
