Interesting problem. First, as you anticipated, just specifying the input as lagging returns would not determine the VC dimension. If you further specify that you are doing linear classification, for example, then the VC dimension is d+1, where d is the number of lagging returns in the input. If you use another model, the VC dimension may be different.
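To make the linear-classification case concrete, here is a small sketch (names like `perceptron_fits` are my own, not from the lecture). In d = 2 dimensions, a perceptron can realize every dichotomy of 3 points in general position (so it shatters d+1 = 3 points), while the XOR labeling of 4 points is not linearly separable, illustrating why those 4 points cannot be shattered:

```python
import itertools

def perceptron_fits(points, labels, max_epochs=1000):
    """Try to find (w, b) with sign(w.x + b) == label for every point."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), y in zip(points, labels):
            if (w[0] * x1 + w[1] * x2 + b) * y <= 0:
                w[0] += y * x1; w[1] += y * x2; b += y
                mistakes += 1
        if mistakes == 0:   # perceptron converges iff the data is separable
            return True
    return False

# d = 2: three non-collinear points can be shattered (VC dimension = d+1 = 3).
tri = [(0, 0), (1, 0), (0, 1)]
assert all(perceptron_fits(tri, lab)
           for lab in itertools.product([-1, 1], repeat=3))

# The XOR labeling of four points is not linearly separable,
# so this set of 4 points cannot be shattered.
xor_pts = [(0, 0), (1, 1), (0, 1), (1, 0)]
assert not perceptron_fits(xor_pts, [1, 1, -1, -1])
```

(Showing that no set of 4 points can be shattered in 2D takes a general argument, e.g. Radon's theorem; the code only checks one configuration.)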

Now with the two models (both are similarity-based models), I assume that the forecast is binary. You have grouped the inputs into 32 categories, with all inputs in the same category necessarily mapping to the same label. If you have 33 input vectors, two of them must have the same sign pattern (by the pigeonhole principle) and therefore necessarily map to the same label, so no set of 33 points can be shattered; 33 is a break point and indeed the VC dimension is 32. In the RBF case, if the clustering is done in an unsupervised way, then the VC dimension would be the number of parameters in the second layer, which is also 32 in this case by choice.
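The pigeonhole step is easy to check numerically. This sketch assumes the 32 categories are the sign patterns of 5 lagged returns (2^5 = 32, consistent with the count above); `sign_pattern` is an illustrative name of my own:

```python
import random

def sign_pattern(x):
    """Map an input vector of lagged returns to its category: the sign pattern."""
    return tuple(1 if v >= 0 else -1 for v in x)

random.seed(0)
# 5 lagged returns -> 2**5 = 32 possible sign patterns (categories).
inputs = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(33)]
patterns = [sign_pattern(x) for x in inputs]

# Pigeonhole: 33 vectors into 32 categories means at least two collide,
# so those two must get the same label and the dichotomy separating
# them cannot be realized -- 33 points can never be shattered.
assert len(set(patterns)) < 33
```

This holds for any 33 inputs, not just random ones, which is exactly why 33 is a break point.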

The equality of the number of centers and the number of support vectors in Lecture 16 was an assumption made for the sake of comparison; they need not be equal. The number of support vectors comes out of the process of solving the SVM kernel problem, whereas the number of clusters is a parameter under our control that we decide on before running Lloyd's algorithm.
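The distinction shows up directly in the code: in Lloyd's algorithm, k is an input we pick before the run, not something the data decides. A minimal sketch (the function name `lloyds` and the toy data are my own):

```python
import random

def lloyds(points, k, iters=50, seed=0):
    """Lloyd's algorithm: k (the number of centers) is fixed before running."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster.
        for j, cl in enumerate(clusters):
            if cl:
                centers[j] = tuple(sum(v) / len(cl) for v in zip(*cl))
    return centers

random.seed(1)
data = [(random.gauss(cx, 0.1), random.gauss(cy, 0.1))
        for cx, cy in [(0, 0), (3, 0), (0, 3), (3, 3)] for _ in range(25)]
centers = lloyds(data, k=4)   # k chosen up front
assert len(centers) == 4
```

By contrast, the support-vector count emerges from the SVM dual solution (the number of nonzero alpha multipliers) and is only known after the optimization finishes.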