I have some unclear points about Exercise 5.4. The exercise itself states:
Quote:
By looking at the data, it appears that the data is linearly separable

Here I interpret "looking at the data" as "looking at the whole data set, including both the training set and the test set". Is my interpretation correct? If so, then we cannot determine the effective dvc for either set, because it is hard to know how many hypotheses we have implicitly examined in reaching the conclusion that "the data is linearly separable". Right?
The exercise also states:
Quote:
We now wish to make some generalization conclusions, so we look up the dvc for our learning model and see that it is d+1. Therefore, we use this value of dvc to get a bound on the test error.

However, this statement confuses me. Assuming that we have never snooped on the data set, dvc = d + 1 should not be applied in the VC bound on the test error, because the hypothesis set we bring to the test set contains only a single hypothesis, namely the final hypothesis produced by running the learning algorithm on the training set. Right?
Or is this also a point that I need to include in my answer to the exercise?
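To make the contrast concrete, here is a small sketch comparing the Hoeffding bound for a single fixed hypothesis (the situation on a clean test set) with the VC bound using dvc = d + 1 (the situation on the training set). The values of N, d, and delta below are made up for illustration; they are not from the exercise.

```python
from math import log, sqrt

# Illustrative (assumed) values, not taken from the exercise:
N = 1000      # number of data points
d = 10        # input dimension, so dvc = d + 1 for the perceptron model
delta = 0.05  # confidence parameter

# Hoeffding bound for a single, fixed hypothesis evaluated on the test set:
#   E_out <= E_test + sqrt(ln(2/delta) / (2N))
hoeffding_term = sqrt(log(2 / delta) / (2 * N))

# VC bound using dvc = d + 1, with the growth function bounded by N^dvc + 1:
#   E_out <= E_in + sqrt((8/N) * ln(4 * ((2N)^dvc + 1) / delta))
dvc = d + 1
vc_term = sqrt((8 / N) * log(4 * ((2 * N) ** dvc + 1) / delta))

print(f"Hoeffding term (single hypothesis): {hoeffding_term:.3f}")
print(f"VC term (dvc = {dvc}): {vc_term:.3f}")
```

With these numbers the single-hypothesis term comes out far smaller than the VC term, which is why it matters which bound applies to the test error.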
If that is the case, then there is a further difficulty, because the exercise also asks:
Quote:
(b) Do we know the dvc for the learning model that we actually used? It is this dvc that we need to use in the bound.

Then which bound is referred to here: the bound on the training set, or the bound on the test set? If it is the bound on the test set, then we should not use the dvc from the training-set bound, right?
Thank you very much in advance.