Quote:
Originally Posted by Andrs
Thanks for the quick answer.
I would like to check that I really understood your recommendation: I will be consuming all my training data with the cross validation procedure. Through CV I select the model and the hypothesis (g) with the corresponding parameters, and I get Ecv, which is a good estimate of Eout.
Your suggestion is that I could use this model (hypothesis set) and (re)train it on the full training data in order to select a new hypothesis (g+). This new hypothesis (g+) may do better than the hypothesis (g), but the only safe estimate for Eout is the one I got through cross validation (Ecv). The only "problem" here is that now I do not have any data to "test" this new hypothesis (g+).

The hypothesis trained on the full data set, denoted by $g$, which you refer to as g+, is indeed the result of this process. To estimate its $E_{\text{out}}$, we still use the cross validation estimate for $g^-$, notwithstanding the fact that it is a different hypothesis (but close enough), for the reason you outline: we have no cross validation data points left to evaluate $g$ directly.
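
For concreteness, here is a minimal sketch of that procedure in Python with NumPy. It is not taken from the thread: the ridge regression model, the synthetic data, the candidate values in `lams`, and the helper names `fit_ridge`, `mse`, and `cv_error` are all assumptions chosen for illustration. It selects a model by cross validation, retrains on the full data set to get the final hypothesis, and reports the cross validation error as the estimate of its out-of-sample error.

```python
import numpy as np

def fit_ridge(X, y, lam):
    # Closed-form ridge regression weights (illustrative model choice).
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def mse(w, X, y):
    # Mean squared error of the linear hypothesis w on (X, y).
    return float(np.mean((X @ w - y) ** 2))

def cv_error(X, y, lam, k=5):
    # k-fold cross validation: each fold is held out once, a hypothesis
    # is trained on the remaining points, and its error is measured on
    # the held-out fold. The fold errors are then averaged.
    folds = np.array_split(np.arange(len(y)), k)
    errs = []
    for val_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[val_idx] = False
        w_minus = fit_ridge(X[train_mask], y[train_mask], lam)
        errs.append(mse(w_minus, X[val_idx], y[val_idx]))
    return float(np.mean(errs))

# Synthetic data, assumed purely for the example.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)

# Step 1: model selection. Every data point takes part in cross validation.
lams = [0.01, 0.1, 1.0, 10.0]
cv_errors = {lam: cv_error(X, y, lam) for lam in lams}
best_lam = min(cv_errors, key=cv_errors.get)

# Step 2: the cross validation error of the selected model is the
# estimate we keep for the out-of-sample error.
e_cv = cv_errors[best_lam]

# Step 3: retrain the selected model on the full data set. No points are
# left to evaluate this final hypothesis directly, so e_cv stands in.
g = fit_ridge(X, y, best_lam)

print("selected lambda:", best_lam)
print("E_cv, used as the estimate of E_out for g:", e_cv)
```

Note that `g` is never evaluated on held-out data here. The number reported for it is the error averaged over the hypotheses trained on the reduced folds, which is the point made above.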