Quote:
Originally Posted by Andrs
Thanks for the quick answer.
I would like to check that I really understood your recommendation: I will be consuming all my training data with the cross validation procedure. Through CV I select the model and the hypothesis (g-) with the corresponding parameters, and I get Ecv, which is a good estimate of Eout.
Your suggestion is that I could take this model (hypothesis set) and (re)train it on the full training data in order to select a new hypothesis (g+). This new hypothesis (g+) may do better than the hypothesis (g-), but the safest estimate for Eout is still the estimate that I got through the cross validation (Ecv). The only "problem" here is that now I do not have any data to "test" this new hypothesis (g+).
The hypothesis trained on the full data set, denoted by $g$, which you refer to as $g^+$, is indeed the result of this process. To estimate its $E_{\text{out}}$, we still use the cross validation estimate for $E_{\text{out}}$, notwithstanding the fact that it is a different hypothesis (but close enough), for the reason you outline; we have no cross validation data points left to evaluate $E_{\text{out}}(g)$ directly.
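
In case it helps to see the procedure spelled out, here is a minimal sketch in Python using scikit-learn. The data set, model, and number of folds are placeholder choices for illustration only, not anything prescribed above:

[code]
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# The full training set -- all of it is consumed by cross validation.
X, y = load_breast_cancer(return_X_y=True)

model = LogisticRegression(max_iter=5000)

# Cross validation: each fold trains a hypothesis on part of the data
# (the role of g- above) and validates it on the held-out part.
# The average validation error is Ecv, the estimate of Eout.
fold_errors = 1.0 - cross_val_score(model, X, y, cv=10, scoring="accuracy")
E_cv = fold_errors.mean()
print(f"Ecv (quoted as the estimate of Eout): {E_cv:.4f}")

# Final step: retrain on the FULL training set to obtain g (the g+ in the
# question).  No data is left over to test it, so the Ecv computed above
# is the estimate of Eout that we report for g.
g = model.fit(X, y)
[/code]

The point of the last line is exactly the situation described: $g$ is trained on every available example, so the only estimate of its $E_{\text{out}}$ we can quote is the $E_{\text{cv}}$ computed before that final refit.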