Training vs. testing set size: rules of thumb
I have read comments elsewhere such as "for learning it is best to use say 40% of your whole dataset for training, 30% for validation and say 30% for testing". Given that cross-validation with a leave-one-out or leave-many-out technique is also available, is there a rule of thumb for the relative sizes of the training and test sets?
Would I be correct in answering as follows: the test set should be larger than the minimum indicated by VC theory, which is effectively 10 times the degrees of freedom (using the other rule of thumb)? Beyond that minimum, is it simply trial and error how much of the remaining data to apportion to the test set?
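To make the schemes I am comparing concrete, here is a minimal Python sketch, assuming a dataset addressed by integer indices. The function names (`split_indices`, `leave_one_out`, `min_samples_rule_of_thumb`) are my own illustrative choices, not from any library, and the factor of 10 is just the rule of thumb quoted above, not an established bound.

```python
import random


def split_indices(n, train_frac=0.4, val_frac=0.3, seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    The default 40/30/30 proportions are the ones quoted in the question;
    whatever is left after train and validation goes to the test set.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test


def leave_one_out(n):
    """Yield (train_indices, held_out_index) pairs, one pair per sample."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i


def min_samples_rule_of_thumb(degrees_of_freedom, factor=10):
    """Assumed rule of thumb: at least `factor` samples per degree of freedom."""
    return factor * degrees_of_freedom


if __name__ == "__main__":
    train, val, test = split_indices(100)
    print(len(train), len(val), len(test))  # 40 30 30
    print(sum(1 for _ in leave_one_out(5)))  # 5 folds, one per sample
    print(min_samples_rule_of_thumb(7))      # 70
```

With a fixed split, every sample is used for exactly one purpose; with leave-one-out, every sample is held out once and used for training in all the other folds, which is why I am unsure how the fixed proportions carry over.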