Re: Training v Testing set size rules of thumb...

If I have understood correctly, once you start using cross-validation you only need to partition your data into two sets (as opposed to the training/validation/testing approach, where you partition your data into three). One set is used for both training and cross-validation, the other for testing. The "test" set is the one you lock away and don't look at until you have decided on the best hypothesis, i.e., it is there to show how well the model generalises to independent data. I was wondering what percentage you should allocate to each of these two sets?
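For concreteness, the two-set arrangement described above could be sketched roughly as follows in Python. This is only an illustration of the idea: the function name `holdout_then_kfold` is made up, and the 20% test fraction and 5 folds are arbitrary placeholders, not recommended values.

```python
import random

def holdout_then_kfold(n_samples, test_fraction=0.2, k=5, seed=0):
    """Split sample indices into a locked-away test set plus k CV folds.

    test_fraction and k are placeholder assumptions, not recommendations.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Set 1: the test set, put aside until the final hypothesis is chosen.
    n_test = int(round(n_samples * test_fraction))
    test_idx = indices[:n_test]

    # Set 2: everything else, used for both training and cross-validation.
    dev_idx = indices[n_test:]

    folds = []
    for i in range(k):
        # Every k-th index (starting at i) is the validation part of fold i.
        val_idx = dev_idx[i::k]
        val_set = set(val_idx)
        train_idx = [j for j in dev_idx if j not in val_set]
        folds.append((train_idx, val_idx))
    return test_idx, folds

test_idx, folds = holdout_then_kfold(100)
```

Each `(train_idx, val_idx)` pair in `folds` comes only from the second set, so the indices in `test_idx` are never touched during model selection.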

When you wrote:
For single-shot validation, I've seen 5% up to 40% reserved for validation

I assume that by "validation" set you mean the "test" set, since cross-validation is already in place?
