Re: Machine Learning and census models
How about this approach:
1. Don't look at the data! If you have already looked at the data, find a machine learning expert who has not and ask them to do the modelling for you. [Unless you have some method of forgetting what you have seen, that is.]
2. Pick a learning method suited to the size of the data set and use leave-one-out cross-validation to find the optimal hypothesis.
The interesting question is which learning method to use in step 2. Something pretty general and regularized would be my guess.
3. Bear in mind that extrapolating a non-stationary process is not necessarily possible (cross-validation has an easy time of it, because most of the held-out data points are interior ones, so it is mostly testing interpolation).
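
To make steps 2 and 3 concrete, here is a rough sketch, not a recommendation for the actual census data. It assumes a Gaussian kernel smoother as a stand-in for "something general and regularized" (the bandwidth plays the role of the regularization knob), picks the bandwidth by leave-one-out cross-validation, and then compares the cross-validation error with the error at a point outside the data range. The series, the candidate bandwidths, and all function names are made up for illustration.

```python
import math

def kernel_predict(xs, ys, x0, bandwidth):
    """Gaussian-weighted average of ys, centred on x0 (Nadaraya-Watson)."""
    d2 = [((x - x0) / bandwidth) ** 2 for x in xs]
    # Shift by the smallest distance so the nearest point always gets
    # weight 1 and the weights cannot all underflow to zero.
    dmin = min(d2)
    ws = [math.exp(-0.5 * (d - dmin)) for d in d2]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)

def loo_mse(xs, ys, bandwidth):
    """Leave-one-out mean squared error for a single bandwidth."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        pred = kernel_predict(rest_x, rest_y, xs[i], bandwidth)
        total += (pred - ys[i]) ** 2
    return total / len(xs)

def select_bandwidth(xs, ys, candidates):
    """Step 2: pick the hypothesis with the lowest leave-one-out error."""
    return min(candidates, key=lambda h: loo_mse(xs, ys, h))

# Hypothetical non-stationary series: steady 5% growth per step, no noise.
xs = [float(t) for t in range(20)]
ys = [1.05 ** t for t in xs]

candidates = [0.5, 1.0, 2.0, 4.0]  # arbitrary illustrative grid
best_h = select_bandwidth(xs, ys, candidates)
cv_error = loo_mse(xs, ys, best_h)

# Step 3: evaluate well outside the observed range.
x_future = 25.0
true_future = 1.05 ** x_future
extrap_error = (kernel_predict(xs, ys, x_future, best_h) - true_future) ** 2

print("selected bandwidth:", best_h)
print("leave-one-out MSE:", cv_error)
print("squared error at x = 25:", extrap_error)
```

On this toy series the leave-one-out error comes out far smaller than the error at x = 25, because a weighted average can never predict a value above the largest one it has seen. The cross-validation score says nothing about that failure, since almost every held-out point has neighbours on both sides.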