Data snooping and non-linear transformation
What is the best way to apply a non-linear transformation while avoiding data snooping? The circle example in lecture 18 was a clear case of data snooping. In practice, what information can we safely use to identify candidate non-linear transforms? Should we just try a few (e.g. second-order and third-order polynomials) and pick the best? Of course, this would not be done sequentially, using the results from one transform to influence the choice of the next candidate, since that would be the data affecting the learning process.
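To make the "try a few and pick the best" idea concrete, here is a minimal sketch of one way it might be done: the candidate set (polynomial orders 1, 2, 3) is fixed before looking at the data, and the choice among them is made on a held-out validation set. Everything here is an assumption for illustration, not something from the lecture: the function names (`poly_features`, `select_degree`), the synthetic circular target, the validation size of 100, and the use of least-squares regression as the linear classifier.

```python
import numpy as np

def poly_features(X, degree):
    """Map 2-D points (x1, x2) to all monomials x1^i * x2^j with i + j <= degree."""
    x1, x2 = X[:, 0], X[:, 1]
    cols = [x1**i * x2**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)

def fit_linear(Z, y):
    """Least-squares weights in the transformed space (linear regression for classification)."""
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w

def class_error(Z, y, w):
    """Fraction of points whose predicted sign disagrees with the label."""
    return float(np.mean(np.sign(Z @ w) != y))

def select_degree(X, y, degrees=(1, 2, 3), n_val=100, seed=0):
    """Pick a degree from a candidate list that was fixed in advance.

    The validation points are never used for fitting, only for comparing
    the candidates, so the comparison is over a finite, pre-declared set.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    val, train = idx[:n_val], idx[n_val:]
    errors = {}
    for d in degrees:
        w = fit_linear(poly_features(X[train], d), y[train])
        errors[d] = class_error(poly_features(X[val], d), y[val], w)
    best = min(errors, key=errors.get)
    return best, errors

# Hypothetical data: a circular target, chosen only to have something to run on.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(500, 2))
y = np.sign(X[:, 0]**2 + X[:, 1]**2 - 0.6)

best, errors = select_degree(X, y)
print("validation errors by degree:", errors)
print("selected degree:", best)
```

In this sketch the validation error of the selected degree is still slightly optimistic, because it was used to choose among three candidates; a separate test set that plays no role in the selection would be needed for an unbiased estimate.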