You are right, scaling can be any transformation. If you used some transformation to learn on the training data, you must use the
same transformation when you test. Here is a simple idealized setting, first with simple scaling and then with your log transform. Suppose the problem is 1-dimensional regression:
x: 2,4,6.
y: 6,12,18.
xtest=8
ytest=24
It is easy to see the relationship is y=3x. We can successfully learn this from the training data. Now suppose we rescaled the x data by 0.5 in the training:
x=1,2,3
y=6,12,18
What is the relationship you would learn:
y=6x
Now try to apply this to the test data:
6 xtest = 48, which is not ytest = 24
It fails because you did not rescale the test data in exactly the way you did the training data. If you also rescale the test datum, then xtest becomes xtest'=4 and indeed the function you learned works: ytest=6 xtest'.
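The scaling example can be checked numerically. The following is a minimal sketch, assuming a least-squares fit of a line through the origin; the helper name `fit_slope` and the variable names are illustrative and not from the original post.

```python
def fit_slope(xs, ys):
    """Least-squares slope w for a line through the origin, y = w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

x_train = [2.0, 4.0, 6.0]
y_train = [6.0, 12.0, 18.0]
x_test, y_test = 8.0, 24.0

scale = 0.5  # the rescaling applied to the training inputs

# Learn on the rescaled training inputs 1, 2, 3.
w = fit_slope([scale * x for x in x_train], y_train)

# Applying the learned hypothesis to the raw test point gives the wrong answer.
pred_raw = w * x_test

# Applying the same rescaling to the test point first recovers ytest.
pred_rescaled = w * (scale * x_test)

print(w, pred_raw, pred_rescaled)
```

The learned slope is 6, so the raw test point is predicted as 48 while the rescaled test point is predicted as 24.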
Let's see what happens with the log transform: the "rescaled", i.e. transformed, x data become:
x=log2,log4,log6
y=6,12,18
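What is the relationship you would learn (taking log to be the natural log):
y=3exp(x)
If you simply apply this to the test point it will fail:
3exp(8) is roughly 8943, which is not ytest = 24
You must first transform the test point to xtest'=log8. Now it is indeed the case that your learned function will work:
ytest=3exp(xtest')=3exp(log8)=24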
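The log-transform case can be sketched the same way. This assumes the natural log and a hypothesis of the form y = a * exp(x'), with the single coefficient `a` fitted by least squares; the function and variable names are illustrative assumptions.

```python
import math

def fit_coefficient(features, ys):
    """Least-squares coefficient a for the model y = a * feature."""
    return sum(f * y for f, y in zip(features, ys)) / sum(f * f for f in features)

x_train = [2.0, 4.0, 6.0]
y_train = [6.0, 12.0, 18.0]
x_test, y_test = 8.0, 24.0

# Transform the training inputs: x' = log(x).
x_train_log = [math.log(x) for x in x_train]

# Fit y = a * exp(x') on the transformed training data.
a = fit_coefficient([math.exp(xp) for xp in x_train_log], y_train)

# Feeding the untransformed test point into the hypothesis fails badly.
pred_raw = a * math.exp(x_test)

# Transforming the test point with the same log first gives ytest.
pred_transformed = a * math.exp(math.log(x_test))

print(a, pred_raw, pred_transformed)
```

Here `a` comes out at approximately 3, the raw prediction is in the thousands, and the transformed prediction is approximately 24.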
The thing to realize is that when you rescale the training data and then learn, the learning takes into account the scaling, and the hypothesis learned will depend on what scaling is used, as the examples above illustrate. In other words, the hypothesis works for any data point (training or test) only after the scaling is applied.
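The same reasoning applies when the scaling factor is itself computed from the training data, for example so that the average squared value of the training inputs equals 1. The sketch below is an illustration under that assumption, not a prescribed procedure; `fit_scale` and `fit_slope` are hypothetical helper names. The factor is computed once from the training set and then reused unchanged on the test point.

```python
import math

def fit_scale(xs):
    """Factor that makes the mean squared value of xs equal to 1."""
    mean_sq = sum(x * x for x in xs) / len(xs)
    return 1.0 / math.sqrt(mean_sq)

def fit_slope(xs, ys):
    """Least-squares slope w for a line through the origin, y = w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

x_train = [2.0, 4.0, 6.0]
y_train = [6.0, 12.0, 18.0]
x_test, y_test = 8.0, 24.0

# The scaling parameter is learned from the training data only.
scale = fit_scale(x_train)
w = fit_slope([scale * x for x in x_train], y_train)

# Correct: reuse the training-set factor on the test point.
pred_same_scale = w * (scale * x_test)

# Incorrect: recompute the factor from the test point itself.
pred_refit_scale = w * (fit_scale([x_test]) * x_test)

print(pred_same_scale, pred_refit_scale)
```

Reusing the training factor gives approximately 24. Recomputing the factor on the test point does not, even though the rescaled test point then has squared value 1, because the hypothesis was learned relative to the training factor.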
Quote:
Originally Posted by scottedwards2000
Thanks, Dr. MagdonIsmail, for the example. However, I'm still not sure I understand exactly why we must use the same rescaling parameters that we used for the training data. I guess I could see that if we were doing a simple log transform (e.g. if you used base 10 on the training data you certainly wouldn't want to use base 2 on the test data), but in your example you are transforming the data to fit a certain criterion (avg sq value of each coordinate = 1). Would your model then expect a new data set to have the same quality? If we apply the exact rescaling parameters that we used on the training set to the test set, it certainly won't meet that criterion. Thanks for your help!
