#4, 01-21-2013, 10:15 AM
cygnids
Re: When to use normalization?

Curiosity, thanks for asking this question, and Prof Magdon, thanks for your reply. A while ago, similar thoughts crossed my mind too.

- When we talk about normalization, are we talking about getting rid of the "units" of the data? For example, if the input vector has weight and height features, do we scale each feature to effectively get rid of the kg and meter units, say by dividing by the average weight and average height respectively (or some constant)? Is this what you mean by getting them on an equal footing?
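To make this concrete, here is a minimal numpy sketch of the kind of scaling I have in mind; the data values are made up purely for illustration:

[code]
import numpy as np

# Hypothetical data: each row is a person, columns are weight (kg) and height (m).
X = np.array([[70.0, 1.75],
              [55.0, 1.62],
              [90.0, 1.88]])

# Dividing each column by its mean removes the units (kg/kg, m/m),
# leaving dimensionless features of comparable magnitude.
X_unitless = X / X.mean(axis=0)
print(X_unitless)
[/code]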

- You caution about "scaling" in a relative sense. Generally speaking, is this to suggest that a cavalier application of simple normalization can distort the correlative structure implicit in the (original) input data?

- Your use of the word scaling raises another question in my mind. Does it make sense to keep an eye on whether the features of the input data have disparate "ranges"? Say, one feature ranges over [0,1] and another over [1,1000]. Does it make sense to reduce the "range" of the latter to make it comparable to the range of the other feature?
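For instance, something like min-max scaling is what I am picturing; again, the numbers are made up:

[code]
import numpy as np

# Hypothetical features: column 0 already in [0, 1], column 1 in [1, 1000].
X = np.array([[0.2,   5.0],
              [0.7, 950.0],
              [0.5, 400.0]])

# Min-max scaling maps each column onto [0, 1], making the ranges comparable.
X_min = X.min(axis=0)
X_max = X.max(axis=0)
X_scaled = (X - X_min) / (X_max - X_min)
print(X_scaled)
[/code]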

- I tried to think through your comments in the context of supervised vs. unsupervised learning. In a regression setting, we have an LHS and an RHS, and I suppose one could be somewhat more cavalier about normalization, as long as it is applied consistently across the system. However, for unsupervised learning, my immediate thought is that one needs to be a lot more careful about the relative scaling between features. Roughly speaking, is my suspicion right?
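By "consistently" I mean something like the following sketch (the data here are synthetic): the standardization parameters are computed once from the data and then reused for every point that enters the system.

[code]
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: two features on very different scales.
X = rng.normal(loc=[50.0, 0.5], scale=[10.0, 0.1], size=(100, 2))

# z-score standardization: subtract each feature's mean and divide by its
# standard deviation, both computed once on the data.
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_std = (X - mu) / sigma

# Any new point must be transformed with the SAME mu and sigma,
# whether it feeds a regression or a clustering algorithm.
x_new = np.array([45.0, 0.6])
x_new_std = (x_new - mu) / sigma
[/code]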

- Related to these questions is a nagging concern: does one unduly give insignificant features importance by bringing them onto an "equal footing"?
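A toy example of the worry: a near-constant noise feature, once standardized, ends up with the same variance as an informative one, so it contributes equally to, say, Euclidean distances. (The data here are synthetic.)

[code]
import numpy as np

rng = np.random.default_rng(1)
signal = rng.normal(0.0, 5.0, size=100)    # informative feature
noise = rng.normal(0.0, 1e-3, size=100)    # near-constant, uninformative feature

X = np.column_stack([signal, noise])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# After standardization both columns have unit variance, so the noise
# feature now weighs as much as the signal in any distance computation.
print(X_std.std(axis=0))
[/code]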

Thank you for your comments & thoughts.
__________________
The whole is simpler than the sum of its parts. - Gibbs