Linear Regression - Statistics vs Data Mining
Hi Professor Yaser/Everyone-
I have a question about how regression is treated from a statistics versus a data mining angle.
Stats - we check for correlated input variables, normality of the residuals/variables (non-linear transformations can probably take care of this), homoscedasticity, etc.
Data Mining - as you mentioned, we want to keep it general.
Does that mean:
a) we don't care about these assumptions, or do we care but they only come into play later on?
b) we are at a higher risk of getting misleading results?
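To make the stats-side checks concrete, here is a minimal sketch of what I have in mind. It assumes synthetic data and plain Python; the helper names (pearson, fit_line, skewness) and the three crude diagnostics are my own illustrative choices, not anything from the lectures or from a particular library.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def fit_line(x, y):
    """Ordinary least squares for y = w0 + w1 * x (single input)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    w1 = sxy / sxx
    return my - w1 * mx, w1


def skewness(v):
    """Sample skewness; close to 0 for symmetric (e.g. normal) data."""
    m = mean(v)
    s2 = mean([(a - m) ** 2 for a in v])
    return mean([(a - m) ** 3 for a in v]) / s2 ** 1.5


# Synthetic data only (assumption): a fixed seed keeps the run repeatable.
rng = random.Random(0)
x1 = [rng.uniform(0, 10) for _ in range(500)]
x2 = [a + rng.gauss(0, 0.5) for a in x1]             # deliberately correlated with x1
y = [2.0 + 3.0 * a + rng.gauss(0, 1.0) for a in x1]  # constant-variance Gaussian noise

w0, w1 = fit_line(x1, y)
fitted = [w0 + w1 * a for a in x1]
residuals = [b - f for b, f in zip(y, fitted)]

# Check 1: are the candidate inputs strongly correlated with each other?
corr_inputs = pearson(x1, x2)

# Check 2 (crude normality check): residual skewness should be near 0.
resid_skew = skewness(residuals)

# Check 3 (crude homoscedasticity check): the size of the residuals
# should not trend with the fitted values.
spread_trend = pearson(fitted, [abs(r) for r in residuals])

print(f"w0={w0:.3f} w1={w1:.3f}")
print(f"corr(x1, x2)={corr_inputs:.3f}")
print(f"residual skewness={resid_skew:.3f}")
print(f"corr(fitted, |residual|)={spread_trend:.3f}")
```

These are only rough stand-ins for the formal tests a statistician would run, but they show the kind of diagnostics I am asking about.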
It would be nice to have your thoughts on this.
Thanks,
Kartik