Quote:
Originally Posted by jlaurentum
I've been kind of saving this question, but decided to ask at this point.
Why is there no mention of residual analysis in any of the linear regression topics the course has covered? How does residual analysis fit into the data learning picture (if it fits in at all)?
Specifically: starting with this week's topic of regularization, we've seen how weight decay softens the weights, but in doing so, changes them from the normal weights you'd obtain in linear regression. I would imagine that with weight decay, it would no longer hold that the mean of the errors (the linear regression residuals, e_n = y_n - ŷ_n) is equal to zero, so the residuals would not be normally distributed with equal variance and zero mean. In other words, with weight decay at least one of the Gauss-Markov assumptions does not hold?
Does that matter?
In general, are the standard tools of linear regression analysis we were taught in school (looking at the coefficient of determination, hypothesis testing on the significance of the coefficients, and residual analysis to check whether the assumptions backing those tools hold) entirely pointless when you're doing machine learning?

Residual analysis and other details of linear regression are worthy topics. They are regularly covered in statistics, but often not covered in machine learning. If you recall, in Lecture 1 we briefly alluded to the contrast between statistics and machine learning (which do have a substantive overlap) in terms of mathematical assumptions and level of detailed analysis. Linear regression is a case in point for that contrast.
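To see the effect the question is asking about, here is a minimal sketch (my own illustration, not from the course materials) comparing ordinary least squares with weight-decay (ridge-style) regression on synthetic data. With an intercept column, the OLS normal equations force the residuals to average to exactly zero; the regularized solution shrinks the weights, so that property is lost. Note that this sketch penalizes the intercept along with the other weights, as plain weight decay does — statistical ridge regression usually leaves the intercept unpenalized. The data and the value of lambda are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3 + 2x + Gaussian noise (arbitrary illustrative model)
n = 200
x = rng.normal(size=n)
y = 3 + 2 * x + rng.normal(size=n)

# Design matrix with an intercept column
X = np.column_stack([np.ones(n), x])

# Ordinary least squares: solve (X^T X) w = X^T y
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Weight decay: solve (X^T X + lambda I) w = X^T y
# (penalizes ALL weights, including the intercept; lambda is arbitrary)
lam = 10.0
w_wd = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

res_ols = y - X @ w_ols
res_wd = y - X @ w_wd

# OLS residuals average to zero (a consequence of the normal equations);
# the weight-decay residuals generally do not, because shrinkage biases the fit.
print("mean OLS residual:        ", np.mean(res_ols))
print("mean weight-decay residual:", np.mean(res_wd))
```

The nonzero residual mean under weight decay is exactly the Gauss-Markov violation the question describes: the regularized estimator is deliberately biased in exchange for lower variance, so the classical inferential machinery built on unbiased OLS no longer applies as-is.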