Normal equation in linear regression
From Coursera's ML course I learned that the normal equation is computed as follows:
pinv(X'*X)*X'*Y (Octave code), but apparently this is equivalent to just pinv(X)*Y.
Can anyone explain why this is the case?
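
One way to see the equivalence is sketched below. It assumes pinv is the Moore-Penrose pseudoinverse (written X^+ here) and uses the singular value decomposition X = U Σ V^T. The symbols U, Σ and V are notation introduced for this sketch and do not come from the course.

```latex
\begin{aligned}
X &= U \Sigma V^{T}, \qquad X^{+} = V \Sigma^{+} U^{T} \\
X^{T} X &= V \Sigma^{T} \Sigma V^{T} \\
(X^{T} X)^{+} &= V (\Sigma^{T} \Sigma)^{+} V^{T} \\
(X^{T} X)^{+} X^{T}
  &= V (\Sigma^{T} \Sigma)^{+} V^{T} V \Sigma^{T} U^{T}
   = V (\Sigma^{T} \Sigma)^{+} \Sigma^{T} U^{T}
   = V \Sigma^{+} U^{T}
   = X^{+}
\end{aligned}
```

The last step holds entry by entry: each nonzero singular value σ_i contributes σ_i / σ_i^2 = 1 / σ_i, and zero singular values stay zero. Under that assumption, pinv(X'*X)*X'*Y and pinv(X)*Y should agree up to floating-point rounding, even when X'*X is singular.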
