In my textbook, there is a statement and a question on the topic of linear regression/machine learning, which I quote below:
Consider a noisy target, $y = \mathbf{w}^{*T}\mathbf{x} + \epsilon$, for generating the data, where $\epsilon$ is a noise term with zero mean and $\sigma^2$ variance, independently generated for every example $(\mathbf{x}, y)$. The expected error of the best possible linear fit to this target is thus $\sigma^2$.

For the data $\mathcal{D} = \{(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_N, y_N)\}$, denote the noise in $y_n$ as $\epsilon_n$, and let $\boldsymbol{\epsilon} = [\epsilon_1, \epsilon_2, \dots, \epsilon_N]^T$; assume that $X^T X$ is invertible. By following the steps below, ***show that the expected in-sample error of linear regression with respect to $\mathcal{D}$ is given by***

$$\mathbb{E}_{\mathcal{D}}\big[E_{in}(\mathbf{w}_{\text{lin}})\big] = \sigma^2\left(1 - \frac{d+1}{N}\right).$$
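For reference, these are the definitions I am working with below (I believe they match the book's linear regression chapter; $X$ is the $N \times (d+1)$ input matrix and $\mathbf{y}$ the vector of targets):

$$\mathbf{w}_{\text{lin}} = (X^T X)^{-1} X^T \mathbf{y}, \qquad \hat{\mathbf{y}} = X\mathbf{w}_{\text{lin}} = H\mathbf{y}, \qquad H = X(X^T X)^{-1} X^T .$$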
Below is my methodology.

The book says that the in-sample error vector, $\hat{\mathbf{y}} - \mathbf{y}$, can be expressed as $(H - I)\boldsymbol{\epsilon}$, which is simply the matrix $H - I$ (the hat matrix $H$ minus the identity) times the noise vector $\boldsymbol{\epsilon}$.
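(For completeness, this step follows from writing $\mathbf{y} = X\mathbf{w}^* + \boldsymbol{\epsilon}$ and using $HX = X$:)

$$\hat{\mathbf{y}} - \mathbf{y} = (H - I)\mathbf{y} = (H - I)(X\mathbf{w}^* + \boldsymbol{\epsilon}) = (HX - X)\mathbf{w}^* + (H - I)\boldsymbol{\epsilon} = (H - I)\boldsymbol{\epsilon}.$$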
So, I calculated the in-sample error, $E_{in}(\mathbf{w}_{\text{lin}}) = \frac{1}{N}\|\hat{\mathbf{y}} - \mathbf{y}\|^2$, as

$$E_{in}(\mathbf{w}_{\text{lin}}) = \frac{1}{N}\,\big((H - I)\boldsymbol{\epsilon}\big)^T\big((H - I)\boldsymbol{\epsilon}\big) = \frac{1}{N}\,\boldsymbol{\epsilon}^T (H - I)^T (H - I)\,\boldsymbol{\epsilon}.$$

Since it is given by the book that $H^K = H$ for any positive integer $K$ (so in particular $H^2 = H$), and also that $H$ is symmetric,
I got the following simplified expression:

$$E_{in}(\mathbf{w}_{\text{lin}}) = \frac{1}{N}\,\boldsymbol{\epsilon}^T (I - H)\,\boldsymbol{\epsilon} = \frac{1}{N}\Big(\boldsymbol{\epsilon}^T\boldsymbol{\epsilon} - \boldsymbol{\epsilon}^T H \boldsymbol{\epsilon}\Big).$$
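Explicitly, the matrix algebra behind this simplification is

$$(H - I)^T(H - I) = (H - I)(H - I) = H^2 - 2H + I = I - H,$$

using $H^T = H$ for the first equality and $H^2 = H$ for the last.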
Here, I see that

$$\mathbb{E}_{\mathcal{D}}\Big[\frac{1}{N}\,\boldsymbol{\epsilon}^T\boldsymbol{\epsilon}\Big] = \frac{1}{N}\sum_{n=1}^{N}\mathbb{E}[\epsilon_n^2] = \sigma^2.$$

And also, the term $\boldsymbol{\epsilon}^T H \boldsymbol{\epsilon}$ gives the following double sum:

$$\boldsymbol{\epsilon}^T H \boldsymbol{\epsilon} = \sum_{i=1}^{N}\sum_{j=1}^{N} H_{ij}\,\epsilon_i\,\epsilon_j.$$

I understand that, for the diagonal terms ($i = j$), $\mathbb{E}[\epsilon_i\epsilon_j] = \mathbb{E}[\epsilon_i^2] = \sigma^2$.

However, I don't understand why, for $i \neq j$, $\mathbb{E}[\epsilon_i\epsilon_j]$ should be equal to zero in order to satisfy the equation

$$\mathbb{E}_{\mathcal{D}}\big[E_{in}(\mathbf{w}_{\text{lin}})\big] = \sigma^2\left(1 - \frac{d+1}{N}\right).$$
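That is, my computation only produces the desired result if the off-diagonal expectations vanish, so that only the trace of $H$ (which, if I am reading the book correctly, equals $d+1$) survives:

$$\mathbb{E}_{\mathcal{D}}\big[\boldsymbol{\epsilon}^T H \boldsymbol{\epsilon}\big] = \sum_{i} H_{ii}\,\mathbb{E}[\epsilon_i^2] + \sum_{i \neq j} H_{ij}\,\mathbb{E}[\epsilon_i\epsilon_j] = \sigma^2\,\mathrm{trace}(H) + \sum_{i \neq j} H_{ij}\,\mathbb{E}[\epsilon_i\epsilon_j].$$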
***Can anyone explain to me why $\mathbb{E}[\epsilon_i\epsilon_j]$ for $i \neq j$ leads to a zero result?***
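For what it's worth, here is a small numerical sanity check I put together (a rough sketch with numpy; the Gaussian inputs and the particular values of $N$, $d$ and $\sigma$ are arbitrary choices on my part). Averaged over many independently generated data sets, the in-sample error should come out close to $\sigma^2\left(1 - \frac{d+1}{N}\right)$:

```python
import numpy as np

rng = np.random.default_rng(0)

N, d = 50, 5        # number of examples and input dimension (arbitrary choices)
sigma = 0.7         # noise standard deviation (arbitrary choice)
trials = 20000      # number of independently generated data sets

errors = []
for _ in range(trials):
    # Input matrix with a bias column, so X is N x (d+1)
    X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
    w_star = rng.normal(size=d + 1)          # arbitrary target weights
    eps = sigma * rng.normal(size=N)         # zero-mean noise with variance sigma^2
    y = X @ w_star + eps

    # Linear regression solution w_lin = (X^T X)^{-1} X^T y
    w_lin = np.linalg.solve(X.T @ X, X.T @ y)
    y_hat = X @ w_lin

    errors.append(np.mean((y_hat - y) ** 2))  # in-sample error E_in

print("empirical average E_in :", np.mean(errors))
print("sigma^2 (1 - (d+1)/N)  :", sigma**2 * (1 - (d + 1) / N))
```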