My textbook has the following statement and exercise on linear regression in machine learning, quoted below:

Consider a noisy target, $y = (w^{*})^T x + \epsilon$, for generating the data, where $\epsilon$ is a noise term with zero mean and $\sigma^2$ variance, independently generated for every example $(x, y)$. The expected error of the best possible linear fit to this target is thus $\sigma^2$.

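For the data $\mathcal{D} = \{(x_1, y_1), \ldots, (x_N, y_N)\}$, denote the noise in $y_n$ as $\epsilon_n$, and let $\epsilon = [\epsilon_1, \epsilon_2, \ldots, \epsilon_N]^T$; assume that $X^T X$ is invertible. By following the steps below, ***show that the expected in-sample error of linear regression with respect to $\mathcal{D}$ is given by***,

$$\mathbb{E}_{\mathcal{D}}\left[E_{in}(w_{lin})\right] = \sigma^2\left(1 - \frac{d+1}{N}\right).$$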
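As a sanity check on the claimed formula, here is a minimal simulation sketch. It is not part of the book's exercise and rests on assumptions of mine: a single input feature plus a bias term (so $d = 1$ and $d + 1 = 2$), hypothetical target weights, Gaussian noise (the exercise only requires zero mean and variance $\sigma^2$), and inputs held fixed while only the noise is redrawn.

```python
import random


def in_sample_error(xs, ys):
    """Mean squared residual of the least-squares line fitted to (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)) / n


def average_in_sample_error(n=10, sigma=1.0, trials=20000, seed=0):
    """Average E_in over many noise draws, with the inputs held fixed."""
    rng = random.Random(seed)
    w0, w1 = 0.5, -2.0  # hypothetical target weights w*, chosen arbitrarily
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    total = 0.0
    for _ in range(trials):
        # Gaussian noise is an assumption; only mean 0 and variance sigma^2 matter
        ys = [w0 + w1 * x + rng.gauss(0.0, sigma) for x in xs]
        total += in_sample_error(xs, ys)
    return total / trials


n, sigma, d = 10, 1.0, 1
estimate = average_in_sample_error(n=n, sigma=sigma)
predicted = sigma ** 2 * (1 - (d + 1) / n)
print(f"simulated: {estimate:.3f}, predicted: {predicted:.3f}")
```

If the formula holds, the two printed numbers should agree up to Monte Carlo error, which shrinks as `trials` grows.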

Below is my methodology,

The book says that the in-sample error vector, $\hat{y} - y$, can be expressed as $(H - I)\epsilon$, which is simply the hat matrix, $H = X(X^T X)^{-1}X^T$, minus the identity, times the error vector, $\epsilon$.

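So, I calculated the in-sample error, $E_{in}(w_{lin})$, as,

$$E_{in}(w_{lin}) = \frac{1}{N}(\hat{y} - y)^T(\hat{y} - y) = \frac{1}{N}\epsilon^T (H - I)^T (H - I)\epsilon.$$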

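Since it is given by the book that $(I - H)^K = (I - H)$ for any positive integer $K$, and also that $(I - H)$ is symmetric, I got the following simplified expression,

$$E_{in}(w_{lin}) = \frac{1}{N}\epsilon^T (I - H)\epsilon = \frac{1}{N}\epsilon^T\epsilon - \frac{1}{N}\epsilon^T H\epsilon.$$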
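The properties used in this simplification can be checked numerically. The sketch below is my own illustration, not the book's: it assumes one feature plus a bias term, so that $X^T X$ is a $2 \times 2$ matrix with a closed-form inverse, and it uses arbitrary random inputs. It checks that $I - H$ is symmetric, that $(I - H)^2 = I - H$, and that $\operatorname{trace}(H) = d + 1 = 2$.

```python
import random


def hat_matrix(xs):
    """H = X (X^T X)^{-1} X^T for rows X_i = [1, x_i] (bias plus one feature)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x * x for x in xs)
    det = n * s2 - s1 * s1  # determinant of the 2x2 matrix X^T X
    # (X^T X)^{-1} = (1/det) * [[s2, -s1], [-s1, n]], so
    # H_ij = [1, x_i] (X^T X)^{-1} [1, x_j]^T expands to the expression below.
    return [[(s2 - s1 * (xi + xj) + n * xi * xj) / det for xj in xs] for xi in xs]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


rng = random.Random(1)
xs = [rng.uniform(-1.0, 1.0) for _ in range(6)]  # arbitrary inputs
n = len(xs)
H = hat_matrix(xs)
M = [[(1.0 if i == j else 0.0) - H[i][j] for j in range(n)] for i in range(n)]  # I - H
MM = matmul(M, M)

trace_H = sum(H[i][i] for i in range(n))
max_asym = max(abs(M[i][j] - M[j][i]) for i in range(n) for j in range(n))
max_idem = max(abs(MM[i][j] - M[i][j]) for i in range(n) for j in range(n))
print(f"trace(H) = {trace_H:.6f}")
print(f"max |M - M^T| = {max_asym:.2e}, max |M^2 - M| = {max_idem:.2e}")
```

Both printed deviations should be at the level of floating-point rounding, which is what lets $(H - I)^T (H - I)$ collapse to $I - H$.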

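Here, I see that,

$$\mathbb{E}_{\mathcal{D}}\left[\frac{1}{N}\epsilon^T\epsilon\right] = \frac{1}{N}\sum_{i=1}^{N}\mathbb{E}\left[\epsilon_i^2\right] = \sigma^2.$$

Also, the term $-\frac{1}{N}\epsilon^T H\epsilon$ expands into the following sum,

$$-\frac{1}{N}\epsilon^T H\epsilon = -\frac{1}{N}\sum_{i=1}^{N} H_{ii}\epsilon_i^2 - \frac{1}{N}\sum_{i \neq j} H_{ij}\epsilon_i\epsilon_j.$$

I understand that,

$$-\frac{1}{N}\mathbb{E}_{\mathcal{D}}\left[\sum_{i=1}^{N} H_{ii}\epsilon_i^2\right] = -\frac{\sigma^2}{N}\operatorname{trace}(H) = -\sigma^2\,\frac{d+1}{N}.$$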

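However, I don't understand why

$$-\frac{1}{N}\mathbb{E}_{\mathcal{D}}\left[\sum_{i \neq j} H_{ij}\epsilon_i\epsilon_j\right]$$

should be equal to zero in order to satisfy the equation,

$$\mathbb{E}_{\mathcal{D}}\left[E_{in}(w_{lin})\right] = \sigma^2\left(1 - \frac{d+1}{N}\right).$$

***Could anyone explain why $-\frac{1}{N}\mathbb{E}_{\mathcal{D}}\left[\sum_{i \neq j} H_{ij}\epsilon_i\epsilon_j\right]$ is zero?***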
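For what it is worth, a small simulation does suggest that the cross term averages out. The sketch below is only a numerical check, not an explanation, and it rests on my own assumptions: one feature plus a bias term ($d + 1 = 2$), fixed random inputs, and independent Gaussian noise. It splits $\frac{1}{N}\epsilon^T H\epsilon$ into its diagonal part ($i = j$) and its off-diagonal part ($i \neq j$) and averages each over many noise draws.

```python
import random


def hat_matrix(xs):
    """H = X (X^T X)^{-1} X^T for rows X_i = [1, x_i] (bias plus one feature)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x * x for x in xs)
    det = n * s2 - s1 * s1  # determinant of the 2x2 matrix X^T X
    return [[(s2 - s1 * (xi + xj) + n * xi * xj) / det for xj in xs] for xi in xs]


def split_quadratic_form(n=8, sigma=1.0, trials=10000, seed=2):
    """Average the i == j and i != j parts of (1/N) eps^T H eps separately."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # inputs held fixed
    H = hat_matrix(xs)
    diag_total = 0.0
    off_total = 0.0
    for _ in range(trials):
        # independent zero-mean noise; the Gaussian shape is an assumption
        eps = [rng.gauss(0.0, sigma) for _ in range(n)]
        diag = sum(H[i][i] * eps[i] ** 2 for i in range(n))
        full = sum(H[i][j] * eps[i] * eps[j] for i in range(n) for j in range(n))
        diag_total += diag / n
        off_total += (full - diag) / n
    return diag_total / trials, off_total / trials


diag_mean, off_mean = split_quadratic_form()
print(f"diagonal part: {diag_mean:.4f} (sigma^2 * (d + 1) / N = {2 / 8:.4f})")
print(f"off-diagonal part: {off_mean:.4f}")
```

In this setup the diagonal average should sit near $\sigma^2 (d+1)/N$ and the off-diagonal average near zero, up to Monte Carlo error.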