LFD Book Forum

DeSLex 10-04-2017 11:24 AM

Problem 4.4: Interpretation
Can someone confirm if this interpretation of "experiment" in part (d) is correct? I think we have nested loops:

for each choice of $Q_f$, $N$, and $\sigma$
    Define the normalizing constant $c^2 = E_{a,x}[f^2]$.
    for each choice of coefficients $\{a_q : q = 1, \ldots, Q\}$ from standard normal distributions
        $f(x) = \frac{1}{c} \sum_{q=1}^{Q} a_q L_q(x)$
        for $n = 1$ to $N$
            $y_n = f(x_n) + \sigma \epsilon_n$
        Find $E_{\text{out}}(g_2)$ and $E_{\text{out}}(g_{10})$

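A minimal Python sketch of one pass through the inner loop may make the structure concrete. Several things here are assumptions not stated in this post: the inputs are drawn uniformly from [-1, 1], the noise is standard normal, the sum runs over q = 0, ..., Q_f (adjust the range if 1, ..., Q is intended), and the out-of-sample error is estimated by Monte Carlo against the noiseless target rather than computed analytically. The names `run_experiment` and `n_test` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def run_experiment(Q_f, N, sigma, rng, n_test=10_000):
    """One experiment: draw a target, draw a data set, fit g_2 and g_10."""
    # Normalizing constant. Assuming x ~ Uniform[-1, 1], E_x[L_q(x)^2] = 1/(2q+1),
    # so with independent standard normal a_q, E_{a,x}[f^2] = sum_q 1/(2q+1).
    q = np.arange(Q_f + 1)
    c = np.sqrt(np.sum(1.0 / (2 * q + 1)))

    # Fixing the coefficients fixes the target f for this experiment.
    a = rng.standard_normal(Q_f + 1) / c

    def f(x):
        return leg.legval(x, a)

    # Data set: the only randomness in y given x is the noise term.
    x = rng.uniform(-1.0, 1.0, N)
    y = f(x) + sigma * rng.standard_normal(N)

    # Monte Carlo estimate of E_out, measured against the noiseless target
    # (an assumption; adding sigma**2 gives the error against noisy y).
    x_test = rng.uniform(-1.0, 1.0, n_test)
    errors = {}
    for degree in (2, 10):
        g = leg.legfit(x, y, degree)  # least-squares fit in the Legendre basis
        errors[degree] = float(np.mean((leg.legval(x_test, g) - f(x_test)) ** 2))
    return errors


rng = np.random.default_rng(0)
print(run_experiment(Q_f=5, N=50, sigma=0.1, rng=rng))
```

Calling `run_experiment` repeatedly with the same `Q_f`, `N`, and `sigma` corresponds to the second nested loop; each call draws a new set of coefficients and a new data set.
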
The reason for all this detail is that I was unclear about what the conditional distribution $p(y \mid x)$ might be. I think now that in each iteration of the second nested loop, we fix a choice of $\{a_q\}$; given these coefficients, $f$ becomes a deterministic function of $x$ and the only randomness in $y_n$ is due to $\epsilon_n$.

As a result, we have a different joint distribution $P(x, y)$ for each choice of $\{a_q\}$. We also have a different target function for each $\{a_q\}$.
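Written out, this reading amounts to the following, where $f_a$ is shorthand (introduced here, not in the problem statement) for the target obtained from a fixed coefficient vector $a$, and the noise $\epsilon$ is assumed to be standard normal:

```latex
\[
  p(y \mid x, a) = \mathcal{N}\!\left(y;\; f_a(x),\; \sigma^2\right),
  \qquad
  P(x, y \mid a) = p(x)\, p(y \mid x, a).
\]
```

Only the conditional part changes from one experiment to the next; the input distribution $p(x)$ is the same in every experiment.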

The repeated experiments in (d), with fixed $Q_f$, $N$, and $\sigma$, lead to one $E_{\text{out}}(g)$ for each set of coefficients $\{a_q\}$. $E_{\text{out}}(g)$ is a function of the normal random vector $(a_1, \ldots, a_Q)$. The average of the out-of-sample errors is the empirical mean of the distribution of $E_{\text{out}}$.
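As a sketch of what that average estimates, let $M$ (a symbol introduced here) be the number of repeated experiments and $g^{(m)}$ the hypothesis fitted in experiment $m$. Since $g^{(m)}$ is fitted to a particular data set $\mathcal{D}$, each error value depends on the sampled $x_n$ and $\epsilon_n$ as well as on the coefficients, so the empirical mean approximates an expectation over both:

```latex
\[
  \bar{E}_{\text{out}}
  = \frac{1}{M} \sum_{m=1}^{M} E_{\text{out}}\!\left(g^{(m)}\right)
  \;\approx\;
  \mathbb{E}_{a,\,\mathcal{D}}\!\left[ E_{\text{out}}(g) \right].
\]
```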

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.