LFD Book Forum


matthijs 04-17-2013 08:31 AM

The role of noise
 
I'm having trouble understanding the role of noise. The generalization bound depends on N, the VC dimension of H, and delta.

I notice that in later lecture slides, noise forms an explicit term in the bias-variance decomposition, i.e. more noise increases the expected E_out (apologies for referring to slides that haven't been discussed yet).

Why doesn't it feature in the generalization bound? Is it because it is captured in the E_in term, i.e. more noise will increase our training error? In earlier lectures, N was written in terms of the growth function, to see how much data we need; and a rule of thumb was given that says N >= 10*VCdim. I'd like to understand quantitatively how our need for data grows with noise, but I don't see how to do this using the generalization bound or bias-variance.
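
For reference, here is a minimal sketch of the bound the question refers to, assuming the usual form E_out <= E_in + sqrt((8/N) ln(4 m_H(2N) / delta)) with the growth function replaced by its polynomial bound (2N)^VCdim + 1. The function name and the example values of N, VCdim and delta are illustrative assumptions, not from the thread. Note that the noise level is not an argument: it can only enter through E_in.

```python
import math

def vc_bound_epsilon(n, d_vc, delta):
    """Width of the VC bound: sqrt((8/N) * ln(4 * m_H(2N) / delta)).

    Assumption: m_H(2N) is replaced by the polynomial bound (2N)^d_vc + 1.
    """
    growth = (2.0 * n) ** d_vc + 1.0
    return math.sqrt(8.0 / n * math.log(4.0 * growth / delta))

# The width shrinks as N grows; it depends only on N, d_vc and delta.
for n in (100, 1_000, 10_000, 100_000):
    print(n, round(vc_bound_epsilon(n, d_vc=3, delta=0.05), 3))
```

The bound is loose for small N (the width can exceed 1), which is why the thread also mentions the practical rule of thumb N >= 10*VCdim.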

yaser 04-17-2013 11:21 AM

Re: The role of noise
 
Quote:

Originally Posted by matthijs (Post 10455)
I'm having trouble understanding the role of noise. The generalization bound depends on N, the VC dimension of H, and delta.

I notice that in later lecture slides, noise forms an explicit term in the bias-variance decomposition, i.e. more noise increases the expected E_out (apologies for referring to slides that haven't been discussed yet).

Why doesn't it feature in the generalization bound? Is it because it is captured in the E_in term, i.e. more noise will increase our training error?

Your understanding is correct. Noise increases both E_in and E_out. Generalization error is the difference between the two. The more critical impact of noise, that of overfitting, will be discussed in Lecture 11.
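
The point that noise raises both errors while leaving their difference small can be checked with a toy simulation. This is only a sketch under stated assumptions, none of which come from the thread: a one-dimensional target sign(x), labels flipped independently with probability flip_prob, a hypothesis set of thresholds h(x) = sign(x - t), and a learner that picks the threshold with the lowest E_in. All function names are hypothetical.

```python
import random

def noisy_sample(n, flip_prob, rng):
    """Draw x uniformly from [-1, 1]; label is sign(x), flipped with probability flip_prob."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x > 0 else -1
        if rng.random() < flip_prob:
            y = -y
        data.append((x, y))
    return data

def error(threshold, data):
    """Fraction of points misclassified by h(x) = sign(x - threshold)."""
    wrong = sum(1 for x, y in data if (1 if x > threshold else -1) != y)
    return wrong / len(data)

def fit_threshold(data):
    """Pick the threshold with the lowest in-sample error (candidates: -1 and each sample x)."""
    candidates = [-1.0] + [x for x, _ in data]
    return min(candidates, key=lambda t: error(t, data))

def run(flip_prob, n_train=200, n_test=20000, seed=0):
    """Return (E_in, estimated E_out) for one training set at the given noise level."""
    rng = random.Random(seed)
    train = noisy_sample(n_train, flip_prob, rng)
    test = noisy_sample(n_test, flip_prob, rng)  # large noisy test set estimates E_out
    t = fit_threshold(train)
    return error(t, train), error(t, test)

for p in (0.0, 0.1, 0.2):
    e_in, e_out = run(p)
    print(f"noise={p:.1f}  E_in={e_in:.3f}  E_out={e_out:.3f}  gap={e_out - e_in:.3f}")
```

In this setup both E_in and E_out should rise roughly with the flip probability, while the gap between them stays comparatively small. The overfitting effect mentioned in the reply is a separate matter and is not modeled here.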

