LFD Book Forum


samirbajaj 08-01-2012 05:41 PM

Why would variance be non-zero?
 
In question 6 on the homework, we are asked to compute the variance across all data sets.

If we are sampling uniformly from the interval [-1, 1] for the calculation of g_bar, as well as for each data set (g_d), why would the variance be anything but a very small quantity? In the general case, when the data sets are not drawn from a uniform distribution, a non-zero variance makes sense, but if there is sufficient overlap in the data sets, it makes intuitive sense that the variance should be close to zero.

I ask this because my simulation results support the above (potentially flawed) theory.

Please answer the question in general terms -- I don't care about the homework answer -- I was merely using that as an example.

Thanks for any input.

-Samir

yaser 08-01-2012 07:32 PM

Re: Why would variance be non-zero?
 
Quote:

Originally Posted by samirbajaj (Post 3781)
If we are sampling uniformly from the interval [-1, 1] for the calculation of g_bar, as well as for each data set (g_d), why would the variance be anything but a very small quantity? In the general case, when the data sets are not drawn from a uniform distribution, a non-zero variance makes sense, but if there is sufficient overlap in the data sets, it makes intuitive sense that the variance should be close to zero.

$\bar g$ is the average of $g^{({\cal D})}$ over different data sets ${\cal D}$. You will get different $g^{({\cal D})}$'s when you pick different ${\cal D}$'s, since the final hypothesis depends on the data set used for training. Therefore, there will be a variance that measures how different these $g^{({\cal D})}$'s are around their expected value $\bar g$ (which does not depend on ${\cal D}$, as ${\cal D}$ gets integrated out in the calculation of $\bar g$).
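For reference, the quantities in this argument can be written out as follows. This is a sketch assuming the usual bias-variance notation, with $x$ a test point and expectations taken over data sets ${\cal D}$ and over the input distribution; the symbol $\mathrm{var}(x)$ for the pointwise variance is introduced here only for illustration.

```latex
\bar g(x) = \mathbb{E}_{\mathcal{D}}\!\left[ g^{(\mathcal{D})}(x) \right],
\qquad
\mathrm{var}(x) = \mathbb{E}_{\mathcal{D}}\!\left[ \left( g^{(\mathcal{D})}(x) - \bar g(x) \right)^{2} \right],
\qquad
\mathrm{var} = \mathbb{E}_{x}\!\left[ \mathrm{var}(x) \right]
```

The variance is zero only if every data set yields the same hypothesis $g^{({\cal D})}$, which is not the case when the hypothesis is fit to a finite random sample.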

This argument holds for any probability distribution, uniform or not, that is used to generate the different ${\cal D}$'s.
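The point can be checked numerically with a minimal sketch. The setup below is an assumption chosen for illustration, not the homework problem: the target is f(x) = sin(pi x), inputs are uniform on [-1, 1], and the learning model is the constant hypothesis h(x) = b fit by least squares, so each data set gives b equal to the mean of its y values. The function name `simulate_variance` and its parameters are hypothetical.

```python
import numpy as np


def simulate_variance(num_datasets=20000, points_per_dataset=2, seed=0):
    """Estimate g_bar and the variance for a constant model h(x) = b.

    Assumed setup (illustrative only): target f(x) = sin(pi * x),
    inputs drawn uniformly from [-1, 1].
    """
    rng = np.random.default_rng(seed)

    # Each row is one data set D, sampled uniformly from [-1, 1].
    x_train = rng.uniform(-1.0, 1.0, size=(num_datasets, points_per_dataset))
    y_train = np.sin(np.pi * x_train)

    # Least-squares constant fit: g_D(x) = b_D = mean of the y values in D.
    b = y_train.mean(axis=1)

    # g_bar(x) is the average of g_D(x) over data sets; here it is a constant.
    b_bar = b.mean()

    # g_D and g_bar do not depend on x for this model, so averaging the
    # squared deviation over data sets already gives the overall variance.
    variance = np.mean((b - b_bar) ** 2)
    return b_bar, variance


b_bar, variance = simulate_variance()
print(f"g_bar is approximately the constant {b_bar:.3f}")
print(f"variance is approximately {variance:.3f}")
```

Even though every data set is drawn from the same uniform distribution, the fitted constants differ from one data set to the next, so the estimated variance is clearly non-zero. For this particular model the variance shrinks as `points_per_dataset` grows, but it stays positive for any finite sample size.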

samirbajaj 08-02-2012 09:14 AM

Re: Why would variance be non-zero?
 
Thank you ... now that you explain it that way, it makes perfect sense. (Not sure what I was thinking...)

-Samir


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.