LFD Book Forum  

  #1  
08-01-2012, 05:41 PM
samirbajaj
Member
 
Join Date: Jul 2012
Location: Silicon Valley
Posts: 48
Why would variance be non-zero?

In question 6 on the homework, we are asked to compute the variance across all data sets.

If we are sampling uniformly from the interval [-1, 1] for the calculation of g_bar, as well as for each data set (g_d), why would the variance be anything but a very small quantity? In the general case, when the data sets are not drawn from a uniform distribution, a non-zero variance makes sense, but if there is sufficient overlap in the data sets, it makes intuitive sense that the variance should be close to zero.

I ask this because my simulation results support the above (potentially flawed) theory.

Please answer the question in general terms -- I don't care about the homework answer -- I was merely using that as an example.

Thanks for any input.

-Samir
  #2  
08-01-2012, 07:32 PM
yaser
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,476
Re: Why would variance be non-zero?

Quote:
Originally Posted by samirbajaj
If we are sampling uniformly from the interval [-1, 1] for the calculation of g_bar, as well as for each data set (g_d), why would the variance be anything but a very small quantity? In the general case, when the data sets are not drawn from a uniform distribution, a non-zero variance makes sense, but if there is sufficient overlap in the data sets, it makes intuitive sense that the variance should be close to zero.

$\bar g$ is the average of $g^{({\cal D})}$ over different data sets ${\cal D}$. You will get different $g^{({\cal D})}$'s when you pick different ${\cal D}$'s, since the final hypothesis depends on the data set used for training. Therefore, there will be a variance that measures how different these $g^{({\cal D})}$'s are around their expected value $\bar g$ (which does not depend on ${\cal D}$, as ${\cal D}$ gets integrated out in the calculation of $\bar g$).
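
In symbols, the two quantities described here can be written as below. This is a sketch that assumes squared error is the error measure; $\mathbb{E}_{\cal D}$ (expectation over data sets) and $\mathbb{E}_x$ (expectation over the input distribution) are notation introduced for this illustration.

```latex
\bar g(x) = \mathbb{E}_{\cal D}\left[ g^{({\cal D})}(x) \right],
\qquad
\text{var} = \mathbb{E}_x\left[ \mathbb{E}_{\cal D}\left[ \left( g^{({\cal D})}(x) - \bar g(x) \right)^2 \right] \right]
```

Written this way, the variance can only vanish if (almost) every data set leads to the same final hypothesis, whatever distribution the points are drawn from.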

This argument holds for any probability distribution, uniform or not, that is used to generate the different {\cal D}'s.
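
A small simulation may make the point concrete. The sketch below is not the homework setup: it assumes a target of sin(pi x), data sets of two points drawn uniformly from [-1, 1], and a constant hypothesis h(x) = b fitted by least squares. The function names and parameters are hypothetical, chosen only for this illustration.

```python
import math
import random


def target(x):
    """Assumed target function for this illustration: f(x) = sin(pi * x)."""
    return math.sin(math.pi * x)


def fit_constant(dataset):
    """Least-squares fit of the constant hypothesis h(x) = b,
    which is simply the mean of the y values in the data set."""
    return sum(y for _, y in dataset) / len(dataset)


def simulate(num_datasets=20000, points_per_dataset=2, seed=0):
    """Draw many data sets, fit each one, and estimate g_bar and the variance."""
    rng = random.Random(seed)
    fits = []
    for _ in range(num_datasets):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(points_per_dataset)]
        fits.append(fit_constant([(x, target(x)) for x in xs]))

    # g_bar is the average hypothesis; for constant hypotheses it is one number.
    g_bar = sum(fits) / len(fits)

    # Each fit is constant in x, so the squared deviation from g_bar does not
    # depend on x and no separate average over x is needed.
    variance = sum((b - g_bar) ** 2 for b in fits) / len(fits)
    return g_bar, variance


g_bar, variance = simulate()
print(f"g_bar = {g_bar:.3f}, variance = {variance:.3f}")
```

Under these assumptions each fitted constant is the mean of two independent values of sin(pi x), so the variance works out analytically to 1/4 rather than zero, even though every data set comes from the same uniform distribution. The estimate shrinks only as points_per_dataset grows, not because the sampling distribution is uniform.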
__________________
Where everyone thinks alike, no one thinks very much
  #3  
08-02-2012, 09:14 AM
samirbajaj
Member
 
Join Date: Jul 2012
Location: Silicon Valley
Posts: 48
Re: Why would variance be non-zero?

Thank you ... now that you explain it that way, it makes perfect sense. (Not sure what I was thinking...)

-Samir
All times are GMT -7.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.