LFD Book Forum  

01-12-2013, 02:39 PM
ArikB
Isn't the bin (your data set) the sample?

This has me a bit confused: isn't the bin your data set in the analogy? If so, your data set is the sample of the population. For instance, in the bank example, your data set would be the sample and the population would be all of the possible people applying for credit.

If that is the case, then how does Hoeffding tell us anything about what is "really" out of sample?

Or am I confused and the bin is really the population? In that case, mu would be the population fraction and the samples you pick from the bin would represent the data set.

Perhaps I should rephrase it to be a bit more systematic:

In the best-case scenario my bin is completely green, i.e., my hypothesis agrees entirely with my data set, so mu is 1. Hoeffding gives me a probabilistic bound on how well nu approximates this mu (which is 1). That's nice, but now I only know that I have a hypothesis that agrees entirely with my data set. How does this generalize beyond my data set? Or am I going too far ahead, and the lecture is not about this? If so, then why use nu at all? If this is a case of supervised learning and I know the output, then I can just see mu immediately, because I can see whether or not my input agrees with the output within my data set.

Or does the bin represent my training set, with my data set already (supposedly) subdivided into a training set and data that I use for testing?
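
To make the two readings concrete, here is a minimal simulation sketch of the second one, where the bin is the population (with an unknown green fraction mu) and the N marbles drawn from it play the role of the data set (with observed fraction nu). It compares how often nu lands more than eps away from mu against the Hoeffding bound 2·exp(−2·eps²·N). The values of mu, N, eps, the number of trials, and the function names are all assumptions chosen for illustration; they do not come from the lecture or the homework.

```python
import math
import random


def hoeffding_bound(eps, n):
    """Upper bound on P[|nu - mu| > eps] for n independent draws."""
    return 2.0 * math.exp(-2.0 * eps ** 2 * n)


def estimate_deviation_prob(mu, n, eps, trials=2000, seed=0):
    """Estimate P[|nu - mu| > eps] by repeatedly sampling n marbles.

    mu is the (normally unknown) fraction of green marbles in the bin,
    treated here as the population; each trial draws a fresh data set.
    """
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        # Each marble is green with probability mu, independently.
        green = sum(rng.random() < mu for _ in range(n))
        nu = green / n  # in-sample fraction for this data set
        if abs(nu - mu) > eps:
            bad += 1
    return bad / trials


# Illustrative values only (assumptions, not from the course).
mu, n, eps = 0.6, 100, 0.1
empirical = estimate_deviation_prob(mu, n, eps)
bound = hoeffding_bound(eps, n)
print(f"empirical P[|nu - mu| > eps] = {empirical:.4f}")
print(f"Hoeffding bound              = {bound:.4f}")
```

In this sketch the simulation is allowed to know mu only so that the deviation can be counted; in the learning setting only nu is observed, and the bound is what connects it to the unseen mu.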
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.