LFD Book Forum  

Go Back   LFD Book Forum > Course Discussions > Online LFD course > Homework 2

Reply
 
Thread Tools Display Modes
  #1  
Old 04-12-2013, 04:46 AM
Ziad Hatahet Ziad Hatahet is offline
Member
 
Join Date: Apr 2013
Location: San Francisco, CA
Posts: 23
Default Question 2

Hi all,

Is it correct to assume that each one of the thousand coins is its own "bin" in the multiple bin scenario of the Hoeffding inequality as explained in lecture 2, and therefore, is a separate hypothesis from the other bins?

Prof. Yaser mentioned in another post (see here) that:

Quote:
If you fix any hypothesis then run the experiment, the probability will always be bounded by that term.
I presume the dual does not apply (i.e. not fixing the hypothesis will result in it not being bound by the Hoeffding inequality), regardless of whether or not it happens to be bound by chance. Is that a valid assumption?
Reply With Quote
  #2  
Old 04-12-2013, 12:48 PM
jlaurentum jlaurentum is offline
Member
 
Join Date: Apr 2013
Location: Venezuela
Posts: 41
Default Re: Question 2

I am hung up on this question too.

As I interpret it, a "bin" in this question corresponds to a particular way for defining a random experiment. In any case, we flip 1000 coins 10 times each and:

1) the random experiment for coin 1 consists of taking the 10 flips for that coin and averaging the number of heads, reporting that average as the sample frequency.
2) For the random coin, the random experiment consists in taking a random coin (1-1000) and averaging out the number of heads in the 10 flips for that coin.
3) for the "minimum number of heads coin", we flip 1000 coins 10 times each as before, but take the minimum frequency of heads obtained among the 1000 coins as the reported sample frequency.

Each of these 3 situations constitutes a random experiment because it can be replicated any number of times. Each of the three scenarios has an expected frequency. The first two have an expected frequency (mu) of 0.5, the third has a different expected frequency but it can be calculated analytically (the probability distribution for the possible sample frequencies 0, 0.1, ..., 0.9, 1 can be determined for the "minimum heads" coin). The sample frequencies converge to the respective expected frequencies for the three scenarios, according to the law of large numbers. (No surprise)...

So back to this:

Quote:
If you fix any hypothesis then run the experiment, the probability will always be bounded by that term.
Does fixing the hypothesis mean fixing the coins, or does it mean being able to fix a random experiment (procedure) by which we obtain the sample frequencies?
Reply With Quote
  #3  
Old 04-12-2013, 01:01 PM
yaser's Avatar
yaser yaser is offline
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Default Re: Question 2

Quote:
Originally Posted by jlaurentum View Post
Does fixing the hypothesis mean fixing the coins, or does it mean being able to fix a random experiment (procedure) by which we obtain the sample frequencies?
Let me focus on the correspondence between the bin world and the hypothesis world since this seems to be a source of confusion.

The bin corresponds to the input space {\cal X}, with each marble corresponding to an input point {\bf x}\in{\cal X}.

The colors in a bin correspond to how a hypothesis agrees with the target; the color of a marble being red if h({\bf x})\ne f({\bf x}) for that 'marble' {\bf x}.

Fixed hypothesis means the colors inside the bin are fixed prior to drawing the sample. What makes a hypothesis 'not fixed' is that we have a bunch of bins and we select a bin according to the sample it produced (e.g., an all green sample). The reason we call the hypothesis not fixed in this case is because which bin we pick (hence which colors are inside, hence which hypothesis we are talking about) depends on the samples that have already been produced.
__________________
Where everyone thinks alike, no one thinks very much
Reply With Quote
  #4  
Old 04-13-2013, 01:12 AM
Ziad Hatahet Ziad Hatahet is offline
Member
 
Join Date: Apr 2013
Location: San Francisco, CA
Posts: 23
Default Re: Question 2

Quote:
Originally Posted by yaser View Post
What makes a hypothesis 'not fixed' is that we have a bunch of bins and we select a bin according to the sample it produced (e.g., an all green sample). The reason we call the hypothesis not fixed in this case is because which bin we pick (hence which colors are inside, hence which hypothesis we are talking about) depends on the samples that have already been produced.
That is what we do when we sift through the bins (hypotheses) looking for an h whose sample we got minimizes E_{in}. But given the quote in my first post, doesn't it mean that the bin chosen by this procedure is not bound by 2e^{-2\epsilon^2N}?
Reply With Quote
  #5  
Old 04-13-2013, 01:23 AM
yaser's Avatar
yaser yaser is offline
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Default Re: Question 2

Quote:
Originally Posted by Ziad Hatahet View Post
But given the quote in my first post, doesn't it mean that the bin chosen by this procedure is not bound by 2e^{-2\epsilon^2N}?
This is discussed here:

http://book.caltech.edu/bookforum/sh...0360#post10360

The main point is that once you consider the sample, the probability becomes conditional on how this sample came out, and that could violate the 2e^{-2\epsilon^2N} bound, whereas the probability before a sample was drawn always obeys the bound.
__________________
Where everyone thinks alike, no one thinks very much
Reply With Quote
  #6  
Old 04-14-2013, 01:58 PM
sptripathi sptripathi is offline
Junior Member
 
Join Date: Apr 2013
Posts: 8
Default Re: Question 2

Quote:
Originally Posted by yaser View Post
This is discussed here:

http://book.caltech.edu/bookforum/sh...0360#post10360

The main point is that once you consider the sample, the probability becomes conditional on how this sample came out, and that could violate the 2e^{-2\epsilon^2N} bound, whereas the probability before a sample was drawn always obeys the bound.
Is the Cmin coin of Q2 (HW2) not analogous to pocket PLA's final-hypothesis g ? In pocket PLA, we always choose the hypothesis which gave min-disagreement across experiments(~ bins). The probability of bad-event in pocket case is still bound by hoeffdings inequality, right ?
The possible difference I can see is that in Cmin case, sample kept changing, while in pocket case sample does not change, only color of marbles inside-bin and outside bin changes, as the hypothesis changes. Is that the key point?
Reply With Quote
  #7  
Old 04-14-2013, 02:57 PM
yaser's Avatar
yaser yaser is offline
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Default Re: Question 2

Quote:
Originally Posted by sptripathi View Post
In pocket PLA, we always choose the hypothesis which gave min-disagreement across experiments(~ bins). The probability of bad-event in pocket case is still bound by hoeffdings inequality, right ?
It is bounded by the multiple-bin version of Hoeffding inequality. If we apply that version of the inequality directly to this case, the result will be meaningless since there is an infinite number of bins (hypotheses) in this case. This is the reason we will develop the theory further in the coming two weeks; to deal with this dilemma.

Quote:
The possible difference I can see is that in Cmin case, sample kept changing, while in pocket case sample does not change, only color of marbles inside-bin and outside bin changes, as the hypothesis changes.
Your observation is correct, and this is why we used the union bound which is valid for all cases regardless of the dependence or independence of the samples.
__________________
Where everyone thinks alike, no one thinks very much
Reply With Quote
Reply

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -7. The time now is 08:45 AM.


Powered by vBulletin® Version 3.8.3
Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.