LFD Book Forum

LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 1 (http://book.caltech.edu/bookforum/forumdisplay.php?f=130)
-   -   M=|H|? (Lecture 2 slide 16-17) (http://book.caltech.edu/bookforum/showthread.php?t=3850)

Haowen 01-14-2013 06:33 PM

M=|H|? (Lecture 2 slide 16-17)
 
I have a question regarding the value of M in the multiple-bins Hoeffding bound slides.

M is supposed to be the number of different alternate hypotheses considered by the learning algorithm.

At the same time, H is the space of possible hypotheses that can be considered by the algorithm (e.g., all linear functions, etc).

I keep going back and forth in my mind about whether M=|H|.

Specifically, suppose that for a SPECIFIC training set X, after looking at the data points in X, the algorithm only explored some subset of H, say G with |G| < |H|.

Would it then be correct to set M = |G| and say that for the specific training set X, the probability of the hypothesis being bad is at most 2|G|*the hoeffding bound ? Or would this be incorrect since the theorem only deals with the behavior of the system over all possible X with the distribution P.

Thanks!

yaser 01-14-2013 06:55 PM

Re: M=|H|? (Lecture 2 slide 16-17)
 
Quote:

Originally Posted by Haowen (Post 8659)
I have a question regarding the value of M in the multiple-bins Hoeffding bound slides.

M is supposed to be the number of different alternate hypotheses considered by the learning algorithm.

At the same time, H is the space of possible hypotheses that can be considered by the algorithm (e.g., all linear functions, etc).

I keep going back and forth in my mind about whether M=|H|.

Specifically, suppose that for a SPECIFIC training set X, after looking at the data points in X, the algorithm only explored some subset of H, say G with |G| < |H|.

Would it then be correct to set M = |G| and say that for the specific training set X, the probability of the hypothesis being bad is at most 2|G|*the hoeffding bound ? Or would this be incorrect since the theorem only deals with the behavior of the system over all possible X with the distribution P.

Thanks!

You raise interesting points. First, indeed M=|{\cal H}|. Second, if the algorithm does not fully explore the hypothesis set {\cal H}, then M is still a working upper bound as far as generalization from in-sample to out-of-sample is concerned. Third, the analysis fixes {\cal H} before the data set {\cal D} is presented, and is done independently of the probability distribution P, i.e., the same bound applies regardless of which P is the true distribution.

In some cases, we can find a better (read: smaller) upper bound, such as in regularization which will be studied later in the course.

Haowen 01-15-2013 08:59 AM

Re: M=|H|? (Lecture 2 slide 16-17)
 
Thank you Professor for your reply. It makes sense now. I think trying to consider the space of hypothesis "actually explored" is not that useful, as you said, the space of possible hypothesis {\cal H} is independent of P and {\cal D} and the resultant bound is a much more general and useful characterization of the learning model.


All times are GMT -7. The time now is 01:57 PM.

Powered by vBulletin® Version 3.8.3
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.