LFD Book Forum


markweitzman 05-01-2012 06:03 PM

HW 4 question3
I am confused about question 3: are they not all above 1, and therefore essentially equivalent in the range 2-8? Or did I do the calculation incorrectly?

Mark Weitzman

Hillbilly 05-01-2012 06:18 PM

Re: HW 4 question3
I agree with the implication of your question -- isn't an epsilon greater than 1 essentially meaningless, making them all equivalent in that sense? Nevertheless, I got distinctly different curves for the four choices, strictly ordered, so I answered on that basis, and apparently that was the right perspective. Devroye gave me weird results on both extremes of N; N=2 particularly bizarre, but I may have a bug in it I haven't found yet.

markweitzman 05-01-2012 06:32 PM

Re: HW 4 question3
Well, I thought as you did and calculated similar results, but then I realized that if all the bounds are greater than 1, "all equivalent" seems like the best answer.

Mark Weitzman

silvrous 05-01-2012 08:45 PM

Re: HW 4 question3
I also got values larger than 1 for all of them, and therefore considered them to be equally meaningless for small N...

rohanag 05-01-2012 11:52 PM

Re: HW 4 question3
How are the recursive bounds (parts c and d) to be plotted?

IamMrBB 05-02-2012 01:07 AM

Re: HW 4 question3
I have the same question/remark as silvrous and markweitzman. Since epsilon bounds the absolute difference of two probabilities (or probability measures, or frequencies; at least that is what I understood from the class and a quick Google lookup), a statement of epsilon < 3 (for example) is equivalent to the statement epsilon <= 1. Since all the bounds gave numbers in the ballpark of 3, I reasoned that they are all equivalent to the bound epsilon <= 1, i.e. with this small number of examples we cannot say anything about Eout, at least not with a delta of 0.05 as given in the question.

I have to admit that I thought long and hard about what the intention of the question was: just to test whether we can calculate these scary-looking formulas, or to test our understanding of learning (in particular, understanding that you need a minimum amount of data before you can make strong (delta = 5%) statements about the out-of-sample error). Since the calculation aspect was already tested in q2, I hoped and guessed that q3 was aiming at the other aspect.

In the end I therefore went for answer e ("they are all equivalent"), which I thought was the most correct, although there was indeed a chance the question was intended differently.

Professor, or any other expert on the subject, am I correct in my assumption that epsilon < 3 is equivalent to epsilon <= 1?

lucifirm 05-02-2012 01:36 AM

Re: HW 4 question3
What I did was create a vector \epsilon of the same size as N, but varying from 0 to 1. I don't know if this is the best approach, but it helped me plot the curves. The problem is that I thought I had to choose only one correct answer, so I could not choose c or d, because to me they were both correct for large N.

I don't know if this will work, but here's a link to the figure.


rohanag 05-02-2012 04:23 AM

Re: HW 4 question3
Thanks lucifirm. Do you mean that you tried different values of \epsilon and then compared the left-hand side and right-hand side of the equations?
You're right, both c and d are the answers for large values of N. But for small values of N, c is the correct answer according to the key.
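
For anyone who wants to try the "compare both sides" idea numerically, here is a minimal sketch for bound (c) only. It rests on assumptions that are not stated in this thread: the implicit form \epsilon <= \sqrt{(2\epsilon + \ln(6 m_\mathcal{H}(2N)/\delta))/N}, the growth function approximated by m_\mathcal{H}(N) = N^{d_vc} with d_vc = 50, and delta = 0.05. The function names and the grid limit eps_max are made up for illustration. Note that the grid has to extend past 1, because for small N the crossing point lies above 1.

```python
import math

def log_growth(n, d_vc=50):
    # Assumption: m_H(n) is approximated by n**d_vc. We return its natural
    # logarithm so that huge values never have to be formed explicitly.
    return d_vc * math.log(n)

def rhs_c(eps, n, delta=0.05, d_vc=50):
    # Right-hand side of the implicit bound (c), in the form assumed above:
    # sqrt((2*eps + ln(6*m_H(2n)/delta)) / n)
    log_term = math.log(6.0 / delta) + log_growth(2 * n, d_vc)
    return math.sqrt((2.0 * eps + log_term) / n)

def largest_feasible_eps(n, eps_max=20.0, steps=200_000):
    # Scan an evenly spaced grid on (0, eps_max] and keep the largest eps
    # for which the inequality eps <= rhs_c(eps, n) still holds.
    best = None
    for i in range(1, steps + 1):
        eps = eps_max * i / steps
        if eps <= rhs_c(eps, n):
            best = eps
    return best

# Small-N case; the result is accurate to one grid step (eps_max / steps).
print(largest_feasible_eps(5))
```

The largest grid value that still satisfies the inequality approximates the bound on \epsilon, to within one grid step.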

elkka 05-02-2012 04:34 AM

Re: HW 4 question3
I first thought the same thing about 1. But then, where do we see \varepsilon? It is the measure of the difference between E_{in} and E_{out}, which can be small or big depending on the experiment. Suppose you are talking about an experiment with very large numbers, like the number of minutes people use in a month on a cell phone (which, say, averages 200). Then it is totally meaningful to consider a prediction that assures you that |E_{in} - E_{out}| < 2 (or 5, or 10) with probability 0.95. So it totally makes sense to rate the bounds even if they are all > 1.

elkka 05-02-2012 04:38 AM

Re: HW 4 question3
rohanag, the recursive bounds can easily be solved for \varepsilon, as 1. they are, essentially, quadratic equations, and 2. only one root is of interest, as \varepsilon>0. By solving the equations you get

(c) \varepsilon = \frac{1}{N}+\sqrt{\frac{1}{N^2}+\frac{1}{N}\ln{\frac{6m_\mathcal{H}(2N)}{\delta}}};

(d) \varepsilon = \frac{1}{N-2}+\sqrt{\frac{1}{(N-2)^2}+\frac{1}{2(N-2)}\ln{\frac{4m_\mathcal{H}(N^2)}{\delta}}}.
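
The two closed forms above can be evaluated directly. The sketch below is only an illustration under assumptions that do not come from this thread: the growth function is approximated by m_\mathcal{H}(N) = N^{d_vc} with d_vc = 50, delta = 0.05, and the function names are hypothetical. The logarithm of the growth function is computed directly, since N^{2 d_vc} overflows a float for large N.

```python
import math

def log_growth(n, d_vc=50):
    # Assumption: m_H(n) is approximated by n**d_vc; return ln(m_H(n)).
    return d_vc * math.log(n)

def eps_c(n, delta=0.05, d_vc=50):
    # Closed form (c): positive root of n*eps^2 - 2*eps - L = 0,
    # where L = ln(6*m_H(2n)/delta).
    L = math.log(6.0 / delta) + log_growth(2 * n, d_vc)
    return 1.0 / n + math.sqrt(1.0 / n**2 + L / n)

def eps_d(n, delta=0.05, d_vc=50):
    # Closed form (d): positive root of 2*(n-2)*eps^2 - 4*eps - L = 0,
    # where L = ln(4*m_H(n^2)/delta). The formula divides by n - 2.
    if n <= 2:
        raise ValueError("closed form (d) is only defined for N > 2")
    L = math.log(4.0 / delta) + log_growth(n * n, d_vc)
    m = n - 2
    return 1.0 / m + math.sqrt(1.0 / m**2 + L / (2.0 * m))

# Compare the two bounds for one small and one large sample size.
for n in (5, 10000):
    print(n, eps_c(n), eps_d(n))
```

Under these assumptions, (c) comes out smaller than (d) at N = 5 and larger at N = 10000, and both exceed 1 at N = 5. The closed form for (d) divides by N-2, so it is undefined at N = 2, which may be related to the odd N = 2 behaviour reported earlier in the thread.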

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.