Discussion of the VC proof
Re: The VC Proof
I tend to be closer to an "experimental" scientist. Most often, I don't want to read the proof, though it is nice to know that it is there.
The level covered in the course was just about right for me. I believe I followed it well enough to do the problems, though. Thanks! |
Re: The VC Proof
In answer to your question: because any classroom has a mixed mathematical background, I would include the formal proof in your classroom presentations. I would also suggest moving the proof out of the appendix and into the text of Chapter 2, since it seems more natural to develop the VC bound formally within the lecture. I never like God-sent formulas; I like to understand how these ideas develop naturally and how mathematicians turn them into formulas without any gap. A deferred proof breaks this sequence and forces us to remember what was referring to what. If you disagree, then please expand the appendix so that the derivations are given step by step, with explanations alongside the formulas. The current format of the proof requires students to reconstruct this sequence themselves.
I want to know the natural historical sequence in which these ideas developed. Maybe I should consult the VC book, Statistical Learning Theory. But I need the help of an instructor like you, since reading a 750-page book is infeasible, especially if you are eager to write your first ML applications within the next 3 months. For human beings, English is always better than notation, but notation is unavoidable if we want to avoid ambiguity, so a combination of both is the ideal format. I see that you are trying to achieve this ideal, which is good.
As I write this message, I have watched the 6th video, read the book up to Chapter 2, Section 2.3, and read the two pages of the proof in the appendix. So far, two things have bothered me: how Hoeffding arrived at his seemingly God-sent inequality, and how I can follow the long derivation of the VC bound, including the mathematical technicalities you mention in the book and videos. I am still working on the second one, and I am trying to accept Hoeffding's inequality as given (which nags at me like a bug in my brain and slows down my learning).
I know you want to show us the forest instead of dealing with the leaves of the trees, but you could do this as you did on pages 46-49 of the book, using "safe skip" blocks. A last word: this is the first ML course in which I could see the whole picture, the forest. As such, it is the first successful attempt that I have experienced. Thank you all for your effort. |
Re: The VC Proof
I am a machine learning practitioner currently applying machine learning to algorithmic trading, and I am also highly interested in the theoretical foundations of the field.
I have read your book "Learning from Data" from cover to cover, although I haven't solved the problems. I did, however, go through the proof of the VC bound in the appendix. I managed to understand most of it, except (A.4) at the bottom of page 189. I can see that you have applied the Hoeffding Inequality to h*, but your explanation of how it applies to h* conditioned on the sup_H event is hard for me to grasp. Could you please explain in more detail why (A.4) holds using Hoeffding, or give a reference that helps clarify this result? Thanks. |
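For readers following along, here is a minimal simulation sketch of the unconditioned statement that (A.4) builds on: for one fixed hypothesis evaluated on a fresh sample of size N, the in-sample error deviates from the out-of-sample error by more than eps/2 with probability at most 2*e^(-0.5*N*eps^2). It does not model the conditioning on the sup event that the question asks about. The values of `N`, `EPS`, and `MU` (a stand-in for E_out of the fixed hypothesis) are hypothetical choices made for illustration.

```python
import math
import random


def hoeffding_bound(tolerance, n):
    """Two-sided Hoeffding bound on P[|sample mean - true mean| > tolerance]."""
    return 2.0 * math.exp(-2.0 * tolerance * tolerance * n)


def empirical_deviation_rate(mu, tolerance, n, trials, seed=0):
    """Estimate how often the sample error rate misses mu by more than tolerance."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        # Each point is misclassified independently with probability mu.
        errors = sum(1 for _ in range(n) if rng.random() < mu)
        if abs(errors / n - mu) > tolerance:
            bad += 1
    return bad / trials


N = 100    # hypothetical sample size
EPS = 0.2  # hypothetical epsilon
MU = 0.3   # hypothetical out-of-sample error of the fixed hypothesis

# With tolerance eps/2 the bound becomes 2 * exp(-0.5 * eps^2 * N).
bound = hoeffding_bound(EPS / 2, N)
rate = empirical_deviation_rate(MU, EPS / 2, N, trials=2000)
print(f"Hoeffding bound: {bound:.4f}, simulated frequency: {rate:.4f}")
```

The simulated frequency should sit well below the bound, which is expected since Hoeffding is a worst-case inequality over all distributions on [0, 1].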
Re: Discussion of the VC proof
Hi Prof. Yaser. I have a problem with the proof of Lemma A.2 on page 190. I don't understand what this part means:
Quote:
|
Re: Discussion of the VC proof
I think this is so much clearer than when it is put in sentences. Thank you.
|
Re: Discussion of the VC proof
I was going through the proof in Appendix A and I just want to make sure that something written towards the bottom of pg. 189 is a typo.
Namely, [formula missing] should read [formula missing]. I'm 99.999999999999% sure this is indeed a typo, since the latter case follows easily from the reverse triangle inequality and suffices to show the inequality in (A.3), whereas I cannot see how one would arrive at the implication in the former case, nor how the former case would imply inequality (A.3). Still, it would ease my mind to get confirmation that it is a typo. Thank you in advance! |
Re: Discussion of the VC proof
Yes, this is a typo. Thank you for pointing it out. You have it correct.
If A and C imply B, then $P[B] \ge P[A \text{ and } C]$.
|
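As a sketch of the reverse-triangle-inequality step mentioned in this exchange, assume the two events in question are $|E_{\text{in}}(h) - E_{\text{out}}(h)| > \epsilon$ and $|E'_{\text{in}}(h) - E_{\text{out}}(h)| \le \frac{\epsilon}{2}$ (this reading of the corrected statement is an assumption, since the original formulas are not shown here). Then:

```latex
\begin{align*}
\bigl|E_{\text{in}}(h) - E'_{\text{in}}(h)\bigr|
  &= \bigl|\,(E_{\text{in}}(h) - E_{\text{out}}(h)) - (E'_{\text{in}}(h) - E_{\text{out}}(h))\,\bigr| \\
  &\ge \bigl|E_{\text{in}}(h) - E_{\text{out}}(h)\bigr| - \bigl|E'_{\text{in}}(h) - E_{\text{out}}(h)\bigr| \\
  &> \epsilon - \frac{\epsilon}{2} = \frac{\epsilon}{2}.
\end{align*}
```

So the two assumed events together imply $|E_{\text{in}}(h) - E'_{\text{in}}(h)| > \frac{\epsilon}{2}$, which is the "A and C imply B" pattern referred to above.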
Re: Discussion of the VC proof
I recently started reading the proof of the VC inequality in the appendix. At the bottom of page 190 (Lemma A.3), why does
sigma_S P[S] x P[sup_h E_in - E_in' > ... |S ] <= sup_S P[sup_h E_in - E_in' >...|S ] hold? (Sorry for the terrible notation; I don't know how to input math symbols.) What does it mean to take the supremum over S? |
Re: Discussion of the VC proof
Quote:
So this inequality simply says an expected value (of P[sup_...]) is less than or equal to the maximum value. Hope this helps. |
Re: Discussion of the VC proof
Hi,
I have a question about the sentence on page 190 that assumes e^(-0.5*N*eps^2) < 1/4.
If I set this term to 1/4, I get -2*ln(1/4) as the value of N*eps^2. Substituting that value into Theorem A.1 gives an RHS of (assuming the growth function is just 1) 4*0.707..., which is much more than 1. An RHS of 1 would already be enough to say that the bound in Theorem A.1 is trivially true, and that corresponds to assuming the term is less than 1/256. In that case 1 - 2*e^(-0.5*N*eps^2) is greater than 0.99..., so instead of the factor 2 in the lemma's result I would get a factor of around 1, which is a much better outcome. So why is the value 1/4 chosen for the assumption? Best regards, André |
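The arithmetic in this question can be checked with a few lines of Python. This is only a sketch: it assumes the right-hand side of Theorem A.1 has the form 4 * m_H(2N) * e^(-N*eps^2/8), takes the growth function to be 1 as in the question, and uses hypothetical helper names.

```python
import math


def n_eps_sq_for_term(term):
    """Solve exp(-0.5 * N * eps^2) = term for the product N * eps^2."""
    return -2.0 * math.log(term)


def theorem_a1_rhs(n_eps_sq, growth=1.0):
    """Assumed form of the bound: 4 * m_H(2N) * exp(-N * eps^2 / 8)."""
    return 4.0 * growth * math.exp(-n_eps_sq / 8.0)


def lemma_denominator(n_eps_sq):
    """The factor 1 - 2 * exp(-0.5 * N * eps^2) discussed in the question."""
    return 1.0 - 2.0 * math.exp(-0.5 * n_eps_sq)


for term in (1 / 4, 1 / 256):
    x = n_eps_sq_for_term(term)
    print(
        f"term={term:.6f}  N*eps^2={x:.4f}  "
        f"RHS={theorem_a1_rhs(x):.4f}  1-2*term={lemma_denominator(x):.4f}"
    )
```

With the term at 1/4 the assumed RHS is about 2.83 and the denominator is 1/2 (hence the factor 2); with the term at 1/256 the RHS is 1 and the denominator is about 0.992.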
Re: Discussion of the VC proof
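Suppose $e^{-\frac{1}{2}\epsilon^2 N} \ge \frac{1}{4}$.
Then, $e^{-\frac{1}{8}\epsilon^2 N} \ge \left(\frac{1}{4}\right)^{1/4} \approx 0.707$. In which case $4\, m_{\mathcal{H}}(2N)\, e^{-\frac{1}{8}\epsilon^2 N} \ge 4 \times 0.707 > 1$.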
|
Re: Discussion of the VC proof
I also have another question about the same page (190):
At the end of the page there is the formula: https://latex.codecogs.com/gif.latex...pace;S&space;] I don't understand why the RHS is greater than or equal to the LHS. The only justification I can see is that the distribution P[S] is uniform, but this is not stated in the text. Or am I overlooking something, and this holds for every kind of distribution? |
Re: Discussion of the VC proof
Quote:
Thanks for the answer. I understand this argument; however, it also holds for other thresholds in place of 1/4. Thus the value 1/4 seems somewhat magic to me. Edit: I think you chose 1/4 because it makes it so easy to see that the RHS of Theorem A.1 reaches 1. Nevertheless, a different value would give a different outcome in the final formula. |
Re: Discussion of the VC proof
Quote:
When P[S] is uniform, the product-sum $\sum_S P[S]\,P[A|S]$ is simply the average of all P[A|S], since P[S] is the same for every S. And an average is of course less than or equal to the maximum of P[A|S]. If P[S] is not uniform we still have an average, but a weighted one. Here too the result is always less than or equal to the maximum, because no weights that sum to 1 can produce a result higher than the maximum P[A|S].
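The weighted-average argument above can be sanity-checked numerically. The sketch below draws arbitrary (non-uniform) distributions P[S] and arbitrary values P[A|S], then confirms that sum_S P[S] * P[A|S] never exceeds max_S P[A|S] beyond floating-point rounding. The function names and the random setup are hypothetical and purely illustrative.

```python
import random


def weighted_average(probs_s, probs_a_given_s):
    """Return sum_S P[S] * P[A|S] for two lists of matching length."""
    return sum(p_s * p_a for p_s, p_a in zip(probs_s, probs_a_given_s))


def random_distribution(k, rng):
    """Draw k strictly positive weights and normalise them to sum to 1."""
    raw = [1.0 - rng.random() for _ in range(k)]  # each value lies in (0, 1]
    total = sum(raw)
    return [r / total for r in raw]


rng = random.Random(0)
worst_gap = float("-inf")
for _ in range(1000):
    k = rng.randint(1, 10)
    p_s = random_distribution(k, rng)               # generally not uniform
    p_a_given_s = [rng.random() for _ in range(k)]  # arbitrary P[A|S]
    gap = weighted_average(p_s, p_a_given_s) - max(p_a_given_s)
    worst_gap = max(worst_gap, gap)

# A positive value here could only come from floating-point rounding.
print(f"largest (weighted average - maximum) seen: {worst_gap:.3e}")
```

Equality is reached only when all the weight sits on values of S that attain the maximum, for example when there is a single S.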
Re: Discussion of the VC proof
Correct.
Quote:
|
Re: Discussion of the VC proof
You are correct. We could have made other assumptions.
But there is a special reason why we DO want 1. We are bounding a probability, so there is nothing to prove if we claim that a probability is less than or equal to 1. Whenever the RHS (i.e. the bound) evaluates to 1 or more, the claim holds automatically, so we only need to consider the case where the bound evaluates to less than 1.
|
Re: Discussion of the VC proof
Thank you very much for your answers. They help me a lot in understanding the purpose of the individual steps of this proof.
|
Re: Discussion of the VC proof
I had the same question, but thanks to your website it has just been answered. Thanks.
|
Re: Discussion of the VC proof
Thank you for your answer. I like it.
|
Re: Discussion of the VC proof
At the beginning everything seemed complicated; at first I gave it some thought but could take nothing in. By the way, it turned out to be pretty easy. Thank you so much.
|
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.