 LFD Book Forum Discussion of the VC proof #11 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477 Re: Discussion of the VC proof

Quote:
 Originally Posted by jokovc Hi Prof. Yaser. I have a problem with the proof of Lemma A.2., page 190. I don't understand what this part means
Here is what it means. Let's say that you have P[A | S] >= c for every S. Then, regardless of what the probability distribution of S is, it will be true that P[A] >= c, since we can multiply both sides of P[A | S] >= c by P[S] and integrate S out.
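The step can be checked numerically. A minimal sketch, with hypothetical values for P[S] and P[A | S] and writing c for the lower bound:

```python
import random

# Sketch of the argument with hypothetical numbers: if P[A | S] >= c for
# every S, then P[A] = sum_S P[S] * P[A | S] >= c * sum_S P[S] = c,
# no matter what the distribution P[S] is.
random.seed(0)
c = 0.6
raw = [random.random() for _ in range(5)]
total = sum(raw)
p_S = [w / total for w in raw]                              # some distribution P[S]
p_A_given_S = [c + (1 - c) * random.random() for _ in p_S]  # each >= c by construction

p_A = sum(ps * pa for ps, pa in zip(p_S, p_A_given_S))      # integrate S out
assert p_A >= c
print(f"P[A] = {p_A:.3f} >= c = {c}")
```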
__________________
Where everyone thinks alike, no one thinks very much
#12
 jokovc Junior Member Join Date: Nov 2014 Posts: 2 Re: Discussion of the VC proof

Quote:
 Originally Posted by yaser Here is what it means. Let's say that you have P[A | S] >= c for every S. Then, regardless of what the probability distribution of S is, it will be true that P[A] >= c, since we can multiply both sides of P[A | S] >= c by P[S] and integrate S out.
I think this is so much clearer than when it is put in sentences. Thank you.
#13
 ilson Member Join Date: Sep 2015 Posts: 10 Re: Discussion of the VC proof

I was going through the proof in Appendix A and I just want to make sure that something written towards the bottom of pg. 189 is a typo.

Namely,

Quote:
 Inequality (A.3) follows because the events "|E_in(h*) - E_in'(h*)| > eps/2" and "|E_in(h*) - E_out(h*)| > eps" (which is given) imply "|E_in'(h*) - E_out(h*)| <= eps/2".

should read

Quote:
 Inequality (A.3) follows because the events "|E_in'(h*) - E_out(h*)| <= eps/2" and "|E_in(h*) - E_out(h*)| > eps" (which is given) imply "|E_in(h*) - E_in'(h*)| > eps/2".
I'm 99.999999999999% sure this is indeed a typo: the latter case follows easily from the reverse triangle inequality and suffices to show inequality (A.3), while I cannot see how one can arrive at the implication in the former case, nor how the former case would imply inequality (A.3). Still, it would ease my mind to get a verification that it is a typo. Thank you in advance!
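For what it's worth, the "latter case" is exactly a reverse-triangle-inequality step. Assuming |E_in'(h*) - E_out(h*)| <= eps/2 and |E_in(h*) - E_out(h*)| > eps:

```latex
\left|E_{\text{in}}(h^*) - E_{\text{in}}'(h^*)\right|
\;\ge\; \left|E_{\text{in}}(h^*) - E_{\text{out}}(h^*)\right|
      - \left|E_{\text{in}}'(h^*) - E_{\text{out}}(h^*)\right|
\;>\; \epsilon - \frac{\epsilon}{2}
\;=\; \frac{\epsilon}{2}.
```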
#14 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595 Re: Discussion of the VC proof

Yes, this is a typo. Thank you for pointing it out. You have it correct.

If A and C imply B, and C is given, then P[B] >= P[A], which is how inequality (A.3) follows. Quote:
 Originally Posted by ilson I was going through the proof in Appendix A and I just want to make sure that something written towards the bottom of pg. 189 is a typo. [...] I'm 99.999999999999% sure this is indeed a typo, since the latter case follows easily from the reverse triangle inequality and suffices to show inequality (A.3), and I cannot see how one can arrive at the implication in the former case, nor how the former case implies inequality (A.3). It would ease my mind if I can get a verification that it is a typo. Thank you in advance!
__________________
Have faith in probability
#15
 CharlesL Junior Member Join Date: Sep 2016 Location: Vancouver Posts: 1 Re: Discussion of the VC proof

Recently I have started reading the proof of the VC inequality in the appendix. On the bottom of page 190 (Lemma A.3), why does

sum_S P[S] * P[ sup_h |E_in(h) - E_in'(h)| > ... | S ] <= sup_S P[ sup_h |E_in(h) - E_in'(h)| > ... | S ] ?

(Sorry for the terrible notation; I don't know how to input math symbols.)

What does it mean to take the supremum over S?
#16 htlin NTU Join Date: Aug 2009 Location: Taipei, Taiwan Posts: 601 Re: Discussion of the VC proof

Quote:
 Originally Posted by CharlesL Recently I have started reading the proof of the VC inequality in the appendix. On the bottom of page 190 (Lemma A.3), why does sum_S P[S] * P[sup_h |E_in - E_in'| > ... | S] <= sup_S P[sup_h |E_in - E_in'| > ... | S]? (Sorry for the terrible notation; I don't know how to input math symbols.) What does it mean to take the supremum over S?
All complicated math aside, the supremum over S carries the physical meaning of taking the "maximum" value over all possible S.

So this inequality simply says that an expected value (of P[sup_...]) is less than or equal to the maximum value.
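You can check this average-vs-max statement numerically; here f(S) is a hypothetical stand-in for the conditional probability P[sup_h |E_in - E_in'| > eps/2 | S]:

```python
import random

# The expected value of f(S) under any distribution P[S] never exceeds
# the maximum (supremum) of f over S.
random.seed(0)
raw = [random.random() for _ in range(10)]
total = sum(raw)
p = [w / total for w in raw]            # P[S], sums to 1
f = [random.random() for _ in raw]      # conditional probabilities, in [0, 1]

average = sum(pi * fi for pi, fi in zip(p, f))   # sum_S P[S] * f(S)
maximum = max(f)                                 # sup_S f(S)
assert average <= maximum
print(f"average = {average:.3f} <= max = {maximum:.3f}")
```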

Hope this helps.
__________________
When one teaches, two learn.
#17
 tayfun29 Junior Member Join Date: Oct 2016 Posts: 1 Re: Discussion of the VC proof

Quote:
 Originally Posted by htlin All complicated math aside, supremum on S carries the physical meaning of taking the "maximum" value over all possible S. So this inequality simply says an expected value (of P[sup_...]) is less than or equal to the maximum value. Hope this helps.
Thank...
#18
 gamelover623 Junior Member Join Date: Sep 2016 Posts: 1 Re: Discussion of the VC proof

I am a machine learning practitioner currently applying machine learning to algorithmic trading, yet I am highly interested in the theoretical grounds of the field.

I have read your book "Learning From Data" from cover to cover, though I haven't solved the problems. I did, however, go through the proof of the VC bound in the appendix, and I managed to understand most of it except (A.4) at the bottom of page 189. I understand that you have applied the Hoeffding inequality to h*, but your explanation of how this applies to h* conditioned on the sup_H event is hard for me to grasp.

Can you please explain in more detail why (A.4) holds when using Hoeffding? Or give a reference that helps clarify this result?
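For reference, the unconditional Hoeffding statement that (A.4) invokes — for a single fixed hypothesis — can be made concrete with a quick simulation (hypothetical error rate and sample size; this does not address the conditioning subtlety):

```python
import math
import random

# Hoeffding for one fixed hypothesis: the in-sample error E_in on N i.i.d.
# points deviates from the true error E_out by more than eps with
# probability at most 2*exp(-2*eps^2*N). Model the fixed hypothesis as a
# coin that errs with (hypothetical) probability E_out.
random.seed(1)
E_out, N, eps, trials = 0.3, 200, 0.1, 10_000

deviations = 0
for _ in range(trials):
    E_in = sum(random.random() < E_out for _ in range(N)) / N
    if abs(E_in - E_out) > eps:
        deviations += 1

empirical = deviations / trials
bound = 2 * math.exp(-2 * eps**2 * N)
assert empirical <= bound
print(f"empirical = {empirical:.4f} <= Hoeffding bound = {bound:.4f}")
```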

#19
 CountVonCount Member Join Date: Oct 2016 Posts: 17 Re: Discussion of the VC proof

Hi,

I have a question about the sentence on page 190:
Quote:
 Note that we can assume e^(-0.5*N*eps^2) < 1/4, because otherwise the bound in Theorem A.1 is trivially true.
While I understand the argument here, I don't understand why the value is specifically 1/4.
Setting the above term equal to 1/4 gives N*eps^2 = -2*ln(1/4) = 2*ln(4).
Plugging that value of N*eps^2 into Theorem A.1 (and taking the growth function to be just 1), the RHS becomes 4*0.707... ≈ 2.83, which is much more than 1.

A value of 1 on the RHS would already be sufficient to make the bound in Theorem A.1 trivially true, and that would only require the above term to be less than 1/256.
With that in mind, 1 - 2*e^(-0.5*N*eps^2) would be greater than 0.99, and thus instead of a 2 in the lemma's outcome I would get a value around 1, which is a much better constant.

So why is the value 1/4 chosen for the assumption?

Best regards,
André
#20 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595 Re: Discussion of the VC proof

Suppose e^(-0.5*N*eps^2) >= 1/4. Then e^(-N*eps^2/8) >= (1/4)^(1/4) = 1/sqrt(2).

In which case the RHS of Theorem A.1 satisfies 4*m_H(2N)*e^(-N*eps^2/8) >= 4/sqrt(2) = 2*sqrt(2) > 1 (since m_H(2N) >= 1), and the bound in Theorem A.1 is trivial.
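The arithmetic can be checked directly (a sketch, taking the worst case m_H(2N) = 1):

```python
import math

# If exp(-0.5 * N * eps^2) >= 1/4, then with x = N * eps^2 we have
# x <= 2 * ln(4), so exp(-x / 8) >= (1/4)**(1/4) = 1/sqrt(2), and the RHS
# of Theorem A.1, 4 * m_H(2N) * exp(-x / 8), is at least 2*sqrt(2) > 1.
for x in [0.0, 0.5, 1.0, 2 * math.log(4)]:   # all satisfy exp(-x/2) >= 1/4
    assert math.exp(-0.5 * x) >= 0.25 - 1e-12
    rhs = 4 * math.exp(-x / 8)               # worst case: m_H(2N) = 1
    assert rhs > 1
print("bound is trivial whenever exp(-N*eps^2/2) >= 1/4")
```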

Quote:
 Originally Posted by CountVonCount Hi, I have a question about the sentence on page 190: While I understand the argument here, I don't understand why the value is specifically 1/4. Setting the above term equal to 1/4 gives N*eps^2 = -2*ln(1/4). Plugging that into Theorem A.1 (with the growth function taken as 1), the RHS becomes 4*0.707... ≈ 2.83, much more than 1. A value of 1 on the RHS would already make the bound trivially true, which would only require the above term to be less than 1/256. With that in mind, 1 - 2*e^(-0.5*N*eps^2) would be greater than 0.99, and instead of a 2 in the lemma's outcome I would get a value around 1, a much better constant. So why is the value 1/4 chosen for the assumption? Best regards, André
__________________
Have faith in probability

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.