LFD Book Forum Q10 higher bound
 User Name Remember Me? Password
 FAQ Calendar Mark Forums Read

 Thread Tools Display Modes
#21
04-30-2013, 02:21 PM
 Elroch Invited Guest Join Date: Mar 2013 Posts: 143
Re: Q10 higher bound

Quote:
 Originally Posted by jforbes Wouldn't it then be the case that the union of the hypothesis sets could only shatter two points? I'm still curious about what went wrong if H1 does in fact include --+, -+-, ... Is this sort of hypothesis set disallowed for some reason?
There are no rules constraining the specific hypotheses in a hypothesis set. It can be any subset of the power set of the underlying set of points.
#22
04-30-2013, 06:31 PM
 jforbes Member Join Date: Apr 2013 Posts: 12
Re: Q10 higher bound

Quote:
 Originally Posted by Elroch There are no rules constraining the specific hypotheses in a hypothesis set. It can be any subset of the power set of the underlying set of points.
OK, so if we assume that H is such that it "takes any 1 point and sets it to + or -, with the other points--if any--getting the opposite sign" as proposed by nkatz, then:
for N=1, H={+,-} and shatters this 1 point
for N=2, H={+-,-+} and does not shatter these two points.
for N=3, H={++-, +-+, -++, --+, -+-, +--} and does shatter two points.

What is the VC dimension of this hypothesis set? From the N=2 case, I might conclude that it's 1 (it fails to shatter two points), but from the N=3 case I might conclude it's 2 (it shatters 2 of the points within this set of points). Where am I going wrong?
#23
04-30-2013, 08:56 PM
 marek Member Join Date: Apr 2013 Posts: 31
Re: Q10 higher bound

Since the homework set was now due, I was wondering if the course staff could weigh in on this question. Did our discussion adequately cover this question or were there any nuances that we missed?
#24
04-30-2013, 11:04 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: Q10 higher bound

Quote:
 Originally Posted by marek Since the homework set was now due, I was wondering if the course staff could weigh in on this question. Did our discussion adequately cover this question or were there any nuances that we missed?
The discussion has been great. I won't talk about the answer which you all know by now as I want to keep this thread rather than archive it with *ANSWER* threads.
__________________
Where everyone thinks alike, no one thinks very much
#25
05-01-2013, 05:41 AM
 Elroch Invited Guest Join Date: Mar 2013 Posts: 143
Re: Q10 higher bound

Quote:
 Originally Posted by jforbes OK, so if we assume that H is such that it "takes any 1 point and sets it to + or -, with the other points--if any--getting the opposite sign" as proposed by nkatz, then: for N=1, H={+,-} and shatters this 1 point for N=2, H={+-,-+} and does not shatter these two points. for N=3, H={++-, +-+, -++, --+, -+-, +--} and does shatter two points. What is the VC dimension of this hypothesis set? From the N=2 case, I might conclude that it's 1 (it fails to shatter two points), but from the N=3 case I might conclude it's 2 (it shatters 2 of the points within this set of points). Where am I going wrong?
Unquestionably, the VC-dimensions of the first two hypothesis sets are 1, and that of the third hypothesis set is 2, by the definition.
#26
05-01-2013, 09:24 AM
 Elroch Invited Guest Join Date: Mar 2013 Posts: 143
Re: Q10 higher bound

There is a proviso to my glib claim above that absolutely any subset can be a hypothesis. While this is true when the underlying set is finite or countable, when it is uncountably infinite I believe it only makes sense to have hypotheses which are measurable sets, according to the probability distribution associated with samples.

[Almost anything that makes any sense is likely to satisfy this, but it is not difficult to construct pathological examples even when the sample space is as simple as the unit interval with the uniform distribution].
#27
05-01-2013, 05:01 PM
 OlivierB Member Join Date: Apr 2013 Location: Paris Posts: 16
Re: Q10 higher bound

Quote:
 Originally Posted by marek I am still confused on how to piece everything together. I like your much cleaner version, but I do have one comment. Having disjoint hypothesis sets does not necessarily mean that the set of dichotomies they create will also be disjoint. For example, let H1 and H2 be the positive and negative 1d rays, respectively. These two hypothesis sets are disjoint. However, given any data set they can both create the dichotomies of all +1 or all -1. They won't have much overlap beyond that (and maybe THAT is the point), but they won't be entirely disjoint as far as our inequalities are concerned.
Quote:
 Originally Posted by TTHotShot Great work everyone - this question had me scratching my head until I found this post. The one thing I don't understand is the above statment. Am I missing something? It seems like it should be an inequality.
@ marek, TTHotShot: Well, I agree with your remark, I was too quick: I have not explored the link between a hypothesis set and the dichotomies it generates on a set of points well enough. As a consequence the assertion that in general is wrong in general. Instead, it should be an inequality (≤), and probably quite a loose one, since there can be an overlap in the dichotomies generated by and . Additionally, the definition of involves the determination of a maximum, and the independent search of 2 max on the RHS is a lot less constraining than one max on the LHS.
#28
05-01-2013, 05:35 PM
 marek Member Join Date: Apr 2013 Posts: 31
Re: Q10 higher bound

Quote:
 Originally Posted by OlivierB @ marek, TTHotShot: Well, I agree with your remark, I was too quick: I have not explored the link between a hypothesis set and the dichotomies it generates on a set of points well enough. As a consequence the assertion that in general is wrong in general. Instead, it should be an inequality (≤), and probably quite a loose one, since there can be an overlap in the dichotomies generated by and . Additionally, the definition of involves the determination of a maximum, and the independent search of 2 max on the RHS is a lot less constraining than one max on the LHS.
I agree that the inequality here feels rather loose, as that is what troubled me initially (since it feels like we can do better). However once we were able to generate examples of hypothesis sets that did result in disjoint dichotomies, and since those examples could readily be extended to broad families of such examples, that seals the deal. We are proving a general case argument. So even though many times we should expect to be well under the bound, those special examples that reach equality prevent us from making the bound any tighter.
#29
03-04-2016, 07:15 AM
 khohi Member Join Date: Dec 2015 Posts: 10
Re: Q10 higher bound

Great job

علاج تساقط الشعر

 Thread Tools Display Modes Linear Mode

 Posting Rules You may not post new threads You may not post replies You may not post attachments You may not edit your posts BB code is On Smilies are On [IMG] code is On HTML code is Off Forum Rules
 Forum Jump User Control Panel Private Messages Subscriptions Who's Online Search Forums Forums Home General     General Discussion of Machine Learning     Free Additional Material         Dynamic e-Chapters         Dynamic e-Appendices Course Discussions     Online LFD course         General comments on the course         Homework 1         Homework 2         Homework 3         Homework 4         Homework 5         Homework 6         Homework 7         Homework 8         The Final         Create New Homework Problems Book Feedback - Learning From Data     General comments on the book     Chapter 1 - The Learning Problem     Chapter 2 - Training versus Testing     Chapter 3 - The Linear Model     Chapter 4 - Overfitting     Chapter 5 - Three Learning Principles     e-Chapter 6 - Similarity Based Methods     e-Chapter 7 - Neural Networks     e-Chapter 8 - Support Vector Machines     e-Chapter 9 - Learning Aides     Appendix and Notation     e-Appendices

All times are GMT -7. The time now is 11:14 AM.

 Contact Us - LFD Book - Top