LFD Book Forum

LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 4 (http://book.caltech.edu/bookforum/forumdisplay.php?f=133)
-   -   Q10 higher bound (http://book.caltech.edu/bookforum/showthread.php?t=4241)

OlivierB 04-24-2013 02:07 PM

Q10 higher bound
 
I think I have found the tighter lower bound.
But I cannot determine the tighter higher bound, among the two candidates.

From the tighter lower bound that I have found and the information in the question, it seems possible to determine the tighter higher bound.

But I would like to either properly admit or dismiss the K-1 factor.
Can anybody more advanced on the subject propose an indication, a clue, a line of reasoning ? (NOT the answer)

Thx

Michael Reach 04-24-2013 06:20 PM

Re: Q10 higher bound
 
One helpful step is to check the problems in the book. :)

yaser 04-24-2013 07:03 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by OlivierB (Post 10583)
Can anybody more advanced on the subject propose an indication, a clue, a line of reasoning?

Let me just comment that this problem is probably the hardest in the entire course. A discussion from the participants is most welcome.

alasdairj 04-27-2013 03:35 PM

Re: Q10 higher bound
 
In 2d, a positive "2d ray" in the x direction can shatter 1 point, and a positive "2d ray" in the y direction can shatter 1 point, whereas their union can shatter 2 points...

So at least I can feel comfortable that the union of hypothesis sets *can* achieve a VC dimension which is the sum of the VC dimensions of its parts, but can it exceed it??

marek 04-28-2013 12:12 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by alasdairj (Post 10614)
In 2d, a positive "2d ray" in the x direction can shatter 1 point, and a positive "2d ray" in the y direction can shatter 1 point, whereas their union can shatter 2 points...

So at least I can feel comfortable that the union of hypothesis sets *can* achieve a VC dimension which is the sum of the VC dimensions of its parts, but can it exceed it??

I am struggling with this question as well. Based on this discussion, I can intuit what the right answer should be, but I am struggling with the justification.

I figure if I can simply come up with an H1 and H2 such that d(H1 U H2) = d(H1) + d(H2) + 1, I would be done. I can come up 1 short of that (much like your example, even consider 1d positive and negative rays). But I cannot think of an example that would hit the higher bound.

That being said, I decided to just play around with the growth function. We know that m_H(N) \leq \sum_{i=0}^{d} {N \choose i}. If all the terms in that expansion are present, we get 2^N and hence can shatter. The binomial expansion is symmetric, so if I had two growth functions that were "disjoint" one could get the terms from the bottom and the other could get terms from the top and these could together give the full expansion.

More formally:\sum_{i=0}^{d_1} {N \choose i} + \sum_{i=0}^{d_2} {N \choose i} = \sum_{i=0}^{d_1} {N \choose i} + \sum_{i=N-d_2}^{N} {N \choose i}. This gives the full expansion if the second sum starts off one term after the first sum ends, specifically N-d_2 \leq d_1 + 1 \Leftrightarrow N \leq d_1 + d_2 + 1.

This simple analysis gives me my desired d_1 + d_2 + 1. But clearly something is missing. How can growth functions be "disjoint"? Any example I come up with, there is a significant overlap in dichotomies. I feel like I'm definitely on the right track here, but I also feel like something is clearly missing. Maybe B(N,k) is a better starting point than the growth function?

Hopefully someone can pick up the ball and carry it over the finish line...

OlivierB 04-28-2013 03:20 PM

Re: Q10 higher bound
 
@marek, Thanks for your contribution.
Let me try and rephrase, not adding much.

For a hypothesis set H with VD dimension d, we have
m_{H}(N)\le B(N, d+1)=\sum_{i=0}^d \left(\begin{array}{c}N\\ i\end{array}\right)

The second equality is the result of 2 inequalities in opposite direction. In class, we saw '≤' and '≥' is the subject of problem 2.4 in the book.

Let 2 hypothesis sets H_{1} and H_{2} with VC dimensions respectively d_{1} and d_{2}.

We can construct two disjoint H'_{1} and H'_{2} putting the intersection of both with one of them only.
H'_{1}=H_{1}
H'_{2}=H_{2}-\bigcap(H_{1}, H_{2})

The VC dimension of H'_{1} is d_{1}
The VC dimension of H'_{2} is d'_{2}\le d_{2}

Let H be the union of these hypothesis sets:
H=\bigcup(H_{1}, H_{2})=\bigcup(H'_{1}, H'_{2})
Let N=d_{1}+d_{2}+1

Now in order to try and determine the VC dimension of H, let us compute a higher bound of m_{H}(N):

m_{H}(N)=m_{H'_1}(N)+m_{H'_2}(N)\le\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=0}^{d'_2}\left(\begin{array}{c}N\\ i\end{array}\right)\le\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=0}^{d_2}\left(\begin{array}{c}N\\ i\end{array}\right)

which we can simplify:

\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=0}^{d_2}\left(\begin{array}{c}N\\ i\end{array}\right)=\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=N-d_2}^{N}\left(\begin{array}{c}N\\ i\end{array}\right)=\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=d_1+1}^{N}\left(\begin{array}{c}N\\ i\end{array}\right)

=2^N

For m_{H}(N)=2^N to be true i.e. for H to shatter N, all inequalities must be equalities.
In other words, we must have:
1/ The growth functions of H_{1} and H_{2} must be exactly B(N, d_1+1) and B(N, d_2+1)
2/ Removing the intersection of H_{1} and H_{2} from H_{2} does not decrease the VC dimension of H_{2}. If the intersection is empty then this condition holds.

If these 2 requirements are not contradictory (which seems plausible but I cannot prove - neither can I visualize with an example), then the VC dimension of H is at least d_1+d_2+1.

Now unless there is a mistake in the reasoning, the question is: Can these requirements be met ? Ideally via an example, because beyond the abstract equations, I would really like to 'visualize' a case where these inequalities are 'saturated'.

marek 04-28-2013 06:08 PM

Re: Q10 higher bound
 
I am still confused on how to piece everything together. I like your much cleaner version, but I do have one comment. Having disjoint hypothesis sets does not necessarily mean that the set of dichotomies they create will also be disjoint.

For example, let H1 and H2 be the positive and negative 1d rays, respectively. These two hypothesis sets are disjoint. However, given any data set they can both create the dichotomies of all +1 or all -1. They won't have much overlap beyond that (and maybe THAT is the point), but they won't be entirely disjoint as far as our inequalities are concerned.

MindExodus 04-28-2013 09:56 PM

Re: Q10 higher bound
 
Your requirement 2 can be simply rendered from Q9 that the VC dimension of intersection of hypothesis could not exceed any one of them(hope I got Q9 right)

As for requirement 1, I come up with the case that
H1 := mapping all points to +1
H2 := mapping all points to -1
thus d_1 = 0 , d_2 = 0 , d_{1+2} = 1 = d_1 + d_2 + 1


Quote:

Originally Posted by OlivierB (Post 10620)
@marek, Thanks for your contribution.
Let me try and rephrase, not adding much.

.......

For m_{H}(N)=2^N to be true i.e. for H to shatter N, all inequalities must be equalities.
In other words, we must have:
1/ The growth functions of H_{1} and H_{2} must be exactly B(N, d_1+1) and B(N, d_2+1)
2/ Removing the intersection of H_{1} and H_{2} from H_{2} does not decrease the VC dimension of H_{2}. If the intersection is empty then this condition holds.

If these 2 requirements are not contradictory (which seems plausible but I cannot prove - neither can I visualize with an example), then the VC dimension of H is at least d_1+d_2+1.

Now unless there is a mistake in the reasoning, the question is: Can these requirements be met ? Ideally via an example, because beyond the abstract equations, I would really like to 'visualize' a case where these inequalities are 'saturated'.


marek 04-28-2013 10:59 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by MindExodus (Post 10631)
As for requirement 1, I come up with the case that
H1 := mapping all points to +1
H2 := mapping all points to -1
thus d_1 = 0 , d_2 = 0 , d_{1+2} = 1 = d_1 + d_2 + 1

Fantastic! I can sleep well tonight =)

jforbes 04-29-2013 01:24 AM

Re: Q10 higher bound
 
Quote:

Originally Posted by MindExodus (Post 10631)
Your requirement 2 can be simply rendered from Q9 that the VC dimension of intersection of hypothesis could not exceed any one of them(hope I got Q9 right)

As for requirement 1, I come up with the case that
H1 := mapping all points to +1
H2 := mapping all points to -1
thus d_1 = 0 , d_2 = 0 , d_{1+2} = 1 = d_1 + d_2 + 1

Bad news: the problem requires that each hypothesis set has a finite positive VC dimension, so this particular example does not guarantee that the maximum is greater than the sum of the VC dimension for the case we're being asked about.

I'm still on the fence for this question - my first attempts at constructing an example similar to this where each hypothesis set had a VC dimension of 1 failed to shatter 3 points, but I'm not 100% sure yet that I can't get it to work.

MindExodus 04-29-2013 04:06 AM

Re: Q10 higher bound
 
sry about didn't notice that requirement
but the key to this problem is the same:

let me just put another example:

just think of H1 and H2 on \{x_1, x_2, x_3\}
H1:
\{x_1 = 1, x_2 = 1, x_3 = 1\}
\{x_1 = 1, x_2 = 1, x_3 = -1\}
\{x_1 = 1, x_2 = -1, x_3 = 1\}
\{x_1 = -1, x_2 = 1, x_3 = 1\}

H2:
\{x_1 = -1, x_2 = -1, x_3 = -1\}
\{x_1 = -1, x_2 = -1, x_3 = 1\}
\{x_1 = -1, x_2 = 1, x_3 = -1\}
\{x_1 = 1, x_2 = -1, x_3 = -1\}

thus:
d_{H1} = 1 , d_{H2} = 1
and \{H_1 \cup H_2\} can shatter all 3 point


d_{H_1 \cup H_2} = 3:)

Quote:

Originally Posted by jforbes (Post 10636)
Bad news: the problem requires that each hypothesis set has a finite positive VC dimension, so this particular example does not guarantee that the maximum is greater than the sum of the VC dimension for the case we're being asked about.

I'm still on the fence for this question - my first attempts at constructing an example similar to this where each hypothesis set had a VC dimension of 1 failed to shatter 3 points, but I'm not 100% sure yet that I can't get it to work.


nkatz 04-29-2013 08:50 AM

Re: Q10 higher bound
 
Here is another example for the upper bound:

* H1 takes any 1 point and sets it to + or -, with the other points--if any--getting the opposite sign
* H2 sets all to + or all to -

H1 can only shatter 1 point, and H2 can only shatter 1 point
Their union however can shatter 3 points

MindExodus 04-29-2013 05:44 PM

Re: Q10 higher bound
 
And this method can be generalizes!!

for H_1, H_2 with VC dimension d_1, d_2

H1: hypothesis over \{x_1, x_2, ... , x_{d1+d2-1}\}
mapping these point to cases that at most d_1 points to +1 and other points to -1

H2: hypothesis over \{x_1, x_2, ... , x_{d1+d2-1}\}
mapping these point to cases that at most d_2 points to -1 and other points to +1
so there is at least d_1 + 1 +1 and other points to -1


easy to prove that H_1 \cap H_2 = \{\}
since card(H_1) + card(H_2) = 2^{d_1+d_2+1}
so \{H_1 \cup H_2\} can shatter all d_1 + d_2 + 1 point:D

By proper extension with same idea, it can reach higher bound in k points cases

Quote:

Originally Posted by MindExodus (Post 10640)
sry about didn't notice that requirement
but the key to this problem is the same:

let me just put another example:

just think of H1 and H2 on \{x_1, x_2, x_3\}
H1:
\{x_1 = 1, x_2 = 1, x_3 = 1\}
\{x_1 = 1, x_2 = 1, x_3 = -1\}
\{x_1 = 1, x_2 = -1, x_3 = 1\}
\{x_1 = -1, x_2 = 1, x_3 = 1\}

H2:
\{x_1 = -1, x_2 = -1, x_3 = -1\}
\{x_1 = -1, x_2 = -1, x_3 = 1\}
\{x_1 = -1, x_2 = 1, x_3 = -1\}
\{x_1 = 1, x_2 = -1, x_3 = -1\}

thus:
d_{H1} = 1 , d_{H2} = 1
and \{H_1 \cup H_2\} can shatter all 3 point


d_{H_1 \cup H_2} = 3:)


Elroch 04-29-2013 06:04 PM

Re: Q10 higher bound
 
Simpler?

H_1=\{\}, H_2=\{a\}

:)

jforbes 04-29-2013 08:19 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by MindExodus (Post 10640)
thus:
d_{H1} = 1 , d_{H2} = 1
and \{H_1 \cup H_2\} can shatter all 3 point


d_{H_1 \cup H_2} = 3:)

MindExodus - very nice; I'm thoroughly convinced.



Quote:

Originally Posted by nkatz
Here is another example for the upper bound:

* H1 takes any 1 point and sets it to + or -, with the other points--if any--getting the opposite sign
* H2 sets all to + or all to -

H1 can only shatter 1 point, and H2 can only shatter 1 point
Their union however can shatter 3 points

nkatz, your example has me slightly puzzled. The only way H1 can fail to shatter 2 points is if there are exactly two points:
H1 contains +- and -+, but not -- or ++
Whereas if you give it three points, it contains
+--,
-+-,
--+,
++-,
+-+, and
-++.
Just looking at the first two points, now H1 contains all 4 combinations. How can this be if its VC dimension is only 1? I think it must be the case that d_{VC}(H_1) > 1, but I'd be unsurprised if I were missing something.

Elroch 04-30-2013 04:13 AM

Re: Q10 higher bound
 
A useful little construction is this.

If you have two underlying disjoint sets of points A and B, and hypothesis sets \mathcal H_A \subset 2^A and \mathcal H_B \subset 2^B (where 2^S is the set of subsets of S), then you can can form a new hypothesis set \mathcal H_{ A X B} \subset 2^{A \cup B} with one hypothesis for each pair of hypotheses in \mathcal H_A and \mathcal H_B (in the obvious way). There is a simple relationship between the VC-dimensions of these three hypothesis sets, which can be used in examples.

Elroch 04-30-2013 04:20 AM

Re: Q10 higher bound
 
Quote:

Originally Posted by jforbes (Post 10659)
MindExodus - very nice; I'm thoroughly convinced.





nkatz, your example has me slightly puzzled. The only way H1 can fail to shatter 2 points is if there are exactly two points:
H1 contains +- and -+, but not -- or ++
Whereas if you give it three points, it contains
+--,
-+-,
--+,
++-,
+-+, and
-++.
Just looking at the first two points, now H1 contains all 4 combinations. How can this be if its VC dimension is only 1? I think it must be the case that d_{VC}(H_1) > 1, but I'd be unsurprised if I were missing something.

I presume he meant it was only one specific point that was being referred to, so \mathcal H_1 is +-- and -++, nothing more

Michael Reach 04-30-2013 07:14 AM

Re: Q10 higher bound
 
Good work on this problem by several of you guys!

jforbes 04-30-2013 01:05 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by Elroch (Post 10669)
I presume he meant it was only one specific point that was being referred to, so \mathcal H_1 is +-- and -++, nothing more

Wouldn't it then be the case that the union of the hypothesis sets could only shatter two points?

I'm still curious about what went wrong if H1 does in fact include --+, -+-, ... Is this sort of hypothesis set disallowed for some reason?

TTHotShot 04-30-2013 02:10 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by OlivierB (Post 10620)
m_{H}(N)=m_{H'_1}(N)+m_{H'_2}(N)

Great work everyone - this question had me scratching my head until I found this post. The one thing I don't understand is the above statment. Am I missing something? It seems like it should be an inequality.

Elroch 04-30-2013 02:21 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by jforbes (Post 10673)
Wouldn't it then be the case that the union of the hypothesis sets could only shatter two points?

I'm still curious about what went wrong if H1 does in fact include --+, -+-, ... Is this sort of hypothesis set disallowed for some reason?

There are no rules constraining the specific hypotheses in a hypothesis set. It can be any subset of the power set of the underlying set of points.

jforbes 04-30-2013 06:31 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by Elroch (Post 10677)
There are no rules constraining the specific hypotheses in a hypothesis set. It can be any subset of the power set of the underlying set of points.

OK, so if we assume that H is such that it "takes any 1 point and sets it to + or -, with the other points--if any--getting the opposite sign" as proposed by nkatz, then:
for N=1, H={+,-} and shatters this 1 point
for N=2, H={+-,-+} and does not shatter these two points.
for N=3, H={++-, +-+, -++, --+, -+-, +--} and does shatter two points.

What is the VC dimension of this hypothesis set? From the N=2 case, I might conclude that it's 1 (it fails to shatter two points), but from the N=3 case I might conclude it's 2 (it shatters 2 of the points within this set of points). Where am I going wrong?

marek 04-30-2013 08:56 PM

Re: Q10 higher bound
 
Since the homework set was now due, I was wondering if the course staff could weigh in on this question. Did our discussion adequately cover this question or were there any nuances that we missed?

yaser 04-30-2013 11:04 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by marek (Post 10679)
Since the homework set was now due, I was wondering if the course staff could weigh in on this question. Did our discussion adequately cover this question or were there any nuances that we missed?

The discussion has been great. I won't talk about the answer which you all know by now as I want to keep this thread rather than archive it with *ANSWER* threads.

Elroch 05-01-2013 05:41 AM

Re: Q10 higher bound
 
Quote:

Originally Posted by jforbes (Post 10678)
OK, so if we assume that H is such that it "takes any 1 point and sets it to + or -, with the other points--if any--getting the opposite sign" as proposed by nkatz, then:
for N=1, H={+,-} and shatters this 1 point
for N=2, H={+-,-+} and does not shatter these two points.
for N=3, H={++-, +-+, -++, --+, -+-, +--} and does shatter two points.

What is the VC dimension of this hypothesis set? From the N=2 case, I might conclude that it's 1 (it fails to shatter two points), but from the N=3 case I might conclude it's 2 (it shatters 2 of the points within this set of points). Where am I going wrong?

Unquestionably, the VC-dimensions of the first two hypothesis sets are 1, and that of the third hypothesis set is 2, by the definition.

Elroch 05-01-2013 09:24 AM

Re: Q10 higher bound
 
There is a proviso to my glib claim above that absolutely any subset can be a hypothesis. While this is true when the underlying set is finite or countable, when it is uncountably infinite I believe it only makes sense to have hypotheses which are measurable sets, according to the probability distribution associated with samples.

[Almost anything that makes any sense is likely to satisfy this, but it is not difficult to construct pathological examples even when the sample space is as simple as the unit interval with the uniform distribution].

OlivierB 05-01-2013 05:01 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by marek (Post 10626)
I am still confused on how to piece everything together. I like your much cleaner version, but I do have one comment. Having disjoint hypothesis sets does not necessarily mean that the set of dichotomies they create will also be disjoint.

For example, let H1 and H2 be the positive and negative 1d rays, respectively. These two hypothesis sets are disjoint. However, given any data set they can both create the dichotomies of all +1 or all -1. They won't have much overlap beyond that (and maybe THAT is the point), but they won't be entirely disjoint as far as our inequalities are concerned.

Quote:

Originally Posted by TTHotShot (Post 10676)
Great work everyone - this question had me scratching my head until I found this post. The one thing I don't understand is the above statment. Am I missing something? It seems like it should be an inequality.

@ marek, TTHotShot: Well, I agree with your remark, I was too quick: I have not explored the link between a hypothesis set and the dichotomies it generates on a set of points well enough. As a consequence the assertion that m_{H}(N)=m_{H'_1}(N)+m_{H'_2}(N) in general is wrong in general. Instead, it should be an inequality (≤), and probably quite a loose one, since there can be an overlap in the dichotomies generated by H_1 and H_2. Additionally, the definition of m_{H}(N) involves the determination of a maximum, and the independent search of 2 max on the RHS is a lot less constraining than one max on the LHS.

marek 05-01-2013 05:35 PM

Re: Q10 higher bound
 
Quote:

Originally Posted by OlivierB (Post 10687)
@ marek, TTHotShot: Well, I agree with your remark, I was too quick: I have not explored the link between a hypothesis set and the dichotomies it generates on a set of points well enough. As a consequence the assertion that m_{H}(N)=m_{H'_1}(N)+m_{H'_2}(N) in general is wrong in general. Instead, it should be an inequality (≤), and probably quite a loose one, since there can be an overlap in the dichotomies generated by H_1 and H_2. Additionally, the definition of m_{H}(N) involves the determination of a maximum, and the independent search of 2 max on the RHS is a lot less constraining than one max on the LHS.

I agree that the inequality here feels rather loose, as that is what troubled me initially (since it feels like we can do better). However once we were able to generate examples of hypothesis sets that did result in disjoint dichotomies, and since those examples could readily be extended to broad families of such examples, that seals the deal. We are proving a general case argument. So even though many times we should expect to be well under the bound, those special examples that reach equality prevent us from making the bound any tighter.

khohi 03-04-2016 07:15 AM

Re: Q10 higher bound
 
Great :) job

علاج تساقط الشعر


All times are GMT -7. The time now is 08:36 PM.

Powered by vBulletin® Version 3.8.3
Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.