View Single Post
Old 04-28-2013, 02:20 PM
OlivierB OlivierB is offline
Join Date: Apr 2013
Location: Paris
Posts: 16
Default Re: Q10 higher bound

@marek, Thanks for your contribution.
Let me try and rephrase, not adding much.

For a hypothesis set H with VD dimension d, we have
m_{H}(N)\le B(N, d+1)=\sum_{i=0}^d \left(\begin{array}{c}N\\ i\end{array}\right)

The second equality is the result of 2 inequalities in opposite direction. In class, we saw '≤' and '≥' is the subject of problem 2.4 in the book.

Let 2 hypothesis sets H_{1} and H_{2} with VC dimensions respectively d_{1} and d_{2}.

We can construct two disjoint H'_{1} and H'_{2} putting the intersection of both with one of them only.
H'_{2}=H_{2}-\bigcap(H_{1}, H_{2})

The VC dimension of H'_{1} is d_{1}
The VC dimension of H'_{2} is d'_{2}\le d_{2}

Let H be the union of these hypothesis sets:
H=\bigcup(H_{1}, H_{2})=\bigcup(H'_{1}, H'_{2})
Let N=d_{1}+d_{2}+1

Now in order to try and determine the VC dimension of H, let us compute a higher bound of m_{H}(N):

m_{H}(N)=m_{H'_1}(N)+m_{H'_2}(N)\le\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=0}^{d'_2}\left(\begin{array}{c}N\\ i\end{array}\right)\le\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=0}^{d_2}\left(\begin{array}{c}N\\ i\end{array}\right)

which we can simplify:

\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=0}^{d_2}\left(\begin{array}{c}N\\ i\end{array}\right)=\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=N-d_2}^{N}\left(\begin{array}{c}N\\ i\end{array}\right)=\sum_{i=0}^{d_1}\left(\begin{array}{c}N\\ i\end{array}\right)+\sum_{i=d_1+1}^{N}\left(\begin{array}{c}N\\ i\end{array}\right)


For m_{H}(N)=2^N to be true i.e. for H to shatter N, all inequalities must be equalities.
In other words, we must have:
1/ The growth functions of H_{1} and H_{2} must be exactly B(N, d_1+1) and B(N, d_2+1)
2/ Removing the intersection of H_{1} and H_{2} from H_{2} does not decrease the VC dimension of H_{2}. If the intersection is empty then this condition holds.

If these 2 requirements are not contradictory (which seems plausible but I cannot prove - neither can I visualize with an example), then the VC dimension of H is at least d_1+d_2+1.

Now unless there is a mistake in the reasoning, the question is: Can these requirements be met ? Ideally via an example, because beyond the abstract equations, I would really like to 'visualize' a case where these inequalities are 'saturated'.
Reply With Quote