LFD Book Forum

LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 1 - The Learning Problem (http://book.caltech.edu/bookforum/forumdisplay.php?f=108)
-   -   Exercise 1.12 - Failing to make Ein(g) small enough (http://book.caltech.edu/bookforum/showthread.php?t=4497)

PhilW 09-09-2014 06:02 PM

Exercise 1.12 - Failing to make Ein(g) small enough
 
Let's say I run my machine learning algorithm for my friend, taking care to ensure Ein(g) and Eout(g) are close enough, but I find that my Ein(g) = .5 or something terrible like that. What are my options for continuing to solve the machine learning problem? Is there any way for me to go back and change my hypothesis set without losing the theoretical guarantees that Ein(g) is close to Eout(g)?

yaser 09-09-2014 07:19 PM

Re: Exercise 1.12 - Failing to make Ein(g) small enough
 
Quote:

Originally Posted by PhilW (Post 11720)
Let's say I run my machine learning algorithm for my friend, taking care to ensure Ein(g) and Eout(g) are close enough, but I find that my Ein(g) = .5 or something terrible like that. What are my options for continuing to solve the machine learning problem? Is there any way for me to go back and change my hypothesis set without losing the theoretical guarantees that Ein(g) is close to Eout(g)?

Let us say that {\mathcal H}_1 is the hypothesis set that didn't work, and you want now to try another hypothesis set {\mathcal H}_2. The theoretical guarantees would still hold, but for the equivalent hypothesis set {\mathcal H}_1 \cup {\mathcal H}_2.

Just because this uses a "hierarchy" of hypothesis sets (in this case the hierarchy being {\mathcal H}_1 followed by {\mathcal H}_1 \cup {\mathcal H}_2 upon failure of {\mathcal H}_1, folllowed by possibly other expansions if {\mathcal H}_1 \cup {\mathcal H}_2 failed), there is in general an additional theoretical price to pay, but it is low. Look at structural risk minimization if you are further interested.

magdon 09-11-2014 04:00 AM

Re: Exercise 1.12 - Failing to make Ein(g) small enough
 
Just to elaborate a little on the last point in Yaser's answer.

Suppose your strategy is to use {\mathcal H}_1 \cup {\mathcal H}_2 only if {\mathcal H}_1 fails.

If {\mathcal H}_1 fails and you use {\mathcal H}_1 \cup {\mathcal H}_2 it is natural that you should pay the price for the VC bound implied by {\mathcal H}_1 \cup {\mathcal H}_2.

The interesting case is if {\mathcal H}_1 succeeds and you get low E_{in}. You cannot use the VC bound that applies for {\mathcal H}_1.

It is the option to use {\mathcal H}_1 \cup {\mathcal H}_2 in the event that {\mathcal H}_1 fails that complicates the matter.

This is why to get a correct theoretical bound, you must always specify your entire strategy first. The simplest strategy is to fix a hypothesis set. If it fails, it fails and you are done. If in the back of your mind you are thinking about the possibility of changing hypothesis sets if it fails, then this has to be taken into account in the theoretical analysis from the very begining, in particular, even if the first hypothesis set succeeds.

As mentioned by Yaser, one framework that is useful in analyzing such adaptive strategies is structural risk minimization.

Quote:

Originally Posted by yaser (Post 11721)
Let us say that {\mathcal H}_1 is the hypothesis set that didn't work, and you want now to try another hypothesis set {\mathcal H}_2. The theoretical guarantees would still hold, but for the equivalent hypothesis set {\mathcal H}_1 \cup {\mathcal H}_2.

Just because this uses a "hierarchy" of hypothesis sets (in this case the hierarchy being {\mathcal H}_1 followed by {\mathcal H}_1 \cup {\mathcal H}_2 upon failure of {\mathcal H}_1, folllowed by possibly other expansions if {\mathcal H}_1 \cup {\mathcal H}_2 failed), there is in general an additional theoretical price to pay, but it is low. Look at structural risk minimization if you are further interested.



All times are GMT -7. The time now is 04:17 AM.

Powered by vBulletin® Version 3.8.3
Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.