LFD Book Forum  

Go Back   LFD Book Forum > Book Feedback - Learning From Data > Chapter 2 - Training versus Testing

  #1  
Old 04-28-2012, 07:33 PM
dudefromdayton
Invited Guest
 
Join Date: Apr 2012
Posts: 140
Default consistency issue on page 65

In Example 2.8, the target function is sin(pi*x), but both target function labels in the second figure show sin(x) instead. Someone in the know should check that the graphed function, its label, and the target function coincide.
  #2  
Old 04-28-2012, 07:40 PM
yaser
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,476
Default Re: consistency issue on page 65

Quote:
Originally Posted by dudefromdayton View Post
Example 2.8, the target function is sin(pi*x). But both target function graph labels in the second figure show sin(x) instead. Someone in the know should see that the graphed function, its label, and the target function coincide.
Thank you for noticing it. Indeed, it should be \sin(\pi x).
__________________
Where everyone thinks alike, no one thinks very much
  #3  
Old 04-28-2012, 08:32 PM
dudefromdayton
Invited Guest
 
Join Date: Apr 2012
Posts: 140
Default Re: consistency issue on page 65

On the same page, I've been able to confirm the biases stated for H0 and H1, as well as the variance for H0. But for the variance of 1.69 for H1, I am obtaining 2.44 instead.

I have this problem whether I calculate the variance directly, or I calculate the out-of-sample error and subtract the bias.

It would be reassuring if I could show that my 2.44 figure is wrong, but as yet I have not succeeded.
  #4  
Old 04-28-2012, 10:36 PM
yaser
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,476
Default Re: consistency issue on page 65

Quote:
Originally Posted by dudefromdayton View Post
On the same page, I've been able to confirm the biases stated for H0 and H1, as well as the variance for H0. But for the variance of 1.69 for H1, I am obtaining 2.44 instead.

I have this problem whether I calculate the variance directly, or I calculate the out-of-sample error and subtract the bias.

It would be reassuring if I could show that my 2.44 figure is wrong, but as yet I have not succeeded.
This is curious. To calculate E_{\rm out}, you generate the two points x_1 and x_2 uniformly and independently, then connect them with a line. Since y_1=\sin(\pi x_1) and y_2=\sin(\pi x_2), you get the equation for the line (tangent to the sine if the two points coincide). You then integrate the difference squared between the line and the sine with respect to x (times the probability density which is {1\over 2}) to get E_{\rm out} for this data set, and double-integrate that with respect to x_1 and x_2 (times the probability density which is {1\over 2} for each variable) to get the expected E_{\rm out}.

Is this what you have done to come up with a numerical answer equal to E_{\rm out}=2.44+0.21?

Hint: It is much easier to Monte-Carlo.
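The Monte-Carlo route suggested in the hint can be sketched as follows (a minimal sketch of my own, not code from the book; the seed, number of data sets, and test-sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return np.sin(np.pi * x)

# Draw many two-point data sets and fit the line through each pair.
n_sets = 20_000
x1, x2 = rng.uniform(-1, 1, n_sets), rng.uniform(-1, 1, n_sets)
a = (f(x1) - f(x2)) / (x1 - x2)   # slope of the line through the pair
b = f(x1) - a * x1                # intercept

# Evaluate every hypothesis on a common test sample drawn from the same P[x].
x_test = rng.uniform(-1, 1, 400)
g = a[:, None] * x_test + b[:, None]       # shape (n_sets, n_test)
g_bar = g.mean(axis=0)                     # the average hypothesis on the grid

bias  = np.mean((g_bar - f(x_test)) ** 2)  # book value: about 0.21
var   = np.mean((g - g_bar) ** 2)          # book value: about 1.69
e_out = np.mean((g - f(x_test)) ** 2)      # about 1.88 = bias + var
print(round(bias, 2), round(var, 2), round(e_out, 2))
```

Because g_bar here is the sample mean of the hypotheses, the decomposition e_out = bias + var holds exactly in this estimate, up to floating-point error.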
__________________
Where everyone thinks alike, no one thinks very much
  #5  
Old 04-29-2012, 05:54 AM
dudefromdayton
Invited Guest
 
Join Date: Apr 2012
Posts: 140
Default Re: consistency issue on page 65

My ETA for having a closer look at this is 1-2 days, owing to my schedule. I'm definitely interested in resolving this matter, and I'm happy to participate.

Update: wrapping up some computations; I'm testing an idea as to what might have happened. But the test is not quick.
  #6  
Old 04-29-2012, 03:50 PM
dudefromdayton
Invited Guest
 
Join Date: Apr 2012
Posts: 140
Default Re: consistency issue on page 65

Alas, I got bit by that old dog, numerical instability. When I run 1,000 hypotheses, I get the much higher figure I reported. 10,000 brought me a lot closer, and 100,000 has me at 1.70 variance (and presumably moving in the direction of 1.69).

These tests take a lot longer than they might, because I solve the line equation with w-a-a-y too many CPU cycles. The upside is that the same code works with several other hypothesis sets to meet a short-term need.

In general, I feel I've faced a couple of numerical stability challenges recently, and it's beneficial for students to have to deal with them. I imagine there may be more ahead.
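The convergence pattern described here is easy to reproduce; the following is a rough sketch of my own (grid size and seed are arbitrary), showing how the variance estimate settles as the number of hypotheses grows:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    return np.sin(np.pi * x)

x_test = np.linspace(-1, 1, 101)   # fixed evaluation grid

def var_estimate(n_sets):
    """Monte-Carlo variance of the two-point line fit, using n_sets hypotheses."""
    x1, x2 = rng.uniform(-1, 1, n_sets), rng.uniform(-1, 1, n_sets)
    a = (f(x1) - f(x2)) / (x1 - x2)
    b = f(x1) - a * x1
    g = a[:, None] * x_test + b[:, None]
    g_bar = g.mean(axis=0)
    return np.mean((g - g_bar) ** 2)

for n in (1_000, 10_000, 100_000):
    print(n, round(var_estimate(n), 3))   # settles toward roughly 1.69
```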

Last edited by dudefromdayton; 04-29-2012 at 03:52 PM. Reason: concluding ideas
  #7  
Old 07-14-2014, 08:45 AM
BojanVujatovic
Member
 
Join Date: Jan 2013
Posts: 13
Default Re: consistency issue on page 65

It is somewhat late, but I'd like to give an analytic solution for E_{out}, bias, and var, and a verification of the values here. Firstly,

\begin{aligned}
\overline{g}(x) &= E_{\mathcal{D}}[g^{(\mathcal{D})}(x)] =  E_{x_1, x_2}[g^{(x_1, x_2)}(x)]=  E_{x_1, x_2}[a^{(x_1, x_2)}x +b^{(x_1, x_2)}] \\
&= E_{x_1, x_2}[a^{(x_1, x_2)}]x + E_{x_1, x_2}[b^{(x_1, x_2)}] = \overline{a}x + \overline{b}
\end{aligned}

a^{(x_1, x_2)} and b^{(x_1, x_2)} are the parameters we get by minimising the squared in-sample error on a data set x_1, x_2:

\begin{aligned}
\left(a^{(x_1, x_2)}, b^{(x_1, x_2)}\right) =  \underset{a, b \in \mathbb{R}}{\text{argmin }}E_{in}(a, b)=  \underset{a, b \in \mathbb{R}}{\text{argmin }}\dfrac{1}{2} \displaystyle\sum_{i=1}^2 \left(ax_i + b - \sin \pi x_i \right)^2
\end{aligned}

and we can get them by solving the following system of equations (the first-order conditions for a minimum):

\begin{cases}
\dfrac{\partial}{\partial a} E_{in}(a, b) = 0 \\ \\
\dfrac{\partial}{\partial b} E_{in}(a, b) = 0
\end{cases}

The solution is: \left(a^{(x_1, x_2)}, b^{(x_1, x_2)}\right) = \left(\dfrac{\sin \pi x_1 - \sin \pi x_2}{x_1-x_2}, \dfrac{-x_2\sin \pi x_1 + x_1\sin \pi x_2}{x_1-x_2} \right)
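Since two points determine the line exactly (the minimiser drives E_{in} to zero, so it is simply the interpolating line), the closed form can be sanity-checked against a generic least-squares fit. A small sketch, with an arbitrarily chosen example pair:

```python
import numpy as np

def f(x):
    return np.sin(np.pi * x)

x1, x2 = -0.3, 0.7                     # an arbitrary example pair

# Closed-form minimiser from the derivation above
a = (f(x1) - f(x2)) / (x1 - x2)
b = (-x2 * f(x1) + x1 * f(x2)) / (x1 - x2)

# Generic degree-1 least-squares fit through the same two points
a_ls, b_ls = np.polyfit([x1, x2], [f(x1), f(x2)], 1)

assert np.isclose(a, a_ls) and np.isclose(b, b_ls)
assert np.isclose(a * x1 + b, f(x1))   # the line interpolates both points
```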


Now,

\begin{aligned}
\overline{a} = E_{x_1, x_2}[a^{(x_1, x_2)}] &= \int\limits_{-1}^{1} \int\limits_{-1}^{1}  a^{(x_1, x_2)} P[x_1]P[x_2] \,\text{d}x_1 \,\text{d}x_2 \\
&= \int\limits_{-1}^{1} \int\limits_{-1}^{1}  \dfrac{\sin \pi x_1 - \sin \pi x_2}{x_1-x_2} \cdot \dfrac{1}{2} \cdot \dfrac{1}{2} \,\text{d}x_1 \,\text{d}x_2 \approx  0.7759
\end{aligned}

\begin{aligned}
\overline{b} = E_{x_1, x_2}[b^{(x_1, x_2)}] &= \int\limits_{-1}^{1} \int\limits_{-1}^{1}  b^{(x_1, x_2)} P[x_1]P[x_2] \,\text{d}x_1 \,\text{d}x_2 \\
&=  \int\limits_{-1}^{1} \int\limits_{-1}^{1} \dfrac{-x_2\sin \pi x_1 + x_1\sin \pi x_2}{x_1-x_2} \cdot \dfrac{1}{2} \cdot \dfrac{1}{2} \,\text{d}x_1 \,\text{d}x_2 =  0
\end{aligned}

So, \overline{g}(x) =  \overline{a}x + \overline{b}\approx 0.7759x
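These two double integrals can be checked numerically. Below is a sketch of my own using a simple midpoint rule (the grid size is an arbitrary choice), with the removable singularity at x_1 = x_2 patched by the tangent-line limit:

```python
import numpy as np

n = 1001
x = -1 + (np.arange(n) + 0.5) * (2.0 / n)   # midpoint grid on [-1, 1]
X1, X2 = np.meshgrid(x, x)

with np.errstate(divide="ignore", invalid="ignore"):
    a = (np.sin(np.pi * X1) - np.sin(np.pi * X2)) / (X1 - X2)
    b = (-X2 * np.sin(np.pi * X1) + X1 * np.sin(np.pi * X2)) / (X1 - X2)

# On the diagonal the fitted line degenerates to the tangent at x1.
np.fill_diagonal(a, np.pi * np.cos(np.pi * x))
np.fill_diagonal(b, np.sin(np.pi * x) - np.pi * x * np.cos(np.pi * x))

# With density 1/2 per variable on [-1, 1], each expectation is a plain grid mean.
a_bar, b_bar = a.mean(), b.mean()
print(a_bar, b_bar)   # roughly 0.776 and 0
```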

Now we can calculate all the terms:

\begin{aligned}
\text{bias} &= E_x\left[\left(\overline{g}(x) - f(x) \right)^2 \right] = \int\limits_{-1}^{1}\left(\overline{a}x + \overline{b} - \sin \pi x \right)^2 P[x] \,\text{d}x \\
&\approx \int\limits_{-1}^{1}\left(0.7759x - \sin \pi x \right)^2 \cdot \dfrac{1}{2} \,\text{d}x \approx 0.2069
\end{aligned}


\begin{aligned}
\text{var} &= E_x\left[E_{\mathcal{D}}\left[\left(\overline{g}(x) - g^{(\mathcal{D})}(x) \right)^2 \right]\right] = E_{x, x_1, x_2}\left[\left(\overline{a}x + \overline{b} -a^{(x_1, x_2)}x -b^{(x_1, x_2)} \right)^2 \right] \\
&= \int\limits_{-1}^{1}\int\limits_{-1}^{1}\int\limits_{-1}^{1}\left(\overline{a}x + \overline{b} -a^{(x_1, x_2)}x -b^{(x_1, x_2)} \right)^2 P[x]P[x_1]P[x_2] \,\text{d}x\,\text{d}x_1\,\text{d}x_2 \\
&\approx \int\limits_{-1}^{1}\int\limits_{-1}^{1}\int\limits_{-1}^{1}\left(0.7759x -\dfrac{\sin \pi x_1 - \sin \pi x_2}{x_1-x_2}x -\dfrac{-x_2\sin \pi x_1 + x_1\sin \pi x_2}{x_1-x_2} \right)^2 \dfrac{1}{2^3}\,\text{d}x\,\text{d}x_1\,\text{d}x_2 \\
&\approx 1.676
\end{aligned}



\begin{aligned}
E_{\mathcal{D}}[E_{out}(g^{(\mathcal{D})})] &= E_x\left[E_{\mathcal{D}}\left[\left(g^{(\mathcal{D})}(x) - f(x) \right)^2 \right]\right] = E_{x, x_1, x_2}\left[\left(a^{(x_1, x_2)}x + b^{(x_1, x_2)} -f(x) \right)^2 \right] \\
&= \int\limits_{-1}^{1}\int\limits_{-1}^{1}\int\limits_{-1}^{1}\left(a^{(x_1, x_2)}x + b^{(x_1, x_2)} -\sin \pi x \right)^2 P[x]P[x_1]P[x_2] \,\text{d}x\,\text{d}x_1\,\text{d}x_2 \\
&\approx \int\limits_{-1}^{1}\int\limits_{-1}^{1}\int\limits_{-1}^{1}\left(\dfrac{\sin \pi x_1 - \sin \pi x_2}{x_1-x_2}x + \dfrac{-x_2\sin \pi x_1 + x_1\sin \pi x_2}{x_1-x_2} -\sin \pi x \right)^2 \dfrac{1}{2^3}\,\text{d}x\,\text{d}x_1\,\text{d}x_2 \approx 1.883
\end{aligned}

So, we see that the following holds:

\begin{aligned}
E_{\mathcal{D}}[E_{out}(g^{(\mathcal{D})})] &= \text{bias} + \text{var}\\
1.883 &\approx 0.2069 + 1.676
\end{aligned}
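All three quantities, and the decomposition itself, can be verified with a brute-force midpoint-rule sketch (my own; the grid size is an arbitrary choice, and nothing here is from the book):

```python
import numpy as np

n = 151
x = -1 + (np.arange(n) + 0.5) * (2.0 / n)     # midpoint grid on [-1, 1]
s = np.sin(np.pi * x)
X1, X2 = np.meshgrid(x, x, indexing="ij")

with np.errstate(divide="ignore", invalid="ignore"):
    a = (np.sin(np.pi * X1) - np.sin(np.pi * X2)) / (X1 - X2)
    b = (-X2 * np.sin(np.pi * X1) + X1 * np.sin(np.pi * X2)) / (X1 - X2)
np.fill_diagonal(a, np.pi * np.cos(np.pi * x))             # tangent-line limit x2 -> x1
np.fill_diagonal(b, np.sin(np.pi * x) - np.pi * x * np.cos(np.pi * x))

a_bar, b_bar = a.mean(), b.mean()             # g_bar(x) = a_bar * x + b_bar
g = a[..., None] * x + b[..., None]           # g^(x1,x2) evaluated on the x grid
g_bar = a_bar * x + b_bar

# The uniform density 1/2 per variable turns each expectation into a grid mean.
bias  = np.mean((g_bar - s) ** 2)
var   = np.mean((g - g_bar) ** 2)
e_out = np.mean((g - s) ** 2)
print(bias, var, e_out)   # compare with the 0.2069, 1.676, 1.883 derived above
```

Because g_bar is exactly the mean of the hypotheses over the same grid, e_out equals bias + var here to floating-point precision.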
  #8  
Old 07-15-2014, 07:42 AM
magdon
RPI
 
Join Date: Aug 2009
Location: Troy, NY, USA.
Posts: 595
Default Re: consistency issue on page 65

Thanks for the detailed analysis
__________________
Have faith in probability
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.