LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 2 - Training versus Testing (http://book.caltech.edu/bookforum/forumdisplay.php?f=109)
-   -   consistency issue on page 65 (http://book.caltech.edu/bookforum/showthread.php?t=408)

dudefromdayton 04-28-2012 08:33 PM

consistency issue on page 65
 
In Example 2.8, the target function is sin(pi*x), but both target-function labels in the second figure read sin(x) instead. Someone in the know should check that the graphed function, its label, and the target function all agree.

yaser 04-28-2012 08:40 PM

Re: consistency issue on page 65
 
Quote:

Originally Posted by dudefromdayton (Post 1658)
In Example 2.8, the target function is sin(pi*x), but both target-function labels in the second figure read sin(x) instead. Someone in the know should check that the graphed function, its label, and the target function all agree.

Thank you for noticing it. Indeed, it should be \sin(\pi x).

dudefromdayton 04-28-2012 09:32 PM

Re: consistency issue on page 65
 
On the same page, I've been able to confirm the biases stated for H0 and H1, as well as the variance for H0. But for the variance of 1.69 for H1, I am obtaining 2.44 instead.

I have this problem whether I calculate the variance directly, or I calculate the out-of-sample error and subtract the bias.

It would be reassuring if I could show that my 2.44 figure is wrong, but as yet I have not succeeded.

yaser 04-28-2012 11:36 PM

Re: consistency issue on page 65
 
Quote:

Originally Posted by dudefromdayton (Post 1662)
On the same page, I've been able to confirm the biases stated for H0 and H1, as well as the variance for H0. But for the variance of 1.69 for H1, I am obtaining 2.44 instead.

I have this problem whether I calculate the variance directly, or I calculate the out-of-sample error and subtract the bias.

It would be reassuring if I could show that my 2.44 figure is wrong, but as yet I have not succeeded.

This is curious. To calculate E_{\rm out}, you generate the two points x_1 and x_2 uniformly and independently, then connect them with a line. Since y_1=\sin(\pi x_1) and y_2=\sin(\pi x_2), you get the equation for the line (tangent to the sine if the two points coincide). You then integrate the difference squared between the line and the sine with respect to x (times the probability density which is {1\over 2}) to get E_{\rm out} for this data set, and double-integrate that with respect to x_1 and x_2 (times the probability density which is {1\over 2} for each variable) to get the expected E_{\rm out}.

Is this what you have done to come up with a numerical answer equal to E_{\rm out}=2.44+0.21?

Hint: It is much easier to Monte-Carlo.
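
For concreteness, here is a minimal Monte-Carlo sketch in Python with NumPy (the sample sizes and the seed are arbitrary choices, not anything from the book):

Code:

import numpy as np

rng = np.random.default_rng(0)
n_sets = 200_000                    # number of two-point data sets

# Two training points per data set, uniform on [-1, 1].
x1 = rng.uniform(-1, 1, n_sets)
x2 = rng.uniform(-1, 1, n_sets)
y1, y2 = np.sin(np.pi * x1), np.sin(np.pi * x2)

# Line through the two points: g(x) = a*x + b.
# (x1 == x2 has probability zero for continuous draws.)
a = (y1 - y2) / (x1 - x2)
b = y1 - a * x1

# Average hypothesis g_bar(x) = a_bar*x + b_bar.
a_bar, b_bar = a.mean(), b.mean()

# bias = E_x[(g_bar(x) - f(x))^2], estimated on fresh test points.
x = rng.uniform(-1, 1, 100_000)
bias = np.mean((a_bar * x + b_bar - np.sin(np.pi * x)) ** 2)

# var = E_x[E_D[((a - a_bar)*x + (b - b_bar))^2]]. With x uniform on
# [-1, 1], E[x] = 0 and E[x^2] = 1/3, so this is Var(a)/3 + Var(b).
var = a.var() / 3.0 + b.var()

print(f"bias ~ {bias:.2f}, var ~ {var:.2f}, sum ~ {bias + var:.2f}")
# Should come out near the book's values: bias ~ 0.21, var ~ 1.69.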

dudefromdayton 04-29-2012 06:54 AM

Re: consistency issue on page 65
 
My ETA for taking a closer look at this is 1-2 days, due to my schedule. I'm definitely interested in resolving this matter, and I'm happy to participate. :)

Update: wrapping up some computations; I'm testing an idea as to what might have happened. But the test is not quick.

dudefromdayton 04-29-2012 04:50 PM

Re: consistency issue on page 65
 
Alas, I got bitten by that old dog, numerical instability. When I run 1,000 hypotheses, I get the much higher figure I reported; 10,000 brought me a lot closer, and 100,000 has me at a variance of 1.70 (presumably still moving in the direction of 1.69).

These tests take a lot longer than they might, because I solve the line equation with w-a-a-y too many CPU cycles. The upside is that the same code works with several other hypothesis sets to meet a short-term need.

I feel like I've faced a couple of numerical stability challenges recently, and it's beneficial for students to have to deal with them. I imagine there may be more ahead.

BojanVujatovic 07-14-2014 09:45 AM

Re: consistency issue on page 65
 
It is somewhat late, but I'd like to give an analytic solution for E_{out}, bias, and var, and verify the stated values here. Firstly,

\begin{aligned}
\overline{g}(x) &= E_{\mathcal{D}}[g^{(\mathcal{D})}(x)] =  E_{x_1, x_2}[g^{(x_1, x_2)}(x)]=  E_{x_1, x_2}[a^{(x_1, x_2)}x +b^{(x_1, x_2)}] \\
&= E_{x_1, x_2}[a^{(x_1, x_2)}]x + E_{x_1, x_2}[b^{(x_1, x_2)}] = \overline{a}x + \overline{b}
\end{aligned}

where a^{(x_1, x_2)} and b^{(x_1, x_2)} are the parameters obtained by minimising the squared error on the data set (x_1, x_2):

\begin{aligned}
\left(a^{(x_1, x_2)}, b^{(x_1, x_2)}\right) =  \underset{a, b \in \mathbb{R}}{\text{argmin }}E_{in}(a, b)=  \underset{a, b \in \mathbb{R}}{\text{argmin }}\dfrac{1}{2} \displaystyle\sum_{i=1}^2 \left(ax_i + b - \sin \pi x_i \right)^2
\end{aligned}

and we can get them by solving the following system of equations (the first-order conditions for a minimum):

\begin{cases}
\dfrac{\partial}{\partial a} E_{in}(a, b) = 0 \\ \\
\dfrac{\partial}{\partial b} E_{in}(a, b) = 0
\end{cases}
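
Writing the derivatives out, these conditions are \sum_{i=1}^{2}\left(ax_i + b - \sin \pi x_i\right)x_i = 0 and \sum_{i=1}^{2}\left(ax_i + b - \sin \pi x_i\right) = 0. With r_i = ax_i + b - \sin \pi x_i, they read r_1 x_1 + r_2 x_2 = 0 and r_1 + r_2 = 0, which for x_1 \neq x_2 force r_1 = r_2 = 0: the least-squares line through two points simply interpolates them,

\begin{cases}
a x_1 + b = \sin \pi x_1 \\
a x_2 + b = \sin \pi x_2
\end{cases}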

The solution is: \left(a^{(x_1, x_2)}, b^{(x_1, x_2)}\right) = \left(\dfrac{\sin \pi x_1 - \sin \pi x_2}{x_1-x_2}, \dfrac{-x_2\sin \pi x_1 + x_1\sin \pi x_2}{x_1-x_2} \right)


Now,

\begin{aligned}
\overline{a} = E_{x_1, x_2}[a^{(x_1, x_2)}] &= \int\limits_{-1}^{1} \int\limits_{-1}^{1}  a^{(x_1, x_2)}\, P[x_1]\,P[x_2] \,\text{d}x_1 \,\text{d}x_2 \\
&= \int\limits_{-1}^{1} \int\limits_{-1}^{1}  \dfrac{\sin \pi x_1 - \sin \pi x_2}{x_1-x_2} \cdot \dfrac{1}{2} \cdot \dfrac{1}{2} \,\text{d}x_1 \,\text{d}x_2 \approx  0.7759
\end{aligned}

\begin{aligned}
\overline{b} = E_{x_1, x_2}[b^{(x_1, x_2)}] &= \int\limits_{-1}^{1} \int\limits_{-1}^{1}  b^{(x_1, x_2)}\, P[x_1]\,P[x_2] \,\text{d}x_1 \,\text{d}x_2 \\
&=  \int\limits_{-1}^{1} \int\limits_{-1}^{1} \dfrac{-x_2\sin \pi x_1 + x_1\sin \pi x_2}{x_1-x_2} \cdot \dfrac{1}{2} \cdot \dfrac{1}{2} \,\text{d}x_1 \,\text{d}x_2 =  0
\end{aligned}

So, \overline{g}(x) =  \overline{a}x + \overline{b}\approx 0.7759x
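
These double integrals are easy to verify numerically; a quick sketch with SciPy's dblquad (the guard handles the removable singularity at x_1 = x_2, where the slope tends to \pi \cos \pi x_1):

Code:

from scipy.integrate import dblquad
import numpy as np

def slope(x1, x2):
    # Removable singularity on the diagonal: the limit is the derivative.
    if x1 == x2:
        return np.pi * np.cos(np.pi * x1)
    return (np.sin(np.pi * x1) - np.sin(np.pi * x2)) / (x1 - x2)

# a_bar = (1/4) * double integral of the slope over [-1, 1]^2.
# dblquad passes (inner, outer) arguments, but the integrand is
# symmetric in x1 and x2, so the order does not matter here.
a_bar, _ = dblquad(lambda x1, x2: slope(x1, x2) / 4.0,
                   -1, 1, lambda _: -1, lambda _: 1)
print(a_bar)  # ~ 0.7759; the b_bar integral vanishes by symmetry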

Now we can calculate all the terms:

\begin{aligned}
\text{bias} &= E_x\left[\left(\overline{g}(x) - f(x) \right)^2 \right] = \int\limits_{-1}^{1}\left(\overline{a}x + \overline{b} - \sin \pi x \right)^2 P[x] \,\text{d}x \\
&\approx \int\limits_{-1}^{1}\left(0.7759x - \sin \pi x \right)^2 \cdot \dfrac{1}{2} \,\text{d}x \approx 0.2067
\end{aligned}
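
In fact this integral has a closed form: using \int_{-1}^{1} x \sin \pi x \,\text{d}x = \frac{2}{\pi} and \int_{-1}^{1} \sin^2 \pi x \,\text{d}x = 1, expanding the square gives

\begin{aligned}
\text{bias} = \dfrac{\overline{a}^2}{3} - \dfrac{2\overline{a}}{\pi} + \dfrac{1}{2} \approx 0.2067
\end{aligned}

which rounds to the book's 0.21.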


\begin{aligned}
\text{var} &= E_x\left[E_{\mathcal{D}}\left[\left(g^{(\mathcal{D})}(x) - \overline{g}(x) \right)^2 \right]\right] = E_{x, x_1, x_2}\left[\left(a^{(x_1, x_2)}x + b^{(x_1, x_2)} - \overline{a}x - \overline{b} \right)^2 \right] \\
&= \int\limits_{-1}^{1}\int\limits_{-1}^{1}\int\limits_{-1}^{1}\left(a^{(x_1, x_2)}x + b^{(x_1, x_2)} - \overline{a}x - \overline{b} \right)^2 P[x]\,P[x_1]\,P[x_2] \,\text{d}x\,\text{d}x_1\,\text{d}x_2 \\
&\approx \int\limits_{-1}^{1}\int\limits_{-1}^{1}\int\limits_{-1}^{1}\left(\dfrac{\sin \pi x_1 - \sin \pi x_2}{x_1-x_2}\,x + \dfrac{-x_2\sin \pi x_1 + x_1\sin \pi x_2}{x_1-x_2} - 0.7759\,x \right)^2 \dfrac{1}{2^3}\,\text{d}x\,\text{d}x_1\,\text{d}x_2 \\
&\approx 1.676
\end{aligned}
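
A useful simplification here: the cross term vanishes because E_x[x] = 0, and E_x[x^2] = \frac{1}{3} for x uniform on [-1, 1], so the variance splits into

\begin{aligned}
\text{var} = \dfrac{1}{3}\,E_{x_1, x_2}\!\left[\left(a^{(x_1, x_2)} - \overline{a}\right)^2\right] + E_{x_1, x_2}\!\left[\left(b^{(x_1, x_2)} - \overline{b}\right)^2\right]
\end{aligned}

which reduces the triple integral to two double integrals.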



\begin{aligned}
E_{\mathcal{D}}[E_{out}(g^{(\mathcal{D})})] &= E_x\left[E_{\mathcal{D}}\left[\left(g^{(\mathcal{D})}(x) - f(x) \right)^2 \right]\right] = E_{x, x_1, x_2}\left[\left(a^{(x_1, x_2)}x + b^{(x_1, x_2)} - f(x) \right)^2 \right] \\
&= \int\limits_{-1}^{1}\int\limits_{-1}^{1}\int\limits_{-1}^{1}\left(a^{(x_1, x_2)}x + b^{(x_1, x_2)} - \sin \pi x \right)^2 P[x]\,P[x_1]\,P[x_2] \,\text{d}x\,\text{d}x_1\,\text{d}x_2 \\
&\approx \int\limits_{-1}^{1}\int\limits_{-1}^{1}\int\limits_{-1}^{1}\left(\dfrac{\sin \pi x_1 - \sin \pi x_2}{x_1-x_2}\,x + \dfrac{-x_2\sin \pi x_1 + x_1\sin \pi x_2}{x_1-x_2} - \sin \pi x \right)^2 \dfrac{1}{2^3}\,\text{d}x\,\text{d}x_1\,\text{d}x_2 \\
&\approx 1.883
\end{aligned}

So, we see that the following holds:

\begin{aligned}
E_{\mathcal{D}}[E_{out}(g^{(\mathcal{D})})] &= \text{bias} + \text{var}\\
1.883 &\approx 0.2067 + 1.676
\end{aligned}

magdon 07-15-2014 08:42 AM

Re: consistency issue on page 65
 
Thanks for the detailed analysis :)

