LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   The Final (http://book.caltech.edu/bookforum/forumdisplay.php?f=138)
-   -   Question 10 (http://book.caltech.edu/bookforum/showthread.php?t=1463)

 victe 09-09-2012 03:58 PM

Question 10

From the results of running 1 vs 5, I matched one of the possible answers, but it seems that it is not correct. I cannot say anything more for now... Maybe I made a mistake, but for the other related questions I seem to have the right answers.

 TonySuarez 09-10-2012 04:46 AM

Re: Question 10

My results also ended up pointing to only one of the hypotheses. But there could be some mistake in the code -- I'll only be 100% sure after submission :). In fact, the results are not of the kind that would let you relax :).

 MLearning 09-12-2012 10:59 AM

Re: Question 10

Same here: the simulation points to one of the answers. For some reason, I am not able to take comfort in that. I also ran my simulation for different values of lambda, and they all seem to point to that same answer.

 jain.anand@tcs.com 03-11-2013 12:32 PM

Re: Question 10

When I ran the simulation, I got 2 answers that match my result. Now I am really confused about how to proceed. Can somebody help me decide which one I should select? I am getting these answers repeatedly. Is there some rule for how many iterations gradient descent should be run? I see the gradient descent error keeps decreasing even after 2000 iterations, though it makes no difference in the outcome.

 hemphill 03-11-2013 06:02 PM

Re: Question 10

If the error keeps dropping, I would keep going. I didn't use gradient descent, myself. I solved it two ways, getting the same answer with both methods. First, I noted that it could be solved by quadratic programming. Second, I fed it to a (non-gradient) conjugate direction set routine that I've had lying around for years.
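As an illustration of the derivative-free route, here is a minimal sketch that minimizes the regularized squared error with Powell's conjugate-direction method (via scipy) and cross-checks it against the closed-form ridge solution. The data, lam, and target weights below are synthetic stand-ins, not the course's 1-vs-5 digit features:

```python
# Minimize the regularized squared error without gradients, using
# Powell's conjugate-direction method, then compare with the
# closed-form ridge solution.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
Z = rng.standard_normal((50, 3))                 # feature matrix (synthetic)
w_true = np.array([1.0, -2.0, 0.5])
y = np.sign(Z @ w_true + 0.1 * rng.standard_normal(50))
lam = 1.0                                        # regularization parameter

def aug_error(w):
    # E_aug(w) = (1/N) ||Zw - y||^2 + (lambda/N) ||w||^2
    return (np.sum((Z @ w - y) ** 2) + lam * np.sum(w ** 2)) / len(y)

res = minimize(aug_error, x0=np.zeros(3), method="Powell")

# Closed-form comparison: w = (Z'Z + lambda*I)^{-1} Z'y
w_closed = np.linalg.solve(Z.T @ Z + lam * np.eye(3), Z.T @ y)
print(res.x, w_closed)
```

Since the objective is a well-conditioned quadratic, the direction-set minimizer and the closed-form solution should agree closely, which is the kind of cross-check described above.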

 tsweetser 03-11-2013 09:50 PM

Re: Question 10

Is it also possible to use the regularized normal equation? I'm looking at Lecture 12, Slide 11.

It seems funny to me to choose the parameters to minimize one error measure (mean square), yet evaluate {E_in} and {E_out} using another (binary classification).

 yaser 03-11-2013 10:07 PM

Re: Question 10

Quote:
 Originally Posted by tsweetser (Post 9881) Is it also possible to use the regularized normal equation? I'm looking at Lecture 12, Slide 11.
Yes, you can use any result given in the lectures.

Quote:
 It seems funny to me to choose the parameters to minimize one error measure (mean square), yet evaluate {E_in} and {E_out} using another (binary classification).
Ideally, we would work exclusively with the binary classification error in this case, but because optimizing that error directly is intractable, the mean-squared error is used for the optimization part.
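As a concrete sketch of that split (synthetic data standing in for the digits task; lam is an illustrative choice): the weights are chosen by minimizing the regularized squared error in closed form, while {E_in} is then reported as the fraction of misclassified points.

```python
# Fit with (regularized) mean-squared error, evaluate with binary
# classification error.
import numpy as np

rng = np.random.default_rng(1)
Z = rng.standard_normal((100, 3))                # synthetic feature matrix
y = np.sign(Z @ np.array([2.0, -1.0, 0.5]) + 0.2 * rng.standard_normal(100))
lam = 0.01                                       # illustrative lambda

# Optimization step: tractable mean-squared-error minimization (closed form).
w = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)

# Evaluation step: binary classification error (fraction of sign mismatches).
E_in = np.mean(np.sign(Z @ w) != y)
print(E_in)
```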

 jain.anand@tcs.com 03-12-2013 08:40 AM

Re: Question 10

Thank you, Professor. Let me try the analytical solution approach and see if I get a different result. I am still getting 2 right answers from the quiz.

 SeanV 03-13-2013 03:00 AM

Re: Question 10

Regularized linear regression is called ridge regression in the stats literature.

You can use whatever code you use to do linear least squares by adding dummy data (see data augmentation in the link below):
http://www-stat.stanford.edu/~owen/c...larization.pdf

No need for quadratic programming / stochastic descent, etc.
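The augmentation trick can be sketched as follows: appending sqrt(lambda) times the identity as dummy rows of Z (with target 0) makes ordinary least squares return exactly the ridge solution. The data and lam below are synthetic stand-ins:

```python
# Ridge regression via data augmentation: plain least squares on the
# augmented system equals the regularized normal-equation solution.
import numpy as np

rng = np.random.default_rng(2)
Z = rng.standard_normal((30, 4))                 # synthetic feature matrix
y = rng.standard_normal(30)
lam = 0.5                                        # illustrative lambda

# Augment: stack sqrt(lambda) * I under Z, and zeros under y.
Z_aug = np.vstack([Z, np.sqrt(lam) * np.eye(4)])
y_aug = np.concatenate([y, np.zeros(4)])

# Plain least squares on the augmented data...
w_ls, *_ = np.linalg.lstsq(Z_aug, y_aug, rcond=None)

# ...matches the ridge / regularized normal-equation solution.
w_ridge = np.linalg.solve(Z.T @ Z + lam * np.eye(4), Z.T @ y)
print(np.allclose(w_ls, w_ridge))  # True
```

This works because ||Z_aug w - y_aug||^2 expands to ||Zw - y||^2 + lambda ||w||^2, which is exactly the regularized objective.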

 yaser 03-13-2013 04:53 AM

Re: Question 10

Quote:
 Originally Posted by SeanV (Post 9905) Regularised Linear regression is called ridge regression in the stats literature you can just use whatever code you use to do linear least squares by adding dummy data( see data augmentation) in link below http://www-stat.stanford.edu/~owen/c...larization.pdf no need for quadratic programming / stoch descent etc
Correct. The solution was also given in slide 11 of Lecture 12 (regularization).
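For reference, the closed-form regularized least-squares (ridge) solution being discussed here, with Z the feature matrix and lambda the regularization parameter, is:

```latex
w_{\mathrm{reg}} = \left(Z^{\mathsf{T}} Z + \lambda I\right)^{-1} Z^{\mathsf{T}} y
```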
