#1




Question 10
Running 1 vs 5, my results match one of the possible answers, but it seems that answer is not correct. I cannot say anything more for now... Maybe I made a mistake, but on the other related questions I seem to have the right answers.

#2




Re: Question 10
My results also ended up pointing to only one of the hypotheses. But there could be a mistake in my code; I will only be 100% sure after submission. In fact, the results are not the kind that let you relax.

#3




Re: Question 10
Same here: my simulation points to one of the answers. For some reason, I am not able to take comfort in that. I also ran my simulation for different values of lambda, and they all seem to point to that same answer.

#4




Re: Question 10
When I run the simulation, two of the answers match my result. Now I am really confused about how to proceed. Can somebody help me decide which one I should select? I get these answers repeatedly. Is there some rule on how many iterations gradient descent should be run for? I see the gradient descent error keep decreasing even after 2000 iterations. It makes no difference to the outcome, though.

#5




Re: Question 10
If the error keeps dropping, I would keep going. I didn't use gradient descent, myself. I solved it two ways, getting the same answer with both methods. First, I noted that it could be solved by quadratic programming. Second, I fed it to a (nongradient) conjugate direction set routine that I've had lying around for years.
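
On the question of how many iterations to run: a minimal sketch, assuming the objective is the regularized squared error (1/N)(||Zw - y||^2 + lambda * w'w). Instead of a fixed iteration count, it stops when the gradient norm falls below a tolerance, then checks the result against the closed-form minimizer. The data, the step size eta, and the tolerance are all made-up illustration values, not the homework setup.

```python
import numpy as np

def gd_ridge(Z, y, lam, eta=0.1, tol=1e-8, max_iters=100_000):
    """Gradient descent on (1/N) * (||Z w - y||^2 + lam * w.w).

    Stops when the gradient norm drops below tol (or at max_iters).
    Returns the weights and the iteration index at which it stopped.
    """
    N, d = Z.shape
    w = np.zeros(d)
    for it in range(1, max_iters + 1):
        grad = (2.0 / N) * (Z.T @ (Z @ w - y) + lam * w)
        if np.linalg.norm(grad) < tol:
            break
        w -= eta * grad
    return w, it

# Synthetic stand-in data (hypothetical, NOT the homework data set)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1.0, -1.0)
Z = np.column_stack([np.ones(len(X)), X])

lam = 1.0
w_gd, iters = gd_ridge(Z, y, lam)

# Closed-form minimizer of the same objective, for comparison
w_closed = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)

print("iterations used:", iters)
print("max weight difference:", np.max(np.abs(w_gd - w_closed)))
```

If the error is still dropping when the loop hits its iteration limit, the weights have not converged yet, which fits the advice above to keep going.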

#6




Re: Question 10
Is it also possible to use the regularized normal equation? I'm looking at Lecture 12, Slide 11.
It seems funny to me to choose the parameters to minimize one error measure (mean square), yet evaluate E_in and E_out using another (binary classification).
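
For illustration, here is a rough sketch of that approach, assuming the regularized normal equation has the usual form w_reg = (Z'Z + lambda * I)^(-1) Z'y. The weights are fitted by minimizing squared error, and the error is then measured as the fraction of misclassified points. The function names and the synthetic data are hypothetical, not taken from the homework.

```python
import numpy as np

def ridge_weights(Z, y, lam):
    """Solve (Z^T Z + lam * I) w = Z^T y for w."""
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)

def classification_error(Z, y, w):
    """Fraction of points where sign(Z w) disagrees with the +/-1 label."""
    return float(np.mean(np.sign(Z @ w) != y))

# Synthetic stand-in data (hypothetical, NOT the homework data set)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1.0, -1.0)
Z = np.column_stack([np.ones(len(X)), X])  # prepend the constant coordinate

lam = 1.0
w_reg = ridge_weights(Z, y, lam)           # fitted with squared error
e_in = classification_error(Z, y, w_reg)   # evaluated with binary error
print("w_reg:", w_reg)
print("in-sample classification error:", e_in)
```

Calling classification_error on a separate test set with the same weights would give the corresponding out-of-sample estimate.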
#7




Re: Question 10
__________________
Where everyone thinks alike, no one thinks very much 
#8




Re: Question 10
Thank you, professor. Let me try the analytical solution and see if I get a different result. I am still getting two matching answers from the quiz.

#9




Re: Question 10
Regularised linear regression is called ridge regression in the stats literature.
You can just reuse whatever code you have for linear least squares by adding dummy data (see "data augmentation" in the link below): http://wwwstat.stanford.edu/~owen/c...larization.pdf There is no need for quadratic programming, stochastic descent, etc.
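
A small sketch of the dummy-data trick, assuming the penalty is lambda * w'w on all weights: append sqrt(lambda) * I as extra rows of the data matrix and zeros as extra targets, then call an ordinary least-squares solver. The function name and the random data are made up for illustration.

```python
import numpy as np

def ridge_by_augmentation(Z, y, lam):
    """Ridge regression through plain least squares on augmented data.

    The d extra rows sqrt(lam) * I with target 0 add lam * ||w||^2
    to the sum of squared residuals.
    """
    d = Z.shape[1]
    Z_aug = np.vstack([Z, np.sqrt(lam) * np.eye(d)])
    y_aug = np.concatenate([y, np.zeros(d)])
    w, *_ = np.linalg.lstsq(Z_aug, y_aug, rcond=None)
    return w

# Random stand-in data (hypothetical)
rng = np.random.default_rng(1)
Z = rng.normal(size=(50, 4))
y = rng.normal(size=50)
lam = 0.5

w_aug = ridge_by_augmentation(Z, y, lam)

# Direct solution of (Z^T Z + lam * I) w = Z^T y, for comparison
w_direct = np.linalg.solve(Z.T @ Z + lam * np.eye(4), Z.T @ y)
print("max weight difference:", np.max(np.abs(w_aug - w_direct)))
```

Both routes minimize the same objective, so the two weight vectors should agree up to floating-point error.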
#10




Re: Question 10
__________________
Where everyone thinks alike, no one thinks very much 

