LFD Book Forum Q4) h(x) = ax
#1
08-07-2012, 03:57 AM
 itooam Senior Member Join Date: Jul 2012 Posts: 100
Q4) h(x) = ax

This question is similar to the one in the lectures, where H1 is

h(x) = ax + b

Does this question differ from the lecture in that we shouldn't add "b" (i.e., the bias/intercept term that goes with x0)? Or should I treat it the same way?

My confusion arises because many papers assume a bias/intercept even when it is not written out, i.e., h(x) = ax could be read as shorthand for h(x) = ax + b.
#2
08-07-2012, 04:24 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Q4) h(x) = ax

Quote:
 Originally Posted by itooam This question is similar to that in the lectures i.e., in the lecture H1 equals h(x) = ax + b Is this question different to the lecture in the respect we shouldn't add "b" (i.e., X0 the bias/intercept) when applying? Or should I treat the same? My confusion is because in many papers etc a bias/intercept is assumed even if not specified i.e., h(x) = ax could be considered the same as h(x) = ax + b
There is no bias/intercept in this problem, only the slope (one parameter, which is a).
__________________
Where everyone thinks alike, no one thinks very much
#3
08-07-2012, 04:36 AM
 itooam Senior Member Join Date: Jul 2012 Posts: 100
Re: Q4) h(x) = ax

Thanks for the confirmation, much appreciated.
#4
01-31-2013, 10:16 AM
 geekoftheweek Member Join Date: Jun 2012 Posts: 26
Re: Q4) h(x) = ax

Is there a best way to minimize the mean-squared error? I am doing gradient descent with a very low learning rate (0.00001) and my solution is diverging rather than converging. Is gradient descent not feasible with two points when approximating a sine?
Thanks
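
For readers hitting the same problem, here is a minimal sketch of gradient descent for h(x) = ax on two points, checked against the closed-form least-squares slope. The target sin(pi*x), the sample points, the learning rate, and the function names are all assumptions for illustration; the thread only says "a sine". The mean-squared error is a quadratic in a, and for two points in [-1,1] the update below contracts for any learning rate under 1, so divergence at a rate like 0.00001 more likely points to a sign error in the update than to the method itself.

```python
import math

def target(x):
    # Assumed target function; the thread only mentions "a sine".
    return math.sin(math.pi * x)

def fit_slope_gd(xs, ys, lr=0.1, steps=2000):
    """Minimize the mean-squared error of h(x) = a*x by gradient descent."""
    n = len(xs)
    a = 0.0
    for _ in range(steps):
        # d/da of (1/n) * sum((a*x - y)^2) is (2/n) * sum(x * (a*x - y))
        grad = (2.0 / n) * sum(x * (a * x - y) for x, y in zip(xs, ys))
        a -= lr * grad  # step against the gradient
    return a

def fit_slope_closed_form(xs, ys):
    """Least-squares slope through the origin: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

xs = [0.3, -0.8]                 # hypothetical sample points in [-1, 1]
ys = [target(x) for x in xs]
a_gd = fit_slope_gd(xs, ys)
a_cf = fit_slope_closed_form(xs, ys)
print(a_gd, a_cf)
```

If the two printed values disagree, the gradient or the sign of the update is the first thing to check.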
#5
01-31-2013, 11:09 AM
 geekoftheweek Member Join Date: Jun 2012 Posts: 26
Re: Q4) h(x) = ax

Never mind, I got my solution to converge, though I do not trust my answer. Oh well.
#6
01-31-2013, 03:34 PM
 sanbt Member Join Date: Jan 2013 Posts: 35
Re: Q4) h(x) = ax

Quote:
 Originally Posted by geekoftheweek Never mind, I got my solution to converge, though I do not trust my answer. Oh well.
You can use linear regression to calculate each hypothesis (since linear regression is basically an analytical formula for minimizing the mean squared error).

Also, you can check whether your g_bar from simulation makes sense by calculating it directly (compute the expectation of the hypothesis from each (x1,x2) over [-1,1] x [-1,1]). This involves two integrals, but you can plug the expression into Wolfram Alpha or Mathematica.
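
The direct calculation can also be approximated numerically instead of symbolically. The sketch below assumes x1 and x2 are independent and uniform on [-1,1], that the target is sin(pi*x), and that each hypothesis is the least-squares slope (x1*f(x1) + x2*f(x2)) / (x1^2 + x2^2); the function name and grid size are made up for illustration. It averages that slope over a midpoint grid, which approximates the double integral divided by the area of the square.

```python
import math

def expected_slope(target, n=400):
    """Midpoint-rule estimate of E[a] for x1, x2 uniform on [-1, 1]."""
    h = 2.0 / n
    # Midpoints of n equal cells; with even n none of them is exactly 0,
    # so x1^2 + x2^2 is never zero.
    grid = [-1.0 + (i + 0.5) * h for i in range(n)]
    xf = [x * target(x) for x in grid]  # x * f(x) at each grid point
    xx = [x * x for x in grid]          # x^2 at each grid point
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Least-squares slope for the two-point data set (x_i, x_j)
            total += (xf[i] + xf[j]) / (xx[i] + xx[j])
    return total / (n * n)  # grid average = integral / area of the square

a_bar = expected_slope(lambda x: math.sin(math.pi * x))
print(a_bar)
```

A quick sanity check is a linear target such as f(x) = 2x, for which every two-point fit returns slope 2 and so the average must be 2 as well.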
#7
02-01-2013, 06:49 AM
 melipone Senior Member Join Date: Jan 2013 Posts: 72
Re: Q4) h(x) = ax

I thought it would simply be (y1/x1 + y2/x2)/2 to find an a that minimizes the mean square error on two points, no?
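
As a check on this guess: setting the derivative of (a*x1 - y1)^2 + (a*x2 - y2)^2 with respect to a to zero gives a = (x1*y1 + x2*y2) / (x1^2 + x2^2), which matches the mean of the two ratios only in special cases (for example when |x1| = |x2|). The short sketch below compares the two on one hypothetical pair of points; the function names and the data are invented for illustration.

```python
def slope_least_squares(x1, y1, x2, y2):
    # Minimizer of (a*x1 - y1)^2 + (a*x2 - y2)^2 over a
    return (x1 * y1 + x2 * y2) / (x1 * x1 + x2 * x2)

def slope_mean_of_ratios(x1, y1, x2, y2):
    # The formula proposed in the post above
    return (y1 / x1 + y2 / x2) / 2

def squared_error(a, x1, y1, x2, y2):
    return (a * x1 - y1) ** 2 + (a * x2 - y2) ** 2

pts = (0.1, 0.3, 0.9, 0.3)  # hypothetical (x1, y1, x2, y2)
a_ls = slope_least_squares(*pts)
a_mr = slope_mean_of_ratios(*pts)
print(a_ls, squared_error(a_ls, *pts))
print(a_mr, squared_error(a_mr, *pts))
```

On these points the least-squares slope gives the smaller squared error, because the mean of ratios lets the point with the small x pull the slope too far.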
#8
02-01-2013, 10:36 AM
 Anne Paulson Senior Member Join Date: Jan 2013 Location: Silicon Valley Posts: 52
Re: Q4) h(x) = ax

So, in this procedure we:

Pick two points;
Find the best slope for those two points, the one that minimizes the squared error for those two points;
Do this N times and average all the a's

Rather than:

Pick two points;
Calculate the squared error for those two points as a function of a;
Do this N times, then find the a that minimizes the sum of all of the squared errors, as we do with linear regression

Are we doing the first thing here or the second thing? Either way there's a simple analytic solution, but I'm not sure which procedure we're doing.
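
The two procedures listed above can be put side by side in a small simulation. This is only a sketch under stated assumptions: the target is taken to be sin(pi*x), inputs are drawn uniformly from [-1,1], and the names (compare_procedures, n_trials, seed) are made up for illustration. Under these assumptions the pooled slope of the second procedure should approach 3/pi (about 0.955), while the average of per-pair slopes from the first procedure settles at a different, larger value.

```python
import math
import random

def target(x):
    # Assumed target function; not stated explicitly in the thread.
    return math.sin(math.pi * x)

def compare_procedures(n_trials=50_000, seed=0):
    rng = random.Random(seed)
    sum_a = 0.0    # procedure 1: running sum of per-pair best slopes
    used = 0       # number of pairs that produced a slope
    sum_xy = 0.0   # procedure 2: pooled sum of x*y over all points
    sum_xx = 0.0   # procedure 2: pooled sum of x*x over all points
    for _ in range(n_trials):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        y1, y2 = target(x1), target(x2)
        sxy = x1 * y1 + x2 * y2
        sxx = x1 * x1 + x2 * x2
        if sxx == 0.0:
            continue  # both points at the origin: slope undefined, skip
        sum_a += sxy / sxx  # best slope for this pair alone
        used += 1
        sum_xy += sxy
        sum_xx += sxx
    # (average of per-pair slopes, single slope fitted to all points)
    return sum_a / used, sum_xy / sum_xx

a_avg, a_pooled = compare_procedures()
print(a_avg, a_pooled)
```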
#9
02-01-2013, 11:19 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Q4) h(x) = ax

Quote:
 Originally Posted by Anne Paulson So, in this procedure we: Pick two points; Find the best slope for those two points, the one that minimizes the squared error for those two points; Do this N times and average all the a's Rather than: Pick two points; Calculate the squared error for those two points as a function of a; Do this N times, then find the a that minimizes the sum of all of the squared errors, as we do with linear regression Are we doing the first thing here or the second thing? Either way there's a simple analytic solution, but I'm not sure which procedure we're doing.
The first method estimates a for the average hypothesis (which takes into consideration only two points at a time). The second method estimates a for the best approximation of the target function (which takes into consideration all the points in the input space at once).
#10
02-01-2013, 11:28 AM
 Anne Paulson Senior Member Join Date: Jan 2013 Location: Silicon Valley Posts: 52
Re: Q4) h(x) = ax

OK, and then the average value of a *is* the expected value of a.

All times are GMT -7.