#1
This question is similar to the one in the lectures, i.e., in the lecture H1 is h(x) = ax + b. Is this question different from the lecture in that we shouldn't add "b" (i.e., x0, the bias/intercept) when applying it? Or should I treat it the same? My confusion is that in many papers a bias/intercept is assumed even if it is not written out, i.e., h(x) = ax could be considered the same as h(x) = ax + b.
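For concreteness, here is a rough Python/NumPy sketch of the difference (the two sample points are made up by me): with h(x) = ax the design matrix is just the column of x values, while h(x) = ax + b adds a column of 1's for the intercept.

[CODE]
import numpy as np

# Two sample points (x, y) from the target, e.g. f(x) = sin(pi * x)
x = np.array([-0.3, 0.8])
y = np.sin(np.pi * x)

# h(x) = a*x : design matrix is just the x column (no intercept)
X_no_bias = x.reshape(-1, 1)
a_only, *_ = np.linalg.lstsq(X_no_bias, y, rcond=None)

# h(x) = a*x + b : design matrix gets an extra column of 1's for the intercept
X_bias = np.column_stack([np.ones_like(x), x])
b_and_a, *_ = np.linalg.lstsq(X_bias, y, rcond=None)

print("h(x) = a*x     :", a_only)    # one coefficient (the slope)
print("h(x) = a*x + b :", b_and_a)   # intercept and slope
[/CODE]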
#2
__________________
Where everyone thinks alike, no one thinks very much
#3
Thanks for the confirmation, much appreciated.
#4
Is there a best way to minimize the mean squared error? I am doing gradient descent with a very low learning rate (0.00001) and my solution is diverging, not converging. Is it not feasible to do gradient descent with two points when approximating a sine?
Thanks
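In case it helps, here is a minimal gradient descent sketch in Python (the step size, iteration count, and sample points are my own choices) for fitting h(x) = ax to two points by mean squared error, with the closed-form minimizer shown for comparison.

[CODE]
import numpy as np

# Two points drawn from the target f(x) = sin(pi * x)
x = np.array([-0.3, 0.8])
y = np.sin(np.pi * x)

def mse(a):
    """Mean squared error of h(x) = a*x on the two points."""
    return np.mean((a * x - y) ** 2)

def grad(a):
    """Derivative of the MSE with respect to a."""
    return np.mean(2.0 * (a * x - y) * x)

a, eta = 0.0, 0.1              # starting point and learning rate
for _ in range(1000):
    a -= eta * grad(a)

# Closed-form minimizer for comparison: a* = sum(x_i*y_i) / sum(x_i^2)
a_star = np.dot(x, y) / np.dot(x, x)
print("gradient descent:", a, " closed form:", a_star, " mse:", mse(a))
[/CODE]

For a one-parameter quadratic error like this, shrinking the learning rate can only slow convergence down, so divergence usually points to a sign error in the gradient or in the update rather than to the step size.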
#5
Never mind, I got my solution to converge, though I do not trust my answer. Oh well.
#6
You can use the one-step linear regression solution instead (since linear regression is basically the analytical formula for minimizing mean squared error). Also, you can confirm whether the g_bar from your simulation makes sense by calculating it directly: compute the expectation of the hypothesis fitted to each (x1, x2) over [-1, 1] x [-1, 1]. This involves two integrals, but you can plug the expression into Wolfram Alpha or Mathematica.
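For anyone who prefers to stay in code, here is a rough sketch of that check in Python, using SciPy's dblquad for the two integrals instead of Wolfram/Mathematica (the helper name, sample size, and seed are my own choices):

[CODE]
import numpy as np
from scipy import integrate

rng = np.random.default_rng(0)

def slope(x1, x2):
    """Best-fit slope of h(x) = a*x on the two-point dataset (x1, x2)."""
    num = x1 * np.sin(np.pi * x1) + x2 * np.sin(np.pi * x2)
    den = x1 ** 2 + x2 ** 2
    # the ratio tends to pi as (x1, x2) -> (0, 0); guard against 0/0
    return np.where(den > 0, num / np.maximum(den, 1e-300), np.pi)

# g_bar by simulation: average the fitted slope over many random datasets
xs = rng.uniform(-1.0, 1.0, size=(100000, 2))
a_sim = slope(xs[:, 0], xs[:, 1]).mean()

# g_bar directly: E[slope(x1, x2)] with x1, x2 ~ Uniform[-1, 1], i.e. the
# double integral over [-1, 1] x [-1, 1] divided by the area 4
a_int, _ = integrate.dblquad(lambda x2, x1: slope(x1, x2), -1.0, 1.0, -1.0, 1.0)
a_int /= 4.0

print("g_bar slope, simulated:", a_sim, " integrated:", a_int)
[/CODE]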
#7
Use the standard one-step linear regression solution, and note that X has just a single column (no column of 1's).
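For example, a minimal sketch of that one-step fit in Python (the two sample points are arbitrary, and I use NumPy's pinv for the pseudo-inverse):

[CODE]
import numpy as np

# Two-point dataset from f(x) = sin(pi * x)
x = np.array([-0.3, 0.8])
y = np.sin(np.pi * x)

# Design matrix for H = {h(x) = a*x}: a single column of x values,
# with no column of 1's because there is no intercept term
X = x.reshape(-1, 1)

# One-step linear regression: w = pseudo-inverse(X) applied to y
w = np.linalg.pinv(X) @ y
print("fitted slope a =", w[0])
[/CODE]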
__________________
Joe Levy [URL="https://trustmeimastatistician.wordpress.com/"]Trust me, I'm a statistician[/URL]