LFD Book Forum SGD Movie rating example

#1
07-28-2012, 08:31 AM
 invis Senior Member Join Date: Jul 2012 Posts: 50
SGD Movie rating example

I don't understand the movie rating example from the start of Lecture 10.
Maybe someone can tell me what we are supposed to do with the errors.

How do we minimize all the errors and get the final hypothesis?

Tell me where I'm wrong, please:

1) All we need to do is:

2) Will that be the final hypothesis?

3) Do we compute the gradient of the error like this (using η = 0.1)?

where w(0) = our error.

P.S. All I am trying to do is understand this example from start to finish, up to the point where we have the hypothesis function.
#2
07-28-2012, 01:08 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: SGD Movie rating example

Quote:
 Originally Posted by invis All I am trying to do is understand this example from start to finish, up to the point where we have the hypothesis function.
The hypothesis in this case is $\hat{r}_{ij} = \sum_{k=1}^{K} u_{ik} v_{jk}$ (I am putting a hat to distinguish our estimate of the rating, which is the hypothesis, from the actual rating $r_{ij}$, which is the target). In terms of our standard notation, $h(\mathbf{x}) = \hat{r}_{ij}$, where the indices $(i,j)$ play the role of the input $\mathbf{x}$, and the factors $u_{ik}, v_{jk}$ play the role of the parameters of the hypothesis set. Each set of values of these parameters corresponds to a hypothesis $h$, and the final set of values when SGD terminates corresponds to the final hypothesis $g$.

Now, a step in SGD modifies the parameters according to the gradient of the error on one example ($e_{ij}$ in the slide you quote in your post). You get the partial derivative of $e_{ij}$ with respect to each $u_{ik}$ and each $v_{jk}$, and move along the negative of the gradient.

For each example, there are only $2K$ of these parameters for which the gradient is non-zero (the rest of the parameters are not involved in $e_{ij}$, so the partial derivative with respect to them is zero). When you go to another example, it will involve other parameters, so by the time you have gone through all examples, all the parameters will have been involved.
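
As a minimal sketch of one such step (an illustration, not code from the lecture), assume the per-example squared error $e_{ij} = (r_{ij} - \sum_k u_{ik} v_{jk})^2$, $K$ factors per user and per movie, and a learning rate `eta`. The names `predict` and `sgd_step` and the list-of-lists layout for `U` and `V` are hypothetical choices made for this example:

```python
def predict(u_i, v_j):
    """Estimated rating: inner product of the user and movie factors."""
    return sum(a * b for a, b in zip(u_i, v_j))


def sgd_step(U, V, i, j, r_ij, eta=0.1):
    """One SGD step on the single example (i, j, r_ij).

    Only the 2K parameters in U[i] and V[j] appear in e_ij, so only
    they are updated. Returns the error measured before the update.
    """
    residual = r_ij - predict(U[i], V[j])
    # Partial derivatives of e_ij = residual**2:
    #   d e_ij / d u_ik = -2 * residual * v_jk
    #   d e_ij / d v_jk = -2 * residual * u_ik
    grad_u = [-2.0 * residual * v for v in V[j]]
    grad_v = [-2.0 * residual * u for u in U[i]]
    # Move along the negative gradient (both gradients use the old values).
    U[i] = [u - eta * g for u, g in zip(U[i], grad_u)]
    V[j] = [v - eta * g for v, g in zip(V[j], grad_v)]
    return residual ** 2
```

Calling `sgd_step` once per observed rating, and sweeping over the ratings repeatedly, eventually touches every row of `U` and `V` that appears in the data.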
__________________
Where everyone thinks alike, no one thinks very much
#3
07-29-2012, 01:41 AM
 invis Senior Member Join Date: Jul 2012 Posts: 50
Re: SGD Movie rating example

But I still don't understand the SGD steps. How do I compute the gradient on one example? How do I get the partial derivative of $e_{ij}$ if $u_{ik}$ and $v_{jk}$ are unknown?
#4
07-29-2012, 12:54 PM
 invis Senior Member Join Date: Jul 2012 Posts: 50
Re: SGD Movie rating example

Should I just initialize all U and V to 0 and then start to compute the error, then the gradient, and so on?
#5
07-29-2012, 02:05 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: SGD Movie rating example

Quote:
 Originally Posted by invis Should I just initialize all U and V to 0 and then start to compute the error, then the gradient, and so on?
In general, initializing to small random numbers avoids symmetry problems.
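
To illustrate the symmetry problem with a sketch (an illustration under assumptions, not code from the course): with the squared error $e_{ij} = (r_{ij} - \sum_k u_{ik} v_{jk})^2$, every partial derivative is proportional to one of the factor values, so an all-zero start gives an all-zero gradient and SGD never moves. The helper name `gradients` and the range of ±0.1 are illustrative choices:

```python
import random


def gradients(u_i, v_j, r_ij):
    """Partial derivatives of e_ij with respect to u_i and v_j."""
    residual = r_ij - sum(a * b for a, b in zip(u_i, v_j))
    grad_u = [-2.0 * residual * v for v in v_j]
    grad_v = [-2.0 * residual * u for u in u_i]
    return grad_u, grad_v


K = 3

# All-zero initialization: each derivative is a multiple of a zero factor.
zero_grad_u, zero_grad_v = gradients([0.0] * K, [0.0] * K, r_ij=4.0)

# Small random initialization (seeded only to make the run repeatable).
rng = random.Random(0)
small_u = [rng.uniform(-0.1, 0.1) for _ in range(K)]
small_v = [rng.uniform(-0.1, 0.1) for _ in range(K)]
rand_grad_u, rand_grad_v = gradients(small_u, small_v, r_ij=4.0)

print("zero init gradient:  ", zero_grad_u, zero_grad_v)
print("random init gradient:", rand_grad_u, rand_grad_v)
```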
#6
07-29-2012, 02:27 PM
 invis Senior Member Join Date: Jul 2012 Posts: 50
Re: SGD Movie rating example

OK, so I initialize all U's and V's to random numbers between 0 and 1, for example. Then I compute the first error and get 8.76. How do I compute the gradient from there? I just can't grasp the full algorithm.
#7
07-29-2012, 02:48 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: SGD Movie rating example

Quote:
 Originally Posted by invis OK, so I initialize all U's and V's to random numbers between 0 and 1, for example. Then I compute the first error and get 8.76. How do I compute the gradient from there? I just can't grasp the full algorithm.
All you need to do is compute the partial derivatives in order to get the gradient. Just think of the factors as variables that you are differentiating with respect to.
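
Written out as a sketch, assuming the per-example squared error $e_{ij} = (r_{ij} - \sum_{k} u_{ik} v_{jk})^2$ with $K$ factors and a learning rate $\eta$ (notation assumed here, since the thread itself does not spell it out), the partial derivatives and the resulting update would be:

```latex
\begin{align*}
\frac{\partial e_{ij}}{\partial u_{ik}} &= -2\Big(r_{ij} - \sum_{l=1}^{K} u_{il} v_{jl}\Big)\, v_{jk}, &
\frac{\partial e_{ij}}{\partial v_{jk}} &= -2\Big(r_{ij} - \sum_{l=1}^{K} u_{il} v_{jl}\Big)\, u_{ik}, \\
u_{ik} &\leftarrow u_{ik} - \eta\, \frac{\partial e_{ij}}{\partial u_{ik}}, &
v_{jk} &\leftarrow v_{jk} - \eta\, \frac{\partial e_{ij}}{\partial v_{jk}}.
\end{align*}
```

Both updates use the factor values from before the step, and $k$ runs from $1$ to $K$, which gives the $2K$ parameters touched by one example.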
#8
07-29-2012, 10:50 PM
 invis Senior Member Join Date: Jul 2012 Posts: 50
Re: SGD Movie rating example

Excuse me for being annoying, but I want to be sure that I understand you correctly:

So after computing $e_{ij}$, we do these steps:

1)
2)
3) repeat computing $e_{ij}$ and steps 1-2 for all the data that we have

Looks strange, doesn't it?
#9
07-29-2012, 11:27 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: SGD Movie rating example

Quote:
 Originally Posted by invis
$\nabla e_{ij}$ is a vector of partial derivatives, and the formula for each partial derivative is much simpler than the above formula. May I suggest that you refresh the subject of partial derivatives and vectors and revisit this question?
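
One way to see that the gradient is simply a list of $2K$ partial derivatives is to compare the closed-form derivatives against finite differences. The sketch below assumes the squared error $e_{ij} = (r_{ij} - \sum_k u_{ik} v_{jk})^2$; the function names and the step size `h` are hypothetical choices for illustration:

```python
def error(u_i, v_j, r_ij):
    """Squared error on one example."""
    return (r_ij - sum(a * b for a, b in zip(u_i, v_j))) ** 2


def analytic_gradient(u_i, v_j, r_ij):
    """Gradient as one vector: K derivatives for u_i, then K for v_j."""
    residual = r_ij - sum(a * b for a, b in zip(u_i, v_j))
    return ([-2.0 * residual * v for v in v_j] +
            [-2.0 * residual * u for u in u_i])


def numerical_gradient(u_i, v_j, r_ij, h=1e-6):
    """Central differences: perturb one parameter at a time."""
    K = len(u_i)
    params = list(u_i) + list(v_j)
    grad = []
    for idx in range(2 * K):
        up, down = params.copy(), params.copy()
        up[idx] += h
        down[idx] -= h
        diff = error(up[:K], up[K:], r_ij) - error(down[:K], down[K:], r_ij)
        grad.append(diff / (2 * h))
    return grad


u, v, r = [0.3, 0.7], [0.5, 0.2], 4.0
print(analytic_gradient(u, v, r))
print(numerical_gradient(u, v, r))
```

The two printed vectors should agree to several decimal places, since each entry measures how $e_{ij}$ changes when a single factor is nudged.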
#10
07-30-2012, 09:47 AM
 invis Senior Member Join Date: Jul 2012 Posts: 50
Re: SGD Movie rating example

Is this right:


All times are GMT -7.