LFD Book Forum movie ratings

#1
02-12-2013, 03:25 PM
 ilya239 Senior Member Join Date: Jul 2012 Posts: 58
movie ratings

What is the intuition behind the form of the hypothesis function for movie ratings?

I'm trying to understand why it makes sense to multiply user factor by movie factor, to get that factor's contribution to the rating. E.g. if user doesn't like horror movies and the movie has a low "horror movie" rating, multiplying these together gives a low number. Shouldn't the rating be based on the distance/difference between a user's value for a factor and a movie's value for that factor?

I understand that in a learning situation the factors do not have specific interpretations -- there is just a list of factors. Still, the motivation was clearly that there are factors (horror-ness, comedy-ness etc). So what is the motivation behind taking the product of factors instead of some form of their difference?
#2
02-12-2013, 03:32 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: movie ratings

Quote:
 Originally Posted by ilya239 I'm trying to understand why it makes sense to multiply user factor by movie factor, to get that factor's contribution to the rating. E.g. if user doesn't like horror movies and the movie has a low "horror movie" rating, multiplying these together gives a low number. Shouldn't the rating be based on the distance/difference between a user's value for a factor and a movie's value for that factor?
An inner product is just a model, motivated by the fact that, for unit-norm vectors, the product is maximized when the two vectors match exactly. Your idea of a distance is another model, with its own plausibility. The learning algorithm will adjust the parameter values of each model to minimize the error, and the only objective way of comparing the plausibility of the two models is to compare their out-of-sample performance at that point.
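
To make the matching argument concrete, here is a minimal Python sketch. The vectors and helper names are hypothetical, chosen only for illustration; they are not from the book or the course. For unit-norm vectors, the squared distance satisfies ||u - v||^2 = 2 - 2 (u . v), so in that special case the inner-product model and a distance-based model rank movies identically. Once the norms are free, the two models can differ.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a nonzero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Hypothetical two-factor vectors, all scaled to unit norm.
user = normalize([3.0, 4.0])
movie_match = normalize([3.0, 4.0])       # same direction as the user
movie_other = normalize([4.0, 3.0])       # similar, but not identical
movie_opposite = normalize([-3.0, -4.0])  # opposite direction

for name, movie in [("match", movie_match),
                    ("other", movie_other),
                    ("opposite", movie_opposite)]:
    ip = dot(user, movie)
    d2 = squared_distance(user, movie)
    # For unit vectors, d2 equals 2 - 2 * ip up to rounding.
    print(f"{name:9s} inner product = {ip:+.3f}, squared distance = {d2:.3f}")
```

The inner product is largest, and the distance smallest, for the exactly matching movie, which is the sense in which the inner product has a "matching" aspect.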
__________________
Where everyone thinks alike, no one thinks very much
#3
02-12-2013, 06:22 PM
 ilya239 Senior Member Join Date: Jul 2012 Posts: 58
Re: movie ratings

Ah, if each vector is normalized to unit length then this makes sense. But, there is no way to constrain the vector component values during gradient descent so that the vectors stay at unit length. Or is it that each vector is normalized every time we compute the dot product? I understand wanting the vectors to point in the same direction, but the vector magnitude seems like a distraction.

I know that model parameters needn't have a human-understandable interpretation (cf. hidden layers of neural networks), but if they do, it helps to see that the intuition makes sense.
#4
02-12-2013, 07:05 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: movie ratings

Quote:
 Originally Posted by ilya239 Ah, if each vector is normalized to unit length then this makes sense. But, there is no way to constrain the vector component values during gradient descent so that the vectors stay at unit length. Or is it that each vector is normalized every time we compute the dot product? I understand wanting the vectors to point in the same direction, but the vector magnitude seems like a distraction.
The vectors are not normalized, at least not deliberately. The argument was only meant to motivate that the inner product has a matching aspect. However, even if we consider the magnitude to be a distraction, the learning algorithm has the opportunity to keep the magnitude fixed if that helps reduce the error value.
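
As an illustration of both options, the following Python sketch runs stochastic gradient descent on a tiny factorization problem. By default the vectors are left unconstrained, as described above. Setting `unit_norm=True` rescales each updated vector back to unit length after every step, which is one possible way (a projection step) to keep the norms fixed during gradient descent. The ratings, function names, learning rate, and the projection step itself are assumptions made for this sketch, not the method used in the course.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_to_unit(v):
    """Rescale v to unit length (left unchanged if it is the zero vector)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else v

def sgd_factorize(ratings, n_users, n_movies, k=2, lr=0.05,
                  epochs=200, unit_norm=False, seed=0):
    """Fit user vectors U and movie vectors V so that dot(U[i], V[j]) ~ r."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(n_movies)]
    for _ in range(epochs):
        for i, j, r in ratings:
            err = dot(U[i], V[j]) - r
            # Gradients of 0.5 * err**2, computed before either vector changes.
            grad_u = [err * b for b in V[j]]
            grad_v = [err * a for a in U[i]]
            U[i] = [a - lr * g for a, g in zip(U[i], grad_u)]
            V[j] = [b - lr * g for b, g in zip(V[j], grad_v)]
            if unit_norm:
                # Optional constraint: pull both vectors back to unit length.
                U[i] = project_to_unit(U[i])
                V[j] = project_to_unit(V[j])
    return U, V

def mse(ratings, U, V):
    """Mean squared error of the inner-product predictions."""
    return sum((dot(U[i], V[j]) - r) ** 2 for i, j, r in ratings) / len(ratings)

# Hypothetical (user, movie, rating) triples, with ratings centered in [-1, 1].
ratings = [(0, 0, 1.0), (0, 1, -1.0), (1, 0, -1.0), (1, 1, 1.0), (2, 0, 0.5)]

U0, V0 = sgd_factorize(ratings, 3, 2, epochs=0)       # initial point only
U, V = sgd_factorize(ratings, 3, 2)                   # unconstrained
U_unit, V_unit = sgd_factorize(ratings, 3, 2, unit_norm=True)

print("initial error:      ", mse(ratings, U0, V0))
print("unconstrained error:", mse(ratings, U, V))
print("unit-norm error:    ", mse(ratings, U_unit, V_unit))
```

Which variant is preferable is an empirical question; as noted earlier in the thread, the objective comparison is out-of-sample performance.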
#5
02-12-2013, 07:13 PM
 ilya239 Senior Member Join Date: Jul 2012 Posts: 58
Re: movie ratings

Got it. I guess the learning algorithm cares most about the number of parameters, and forcing normalization would only remove one degree of freedom per vector (two per user-movie pair).

The point of learning is to not have to guess the target function or even its form, but it's hard to resist micromanaging the process.

Thanks!
#6
02-12-2013, 07:17 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: movie ratings

Quote:
 Originally Posted by ilya239 The point of learning is to not have to guess the target function or even its form, but it's hard to resist micromanaging the process.
Nicely put. Sometimes there is a compelling reason to introduce a particular functional form or constraint, but this is the exception, not the rule.
