 LFD Book Forum How to calculate VC dimension for matrix factorization (Netflix-like) tasks
 User Name Remember Me? Password
 FAQ Calendar Mark Forums Read

 Thread Tools Display Modes
#1
 zzzzz Junior Member Join Date: Apr 2012 Posts: 4 How to calculate VC dimension for matrix factorization (Netflix-like) tasks

Let's say we have some users (u) and some movies (m), and some ratings. The variable we control is d, a number of factors for each movie/user.

We would like to know how many data points we need according to VC theory. This number probably depends on d, so in other words, what d we can afford given the data points we have.

So, how do we go about calculating VC dimension here? Is it (u + m) * d?
#2 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595 Re: How to calculate VC dimension for matrix factorization (Netflix-like) tasks

The VC dimension is a measure that characterises the complexity of a "set system" - a set of sets. A hypothesis set consisting of binary classification functions directly corresponds to a set system - a hypothesis corresponds to the set of points on which . The VC dimension measures the diversity of sets in your set system and in the context of binary classification, it has an implication on the generalization ability of the final hypothesis obtained by selecting a "set" that best fits the data.

Now, there is no direct link between the problem of matrix factorization and set systems, so the VC dimension is not yet well defined. So if your task is to observe data and obtain a low rank representation of the user-movie ratings matrix, and you want to know how good your method for factorization is, i.e. how well is it able to recover the true ratings matrix (a form of generalization), there is no immediate way to use the concept of the VC dimension.

We can use the VC dimension if you can formulate a binary classification problem. It seems that when you say "matrix factorization" you are not intersted in the complexity of the matrix factorization problem; you are intersted in the tool "matrix factorization" as a way to predict how a user will rate a movie. To make this problem simpler, say you are interested in whether the user will rate 4 or higher versus lower than 4. We have now constructed a binary classification problem. The next step is to specify the input. In this case, the input is a huge vector corresponding to a user and containing an entry for every movie (the entry is 0 if the movie was not rated by the user and the actual rating ( ) if it was rated); the prediction task is to populate every 0-entry with +1 or -1 indicating whether the user will rate that movie or not.

Now we must define your hypothesis set, and this is where matrix factorization comes in: You are going to use matrix factorization to compute (say) a set of "movie factors". Then, you will summarize the user by constructing the "projection" onto these factors and then constructing a linear model based on these factors: The transformation represents the process of projecting onto the selected factors.

Here is a very important poiont: if the transformation was fixed ahead of time, before seeing the data, then the VC dimension of the model is .

Unfortunately, it is exactly in the construction of that your question comes in, because that is where the matrix factorization is going to be used. That is, is not fixed ahead of time before looking at the data; it is constructed after looking at the data using matrix factorization of the (typically sparsely populated) observed ratings matrix. So to compute the VC dimension you need to consider all the possible ways there are of constructing together with the linear model at the end - i.e. you have a cascade of with . This is similar to a 2-layer neural network, and in general there is no easy way to analyze the VC dimension of this cascade without knowing in detail the diversity of possible that can result from the matrix factorization. If you use a linear approach so where the weights are obtained through your matrix factorization algorithm, then this is just a cascade of linear models which is ultimately just a linear model in ,and the VC dimension is at most (in your notation). In practice it will be a lot better than this because you are not simultaneously tuning ; i.e. you are learning the factors in an essentially unsupervised manner and fixing them (similar to what was done for RBF learning).

Quote:
 Originally Posted by zzzzz Let's say we have some users (u) and some movies (m), and some ratings. The variable we control is d, a number of factors for each movie/user. We would like to know how many data points we need according to VC theory. This number probably depends on d, so in other words, what d we can afford given the data points we have. So, how do we go about calculating VC dimension here? Is it (u + m) * d?
__________________
Have faith in probability
#3 htlin NTU Join Date: Aug 2009 Location: Taipei, Taiwan Posts: 601 Re: How to calculate VC dimension for matrix factorization (Netflix-like) tasks

Assume that the input space to be all the entries in the by (binary) matrix, and each hypothesis is simply sign( ) for two matrices of rank at most . The following paper

http://ttic.uchicago.edu/~nati/Publi...kolaNIPS04.pdf

gives a combinatorial upper bound on the growth function, and thus the VC dimension is bounded by some log factors. Hope this helps.
__________________
When one teaches, two learn.

 Tags matrix factorization, netflix, vc dimension Thread Tools Show Printable Version Email this Page Display Modes Linear Mode Switch to Hybrid Mode Switch to Threaded Mode Posting Rules You may not post new threads You may not post replies You may not post attachments You may not edit your posts BB code is On Smilies are On [IMG] code is On HTML code is Off Forum Rules
 Forum Jump User Control Panel Private Messages Subscriptions Who's Online Search Forums Forums Home General     General Discussion of Machine Learning     Free Additional Material         Dynamic e-Chapters         Dynamic e-Appendices Course Discussions     Online LFD course         General comments on the course         Homework 1         Homework 2         Homework 3         Homework 4         Homework 5         Homework 6         Homework 7         Homework 8         The Final         Create New Homework Problems Book Feedback - Learning From Data     General comments on the book     Chapter 1 - The Learning Problem     Chapter 2 - Training versus Testing     Chapter 3 - The Linear Model     Chapter 4 - Overfitting     Chapter 5 - Three Learning Principles     e-Chapter 6 - Similarity Based Methods     e-Chapter 7 - Neural Networks     e-Chapter 8 - Support Vector Machines     e-Chapter 9 - Learning Aides     Appendix and Notation     e-Appendices

All times are GMT -7. The time now is 02:06 AM.

 Contact Us - LFD Book - Top