LFD Book Forum  


Old 09-18-2012, 01:12 PM
gah44
Invited Guest
Join Date: Jul 2012
Location: Seattle, WA
Posts: 153
Data independence

I was recently thinking about the Facebook friend-suggestion algorithm,
though I think the same problem could also apply to Netflix.

The usual assumption is that the data points are independent, so each one contributes equally to the solution.

In the FB case, if I am friends with more than one person in a family, FB has a strong tendency to suggest other friends of that family, stronger than it should be. (Though FB doesn't necessarily know that they are related.)

In the Netflix case, if someone likes Spiderman 1, Spiderman 2, and Spiderman 3, those really aren't three independent samples. On the other hand, Spiderman 1 and Batman 1 should be considered more independent.
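
To put a rough number on "not three independent samples", here is a standard textbook simplification, not anything Netflix is known to use. It assumes the N ratings all have the same variance \sigma^2 and the same pairwise correlation \rho; the value \rho = 0.9 is made up for illustration.

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} x_i\right)
  = \frac{\sigma^2}{N}\,\bigl[\,1 + (N-1)\rho\,\bigr],
\qquad
N_{\text{eff}} = \frac{N}{1 + (N-1)\rho}

% Three sequels, assumed \rho = 0.9:
N_{\text{eff}} = \frac{3}{1 + 2(0.9)} = \frac{3}{2.8} \approx 1.07
```

Under that assumption the three sequels are worth only slightly more than one independent sample, while \rho = 0 gives back N_eff = 3.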

It seems to me that there should be enough in the data to extract some of this dependence.
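
One way this might look in practice, as a minimal sketch only: estimate item-to-item correlation from the ratings themselves and down-weight items that are largely redundant with others. The function name `item_weights`, the weighting rule (one over the sum of absolute correlations), and the toy ratings are all assumptions made up for illustration, not how FB or Netflix actually work.

```python
import numpy as np

def item_weights(ratings):
    """Hypothetical helper: ratings is a (users x items) array.

    Returns one weight per item. An item that is strongly correlated
    with other items gets a smaller weight, so a group of k perfectly
    correlated items counts as roughly one item in total.
    """
    corr = np.corrcoef(ratings, rowvar=False)   # item-item correlation matrix
    redundancy = np.abs(corr).sum(axis=0)       # includes the self-correlation of 1
    return 1.0 / redundancy

# Made-up toy data: 6 users; columns are Spiderman 1, 2, 3 and Batman 1.
ratings = np.array([
    [5, 5, 4, 2],
    [1, 2, 1, 4],
    [4, 4, 5, 5],
    [2, 1, 2, 1],
    [5, 4, 5, 3],
    [1, 1, 2, 5],
], dtype=float)

w = item_weights(ratings)
n_eff = w.sum()   # rough "effective number" of independent items

print("weights:", np.round(w, 3))
print("effective number of items:", round(float(n_eff), 2))
```

On this toy data the three Spiderman columns each get a smaller weight than the Batman column, and the weights sum to well under 4, which is the kind of dependence the data itself can reveal.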


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.