  #4  
09-19-2012, 03:57 PM
magdon
RPI
 
Join Date: Aug 2009
Location: Troy, NY, USA.
Posts: 597
Re: Data independence

Yes, if you select Spiderman 2 because you first selected Spiderman 1, then this is indeed non-independent sampling, which is even worse than merely having a mismatch between the training and test probability distributions. When sampling is non-independent, the data set may contain "effectively fewer" data points than its nominal size suggests.
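To put a rough number on "effectively fewer", here is a small sketch. It assumes something the post does not state: that every pair of the n data points has the same correlation rho (the equicorrelation model). Under that assumption, the average of the n points has the same variance as the average of n / (1 + (n - 1) * rho) independent points. The function name `effective_sample_size` and the chosen rho values are only illustrative.

```python
def effective_sample_size(n: int, rho: float) -> float:
    """Effective number of independent points among n equally correlated ones.

    Assumption (not from the post): all pairs share the same correlation rho,
    so Var(mean) = sigma^2 / n * (1 + (n - 1) * rho).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be in [0, 1]")
    return n / (1.0 + (n - 1) * rho)


# Two ratings (e.g. a movie and its sequel) at three illustrative correlations.
two_movies = [effective_sample_size(2, rho) for rho in (0.0, 0.5, 1.0)]

for rho, n_eff in zip((0.0, 0.5, 1.0), two_movies):
    print(f"rho={rho:.1f}: {n_eff:.2f}")
# rho=0.0: 2.00  (fully independent, counts as two points)
# rho=0.5: 1.33  (more than one point, less than two)
# rho=1.0: 1.00  (the second rating adds no new information)
```

In this simplified model, two correlated selections count as somewhere between one and two data points, depending on how strongly the second choice depends on the first.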

Quote:
Originally Posted by gah44
Yes. But one might choose Spiderman 2 not because it is the type that they like, but because it is a sequel to Spiderman 1. Maybe the two should count as more than one movie, but not quite as much as two independent movies would.
__________________
Have faith in probability