Yes, if you select Spiderman 2 because you first selected Spiderman 1, then this is indeed non-independent sampling, which is even worse than just having a mismatch between the training and test probability distributions. Whenever non-independent sampling takes place, there may be "effectively fewer" data points than the raw count suggests.
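
To put a rough number on "effectively fewer", here is a small sketch. It assumes something not stated in this thread: that the n data points are equally correlated with a common correlation rho and have the same variance. Under that assumption, the variance of their average is inflated by a factor 1 + (n - 1) * rho, so the points carry about as much information as n / (1 + (n - 1) * rho) independent ones. The function name and the value rho = 0.5 are only illustrative.

```python
def effective_sample_size(n: int, rho: float) -> float:
    """Approximate number of independent points among n equally correlated ones.

    Assumes (hypothetically) a common pairwise correlation rho in [0, 1]
    and equal variances. Derived from
    Var(mean) = sigma^2 / n * (1 + (n - 1) * rho).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1] for this sketch")
    return n / (1.0 + (n - 1) * rho)


# Two ratings: a movie and its sequel.
print(effective_sample_size(2, 0.0))  # independent picks: counts as 2 movies
print(effective_sample_size(2, 1.0))  # sequel adds nothing new: counts as 1 movie
print(effective_sample_size(2, 0.5))  # in between: about 1.33 movies
```

With rho = 0 the two ratings count fully, with rho = 1 the sequel is redundant, and anything in between gives a count strictly between one and two.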

Originally Posted by gah44
Yes. But one might choose Spiderman 2 not because it is the type that they like, but because it is a sequel to Spiderman 1. Maybe the two should count as more than one movie, but not quite as much as two independent movies would.
