Dependent Data
The independence of data seems to be crucial for both theoretical analysis and practical efficiency. What if the sample (x1,y1), ..., (xN,yN) consists of correlated points? For example, x1, ..., xN could be a realization of a Markov chain. Can we still learn from such data? Do we need to change the standard learning algorithms to account for the dependence? Is it possible to introduce a notion of an "effective" number of data points N' < N and then work with the sample as if it were an independent sample of size N'?
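To make the idea of an "effective" sample size concrete, here is a minimal sketch for one special case only: estimating the mean of a stationary Gaussian AR(1) chain x_{t+1} = rho * x_t + noise. In that setting the variance of the sample mean is, for large N, inflated by the factor (1 + rho) / (1 - rho) relative to independent data, which suggests N' = N (1 - rho) / (1 + rho). The AR(1) model, the function names, and the choice of the sample mean as the target are assumptions made for illustration; this does not answer the question for general learning problems.

```python
import random


def effective_sample_size_ar1(n, rho):
    """Large-N effective sample size for the mean of a stationary AR(1) chain.

    Assumes lag-1 autocorrelation rho with -1 < rho < 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1.0 - rho) / (1.0 + rho)


def simulate_ar1(n, rho, seed=0):
    """Draw n points from x_{t+1} = rho * x_t + e_t with e_t ~ N(0, 1)."""
    rng = random.Random(seed)
    # Start from the stationary distribution, whose variance is 1 / (1 - rho^2).
    x = rng.gauss(0.0, 1.0) / (1.0 - rho ** 2) ** 0.5
    xs = [x]
    for _ in range(n - 1):
        x = rho * x + rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs


def lag1_autocorrelation(xs):
    """Empirical lag-1 autocorrelation, used here as a plug-in estimate of rho."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    chain = simulate_ar1(20000, rho=0.8, seed=1)
    rho_hat = lag1_autocorrelation(chain)
    print("estimated rho:", rho_hat)
    print("effective sample size:", effective_sample_size_ar1(len(chain), rho_hat))
```

With rho = 0 the formula returns N' = N, and as rho approaches 1 the effective size shrinks toward zero, which matches the intuition that a strongly correlated chain carries less information per point. Whether a single number N' of this kind can replace N in general learning bounds is exactly the open question posed above.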
