Old 08-25-2012, 12:55 AM
magdon magdon is offline
Join Date: Aug 2009
Location: Troy, NY, USA.
Posts: 597
Re: Recency weighted regression

Originally Posted by itooam View Post
I haven't read your book just doing the online course, I see this thread has been moved from the "general homework" forum to "Chapter 3" of the book forum. If "recency weightings" are explained in your book (please could you confirm?) then I will scour the earth for your book as this area is of much interest. Previously I looked for your book on but couldn't find, maybe I can order internationally through .com or some other shop.
The book does not specifically cover weighted regression; but it does cover linear models in depth. And yes, you can find the book on; unfortunately it is not available on

With respect to your question though, you seem to be confusing two notions of recency:

Let's take a simple example of one stock, which generalizes to the case of multiple stocks. Suppose the stock's price time series is

P_1, P_2, P_3, \ldots

At time t for t>3 you construct the input from the three previous prices,

\mathbf{x}_t=[P_{t-1},P_{t-2},P_{t-3}]

and the target y_t=P_t. You would like to understand the relationship between \mathbf{x}_t and y_t. If you know this relationship, you can predict the future price from previous prices. So suppose you build a linear predictor

y_t\approx \mathbf{w\cdot x_t}.

The learning task is to determine \mathbf{w}. To do this you minimize

E_{in}=\sum_{t>3}(\mathbf{w\cdot x_t}-y_t)^2

You will probably find that the weights in \mathbf{w} are not uniform. For example the weight multiplying P_{t-1} might be the largest; this means that the most recent price P_{t-1} is the most useful in predicting the next price y_t=P_{t}.
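To make the setup above concrete, here is a minimal sketch in Python with NumPy. It assumes three lagged prices as the input and no bias term. The helper names lagged_design and fit_ols, and the synthetic random-walk prices, are made up for illustration and are not from the book or the course.

```python
import numpy as np

def lagged_design(prices, n_lags=3):
    """Build inputs x_t = [P_{t-1}, ..., P_{t-n_lags}] and targets y_t = P_t."""
    prices = np.asarray(prices, dtype=float)
    n = len(prices)
    # Column k holds the price k steps before the target.
    X = np.column_stack([prices[n_lags - k : n - k] for k in range(1, n_lags + 1)])
    y = prices[n_lags:]
    return X, y

def fit_ols(X, y):
    """Minimize sum_t (w . x_t - y_t)^2 over w."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Synthetic prices (an assumption, purely for illustration): a random walk.
rng = np.random.default_rng(0)
prices = 100.0 + np.cumsum(rng.normal(size=500))

X, y = lagged_design(prices)
w = fit_ols(X, y)
print("weights on [P_{t-1}, P_{t-2}, P_{t-3}]:", w)
```

Looking at the entries of w is the first notion of recency: it tells you which of the previous prices matters most for the prediction.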

The notion of recency above should not be confused with recency weighted regression, which addresses the fact that the weights \mathbf{w} may be changing with time (that is, in the stock example, the time series is non-stationary). To accommodate this, you re-weight the data points, giving more weight to the more recent ones. Thus you minimize the error function

E_{in}=\sum_{t>3}\alpha_t(\mathbf{w\cdot x_t}-y_t)^2

The \alpha_t enforce that the more recent data points will have more contribution to E_{in} and so you will choose a \mathbf{w} that better predicts on the more recent data points; in this way older data points play some role, but more recent data points play the dominant role in determining how to predict tomorrow's price.
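Here is a minimal sketch of the weighted fit, again in Python with NumPy. The exponential decay \alpha_t=\lambda^{T-t} (with T the most recent time) is just one possible choice of weights and is my assumption, as are the helper names recency_weights and fit_weighted and the toy data with a change of regime halfway through. The code uses the fact that minimizing the weighted error is the same as ordinary least squares after scaling each row by \sqrt{\alpha_t}.

```python
import numpy as np

def recency_weights(n, decay=0.99):
    """alpha_t = decay**age, where age is 0 for the most recent data point."""
    ages = np.arange(n - 1, -1, -1)
    return decay ** ages

def fit_weighted(X, y, alpha):
    """Minimize sum_t alpha_t (w . x_t - y_t)^2 over w."""
    s = np.sqrt(alpha)
    # Scaling row t by sqrt(alpha_t) turns the weighted problem into plain least squares.
    w, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
    return w

# Toy non-stationary data (an assumption): the true weight switches from 2 to -1.
n = 200
x = np.linspace(1.0, 2.0, n)
X = x[:, None]
y = np.where(np.arange(n) < n // 2, 2.0 * x, -1.0 * x)

w_plain = fit_weighted(X, y, np.ones(n))                  # all points count equally
w_recent = fit_weighted(X, y, recency_weights(n, 0.9))    # recent points dominate
print("unweighted:", w_plain, "recency weighted:", w_recent)
```

With equal weights the fit is a compromise between the two regimes; with the decaying weights it follows the recent regime, which is the one you want for predicting tomorrow's price.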

Thus in the example of time series prediction, there are these two notions of recency at play:

(i) more recent prices are more useful for predicting tomorrow's price

(ii) the relationship between this more recent price and tomorrow's price is changing with time (for example, sometimes it is trend following and sometimes it is mean reverting). In this case, more recent data should be used to determine the relationship between today's price and tomorrow's price.
Have faith in probability