LFD Book Forum

LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   General Discussion of Machine Learning (http://book.caltech.edu/bookforum/forumdisplay.php?f=105)
-   -   Time Series method similarities (http://book.caltech.edu/bookforum/showthread.php?t=4650)

ksokhanvari 01-18-2016 01:45 PM

Time Series method similarities
 
Dear Professor Abu-Mostafa

First, I would like to add my thanks and appreciation to the countless other messages you have surely received about this wonderful course. I am 53 years old, and although I focused on AI techniques during my Master's degree in computer science, when I was younger :), I did not have the quality of understanding I have gained after completing this course. AI has come a long way in 25 years, and I am very excited to have discovered this class online. Congratulations to you and the Caltech community on this high-quality work.

I do have a question about an application area: financial market forecasting. I have been working in this area for the past 10 years, applying the typical methods of time series analysis to the problem of forecasting time series quantities. It seems to me that while time series analysis is covered in the literature as an entirely separate field, the application of ARIMA and GARCH models, and the parameter-fitting procedures found in the literature and software libraries, share a significant amount of theoretical overlap with machine learning theory.

Could you please comment on how you would map these techniques (similarities and differences) onto the machine learning map you presented? In particular, it seems the data handling and validation procedures should be the same. The ARIMA and GARCH models are just another hypothesis set. The fitting procedures are similar to learning algorithms. The minimum-AIC and maximum-likelihood model selection procedures are similar to regularization, VC dimension analysis, Occam's razor, etc.
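To make the AIC part of that analogy concrete, here is a minimal sketch of my own (an illustration, not something from the course): for a least-squares AR fit with Gaussian residuals, AIC reduces, up to an additive constant, to n * ln(RSS / n) + 2k, where k counts the fitted parameters. Lower AIC trades goodness of fit against model size, much as regularization penalizes complexity.

```python
import numpy as np

def ar_rss(x, p):
    """Residual sum of squares of an ordinary least-squares AR(p) fit."""
    n = len(x)
    lags = [x[p - k - 1 : n - k - 1] for k in range(p)]   # x(t-1), ..., x(t-p)
    X = np.column_stack(lags + [np.ones(n - p)])          # plus an intercept
    y = x[p:]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ w
    return float(r @ r), len(y)

def aic(x, p):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k,
    where k = p lag coefficients + 1 intercept."""
    rss, n = ar_rss(x, p)
    return n * np.log(rss / n) + 2 * (p + 1)
```

Comparing aic(x, p) across candidate orders p then plays the role of model selection, analogous to picking a hypothesis set of the right complexity.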

Additionally, in your experience -- given that machine learning techniques are often applied to the financial forecasting domain today -- have you found that a particular algorithm (e.g. NNs, SVMs) typically performs better in this domain? The evidence in the literature is not clear to me.

I promise to charge the appropriate VC dimension cost to my solution sets! :)


Thanks again,

Regards,

Kamran Sokhanvari

yaser 01-21-2016 01:08 PM

Re: Time Series method similarities
 
Thank you for your post. This is an interesting question that touches on parametric versus non-parametric methods, as well as specialized models versus generic models (as well as other issues). Let me suggest that you take a look at the parametric versus non-parametric discussion in e-Chapter 6 (section 6.2.4), and then continue the dialog in this thread.

ksokhanvari 01-26-2016 03:27 PM

Re: Time Series method similarities
 
Dear Yaser,

Thanks very much for your response. I did take a look at e-Chapter 6 and the distinction between parametric and non-parametric models.

However, to clarify my question: I was also wondering about the overall relationship between the key components of learning theory and the techniques used in machine learning, on the one hand, and the more traditional methods of fitting polynomial models to data, on the other.

Specifically, in the domain of time series analysis we fit a model to the series (e.g. ARIMA models) using the previous values (X(t-1), X(t-2), X(t-3), …) for the AR component and the previous forecast errors (e(t-1), e(t-2), e(t-3), …) for the MA component; once fitted, we proceed to use such a model to forecast X(t+1), X(t+2), etc.

Therefore, we are just fitting (i.e. learning the parameters from previous examples) a polynomial that is linear in its parameters, with the view that the time series values are time-lag correlated, with a decay built in as we move away from the most recent values.
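The point that fitting the AR component is just linear regression on lagged values can be made concrete. A minimal sketch of my own (illustrative only; plain least squares, no MA term):

```python
import numpy as np

def fit_ar(x, p):
    """Fit an AR(p) model by ordinary least squares: regress x(t)
    onto its p previous values plus an intercept."""
    n = len(x)
    lags = [x[p - k - 1 : n - k - 1] for k in range(p)]   # x(t-1), ..., x(t-p)
    X = np.column_stack(lags + [np.ones(n - p)])
    w, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return w                                              # [a1, ..., ap, c]

def forecast_one_step(x, w):
    """One-step-ahead forecast of x(t+1) from the last p observations."""
    p = len(w) - 1
    recent = x[-1 : -p - 1 : -1]                          # x(t), x(t-1), ...
    return float(recent @ w[:p] + w[-1])
```

In learning-from-data terms, the lagged values are just the input features and least squares is the learning algorithm; the hypothesis set is linear models on those features.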

There are two main questions for me,

1) Given the above explicit assumption about the nature of the data in a time series -- do more general models such as NNs, SVMs, and high-dimensional feature regression models have better generalization properties than traditional time series models?

2) Given the procedures for properly implementing machine learning techniques -- such as the use of regularization to avoid overfitting, VC dimension analysis for understanding the number of examples needed, or cross-validation for parameter selection and out-of-sample error estimation -- don't these areas theoretically overlap with the methods used in fitting polynomials in time series analysis?

I am trying to extend what we have learnt in this course and understand areas of theoretical and fundamental overlap and true differences between domains and methods.


Many thanks

magdon 01-27-2016 07:36 AM

Re: Time Series method similarities
 
To answer your questions:

1. The more general non-linear models (including those obtained via a feature transform) may or may not be better. It depends on your time series and on whether the linear dependency on prior X's and prior residuals is a good model for the process. One thing to beware of is that having both the X's and the prior residuals can result in a lot of parameter redundancy and overfitting. Using non-linear models is recommended if the dependency is more complex; the caveat is that such models are easier to overfit and there may be no convenient "closed form" technique for estimating the parameters.

2. Yes, the general setup is the same, and you are well advised to use regularization and care in choosing the "size" of your ARMA model (i.e. how many time steps in the past to auto-regress onto).

HOWEVER, the theory covered in this book is not completely applicable to time series methods; a more detailed theoretical analysis needs to be performed to account for the fact that the training data are NOT independent. This is especially so if you generate your data points by moving one step forward. For this reason, most of the theory regarding time series models starts by assuming that the process follows some law with (typically) Gaussian residuals. Then one can prove that certain ways of estimating the parameters of the ARMA model are optimal, etc. In the learning framework we maintained that the target function is completely unknown and general. So ARMA-type models would more appropriately be classified as "statistics-based" models (see Section 1.2.4).
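One common practical response to the dependence issue is rolling-origin ("walk-forward") evaluation, in which every test point lies strictly after the data used to fit the model. A minimal sketch (an illustration, not from the book), here using a plain least-squares AR fit:

```python
import numpy as np

def ar_fit(x, p):
    """Ordinary least-squares AR(p) fit; returns [a1, ..., ap, intercept]."""
    n = len(x)
    lags = [x[p - k - 1 : n - k - 1] for k in range(p)]
    X = np.column_stack(lags + [np.ones(n - p)])
    w, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return w

def ar_predict(history, w):
    """One-step-ahead forecast from the most recent p values."""
    p = len(w) - 1
    return float(history[-1 : -p - 1 : -1] @ w[:p] + w[-1])

def walk_forward_mse(x, p, window):
    """Rolling-origin evaluation: refit on the most recent `window`
    points, forecast one step ahead, then roll forward. Each test
    point comes strictly after all the data used to fit its model."""
    errs = []
    for t in range(window, len(x) - 1):
        w = ar_fit(x[t - window : t + 1], p)
        errs.append((ar_predict(x[: t + 1], w) - x[t + 1]) ** 2)
    return float(np.mean(errs))
```

This mimics how a forecasting model is actually deployed, so the resulting error is an honest (if conservative) estimate of out-of-sample performance even though the points are dependent.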


ksokhanvari 02-01-2016 11:33 AM

Re: Time Series method similarities
 
Professor Magdon,

Thanks for your response to my question. I think your clarification regarding the independence assumption on the observations is key. However, given the extremely noisy nature of financial time series, Gaussian residuals are also an assumption that does not always hold, as you know.

I am going to try to build several of the models and compare the results to see which ones have better fit characteristics.


Thanks again.

Kamran

pdsubraa 08-26-2017 06:14 AM

Re: Time Series method similarities
 
@Kamran - First of all, thank you for showing great respect for the Professor!

Do let me know once you have built several of the models and compared the results!

I am not an expert in this field - just researching out of personal interest, unofficially!

So consider me a layman - thanks in advance!


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.