LFD Book Forum  

01-22-2014, 02:00 PM
rakhlin (Member, Join Date: Jun 2012, Posts: 24)

VC dimension of time series models

Hello again, dear Professor and all!

I want to determine the VC dimension of time series models in order to avoid overfitting and to estimate the minimum size of the data set.

1. First, a question that may be ill-posed, since it does not specify a hypothesis space. A model takes an input vector x_t = {r_{t-1};...r_{t-k}} of k lagged readings. Can the VC dimension be approximately estimated as k?

2. Second, a concrete time series model I'm working on, based on the article by Liehr and Pawelzik, "A trading strategy with variable investment from minimizing risk to profit ratio", published in Physica A 287 (2000) 524-538.

Let me explain it briefly. Liehr and Pawelzik compare the performance of two related models. Both models construct the series of input vectors by embedding the time series of returns into a space of embedding dimension k: x_t = {r_{t-1};...r_{t-k}}

a) Discrete state model. Only the signs of the k most recent returns are kept, so each input vector maps to one of 2^k distinct states. For example, 5 lagged returns lead to 32 possible states. Each state produces a forecast based on the statistics of the past observations that fall into the same state.

b) RBF neural network. Training is performed by unsupervised adaptation of the centers, followed by gradient descent to adjust the second-layer weights. For comparability, the number of centers (Gaussians) is chosen equal to the number of states, 2^k, in the first model.
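
To make the embedding and the discrete state model in a) concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not the procedure from the paper: the function names (embed, state_of, fit_state_means) are hypothetical, the per-state forecast is assumed to be the mean of the next return, and a zero return is arbitrarily treated as a "down" sign.

```python
from collections import defaultdict

def embed(returns, k):
    """Build (x_t, r_t) pairs with x_t = (r_{t-1}, ..., r_{t-k})."""
    pairs = []
    for t in range(k, len(returns)):
        x_t = tuple(returns[t - j] for j in range(1, k + 1))
        pairs.append((x_t, returns[t]))
    return pairs

def state_of(x_t):
    """Encode the signs of the k lagged returns as an integer in [0, 2^k).

    Assumption: a return of exactly zero counts as a "down" sign (bit 0).
    """
    state = 0
    for r in x_t:
        state = (state << 1) | (1 if r > 0 else 0)
    return state

def fit_state_means(returns, k):
    """Per-state mean of the next return (an assumed forecast statistic)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x_t, r_t in embed(returns, k):
        s = state_of(x_t)
        sums[s] += r_t
        counts[s] += 1
    return {s: sums[s] / counts[s] for s in sums}

# Toy data, made up for illustration only.
returns = [0.01, -0.02, 0.005, 0.01, -0.01, 0.02, -0.005, 0.01]
means = fit_state_means(returns, k=2)
print(len(means), "of", 2 ** 2, "states observed")
```

In this sketch the model has one free number per state, so at most 2^k fitted values; states that never occur in the data simply have no forecast.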

Now, to my question. Liehr and Pawelzik do not use the term 'VC dimension', but they urge avoiding overfitting by using only a small number of Gaussians. In our terms, they relate generalization ability to the number of centers (RBF model) and the number of states (discrete state model). They typically use 5 lagged returns, which results in 32 states/centers. From Lecture 16 of this course I remember that the number of centers in an RBF model can be related to the number of support vectors in an SVM model. The number of support vectors, in turn, is a proxy for the VC dimension.
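
For reference, the two-stage RBF training described in b) could be sketched as follows. This is a rough illustration, not the authors' implementation: Lloyd's k-means is used here as a stand-in for the "unsupervised adaptation of centers" (the paper's exact procedure is not given in this post), and the width parameter gamma, the learning rate, the squared-error loss and all function names are assumptions.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, m, iters=20, seed=0):
    """Lloyd's algorithm: an assumed stand-in for the center adaptation."""
    rng = random.Random(seed)
    centers = rng.sample(points, m)
    for _ in range(iters):
        clusters = [[] for _ in range(m)]
        for p in points:
            j = min(range(m), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return centers

def features(x, centers, gamma):
    """Gaussian activations of the hidden layer, one per center."""
    return [math.exp(-gamma * sq_dist(x, c)) for c in centers]

def train_weights(xs, ys, centers, gamma=1.0, lr=0.1, epochs=200):
    """Batch gradient descent on mean squared error; centers stay fixed."""
    phis = [features(x, centers, gamma) for x in xs]
    w = [0.0] * len(centers)
    n = len(xs)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for phi, y in zip(phis, ys):
            err = sum(wi * pi for wi, pi in zip(w, phi)) - y
            for j, pj in enumerate(phi):
                grad[j] += 2.0 * err * pj / n
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w

def predict(x, centers, w, gamma=1.0):
    """Network output: weighted sum of the Gaussian activations."""
    return sum(wi * pi for wi, pi in zip(w, features(x, centers, gamma)))

# Toy usage with made-up 2-dimensional inputs (k = 2, so 2^k = 4 centers).
xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
ys = [0.0, 1.0, 1.0, 0.0]
centers = kmeans(xs, 4)
w = train_weights(xs, ys, centers, gamma=5.0, lr=0.5, epochs=500)
print(len(w), "second-layer weights for", len(centers), "centers")
```

Once the centers are fixed, the only parameters adjusted by gradient descent are the second-layer weights, one per center, which is 2^k in the setup described above.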

Am I correct that the VC dimension of the two models is approximately 2^k? Or just k?
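
To show why the distinction matters for the data set size, here is a small sketch that applies the course's rule of thumb N >= 10 * d_VC to both candidate answers. It does not settle which answer is right; the helper name min_samples and the choice of k values are just for illustration.

```python
def min_samples(d_vc, factor=10):
    """Rule-of-thumb sample size: N >= factor * d_VC (factor 10 assumed)."""
    return factor * d_vc

# Compare the two candidate VC dimensions, k and 2^k, for a few lags.
for k in (3, 5, 8):
    n_if_k = min_samples(k)
    n_if_2k = min_samples(2 ** k)
    print(f"k={k}: N >= {n_if_k} if d_VC ~ k, N >= {n_if_2k} if d_VC ~ 2^k")
```

For k = 5 the two candidates differ by a factor of more than six in the required number of samples, and the gap grows exponentially with k.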
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.