#11
I have a couple of points, based on not dissimilar experiences of my own.
First, are you watching the calculated error on your out-of-sample data as you train the neural network? In-sample errors are hard to draw conclusions from (unless your data set is very large compared to the complexity of the network). I am not sure what software you are using, but in JNNS, for example, you can see a graph of the out-of-sample error as you train.

Secondly, as a simple test, you could repeat the training with the input data replaced by entirely random data (but keeping the same output data) and compare the results.
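To make that second suggestion concrete, here is a minimal sketch in Python with scikit-learn. The data shapes, network size and variable names are just placeholder assumptions, not anything from this thread; substitute whatever you are actually feeding the network. The idea is simply to train the same network once on the real inputs and once on random inputs with the same outputs, and compare out-of-sample RMSE.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

def oos_rmse(X, y, seed=0):
    # Hold back 30% of the data as an out-of-sample test set.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=seed)
    net.fit(X_tr, y_tr)
    return mean_squared_error(y_te, net.predict(X_te)) ** 0.5

# Placeholder data standing in for whatever you are feeding the network.
rng = np.random.default_rng(0)
X_real = rng.normal(size=(500, 5))
y = X_real[:, 0] + 0.5 * rng.normal(size=500)   # some genuine signal plus noise

X_random = rng.normal(size=X_real.shape)        # same shape, but pure noise inputs

print("OOS RMSE, real inputs:  ", oos_rmse(X_real, y))
print("OOS RMSE, random inputs:", oos_rmse(X_random, y))
print("std dev of outputs:     ", y.std())      # what a 'predict the average' model achieves
```

If the two RMSE figures come out roughly the same, and close to the standard deviation of the outputs, the network is not extracting anything useful from the real inputs.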
#12
Once it gets into that situation, the results are the same both in-sample and out-of-sample. It sounds like an interesting idea to try the training with random data, to see if there is any issue with the input. Thanks.
#13
Quote:
This relationship P(y | x) comprises a deterministic part plus noise, which can arise in different ways but has the same effect on the statistics.

In my experience, if you provide inputs that are explicitly independent of the outputs (so that P(y | x) is pure random noise), a neural network will generally converge to a constant function whose value is the average of the outputs. The reason is that this constant gives the absolute minimum RMSE: the constant c that minimises E[(y - c)^2] is c = E[y]. If a neural network converges to anything else in this case, it must be fitting the noise, which is unlikely unless the number of input points is small compared to the complexity of the network.

I should make clear that my understanding of the above is empirical, with a core of simple probability theory. The detailed behaviour of neural networks is very obscure, and I am glossing over issues such as local minima, merely because I haven't seen them confuse the issue and suspect they generally won't.

As for the technical details, it is useful to monitor the RMSE on out-of-sample data as a neural network is being trained, because this helps distinguish the useful effect of training (generalisation) from the bad effect (overfitting). This applies whether the relationship between inputs and outputs is deterministic, noisy, or non-existent (in the last case the network may first model the average, but then learn the random noise in a way that increases the out-of-sample RMSE); a rough sketch of how to monitor this is at the end of this post.

Can you describe the nature of your data? Is it financial time series data, perhaps?
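If your software does not plot the out-of-sample error during training the way JNNS does, you can reproduce the idea with a small loop. A rough sketch in Python with scikit-learn follows; the layer size, epoch count and data are placeholder assumptions. Train incrementally and record RMSE on a held-out set after each pass, then stop (or roll back) once the out-of-sample curve turns upward.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = X[:, 0] + rng.normal(scale=0.5, size=300)   # placeholder data: weak signal plus noise

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

net = MLPRegressor(hidden_layer_sizes=(30,), learning_rate_init=0.01, random_state=1)

for epoch in range(200):
    net.partial_fit(X_tr, y_tr)                  # one pass over the training data
    rmse_in = mean_squared_error(y_tr, net.predict(X_tr)) ** 0.5
    rmse_out = mean_squared_error(y_te, net.predict(X_te)) ** 0.5
    print(f"epoch {epoch:3d}  in-sample {rmse_in:.3f}  out-of-sample {rmse_out:.3f}")
    # In-sample error keeps falling with more training; once out-of-sample error
    # starts rising, further training is fitting noise rather than generalising.
```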