Over-fitting can happen to any kind of model, not just neural nets. It occurs when the model works great on the data with which it was built (the training set), but then does poorly on the out-of-sample evaluation sets. Basically, it has learned the “noise” in the data, or it has “memorized” the patterns instead of “learning” the relationships. Taken to the extreme, any good modeling technique will memorize the patterns outright if there are no more training patterns than the model has free parameters. Even linear regression will do this: a straight line will fit any 2 points exactly, and a plane in three dimensions exactly fits any 3 points. You couldn’t ask for a better fit on those points, but when more data arrives, the same line or plane may not fit it at all.
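To make the linear-regression illustration concrete, here is a minimal sketch in plain Python with numpy (our own example, not part of the Trader): a line that fits its two training points perfectly yet misses a new, hypothetical third point badly.

    import numpy as np

    # Two training points: a straight line (two free parameters) fits them exactly.
    x_train = np.array([1.0, 2.0])
    y_train = np.array([3.0, 5.0])
    slope, intercept = np.polyfit(x_train, y_train, deg=1)  # the line y = 2x + 1

    # Zero error on the training set -- the model has "memorized" it.
    print(np.polyval([slope, intercept], x_train) - y_train)  # [0. 0.]

    # A third, out-of-sample point (hypothetical) need not lie on that line at all.
    x_new, y_new = 3.0, 4.0
    print(np.polyval([slope, intercept], x_new) - y_new)  # about 3.0, a large evaluation error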
How do I know when I am over-fitting?
You have to do several walk-forward tests and make sure that the evaluation sets do well compared to the training sets. Of course, they usually won’t do AS well, but you’d like them not to be too much worse on average error or profit.
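For readers who want to see the mechanics, here is one way to sketch a walk-forward test in Python (the fit and score functions are placeholders for whatever modeling method and error or profit measure you use; this is not the Trader’s own procedure):

    def walk_forward(data, train_bars, eval_bars, fit, score):
        """Slide a training window forward; always evaluate on the unseen bars after it."""
        results = []
        start = 0
        while start + train_bars + eval_bars <= len(data):
            train = data[start : start + train_bars]
            evaluation = data[start + train_bars : start + train_bars + eval_bars]
            model = fit(train)
            # Record in-sample vs. out-of-sample performance for this window.
            results.append((score(model, train), score(model, evaluation)))
            start += eval_bars  # step forward so evaluation sets never overlap
        return results

Comparing the paired numbers window by window is exactly the “evaluation sets versus training sets” check described above.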
So if my evaluation sets are poor and my training sets are good, I’m over-fitting?
Not necessarily; there are other reasons this can happen. One is that the market has changed, and what worked well during training simply didn’t work well during evaluation.
What can cause over-fitting, and how can I avoid it?
To an extent, the Turboprop 2 paradigm in the Trader is built to prevent over-fitting. That is why other nets, like those in NeuroShell 2, sometimes do better on the TRAINING set. That is also why performance on the training set is largely irrelevant to measuring how good a neural network is; you have to see how well you are doing on out-of-sample data. However, even Turboprop 2 can over-fit. It usually happens when there are too many inputs, too few training bars, or both. The catch is that increasing the size of your training set may mean going back to a period when the market was very different. It is a balancing act. Therefore, the easiest and safest way to prevent over-fitting is to keep your number of inputs small.
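As a rough illustration of the inputs-versus-training-bars balance, consider a simple sanity check (the 10-bars-per-input ratio below is our own illustrative assumption, not a Turboprop 2 rule):

    def overfit_risk(num_inputs, num_training_bars, min_bars_per_input=10):
        """Flag a net that has too few training bars relative to its input count."""
        return num_training_bars < min_bars_per_input * num_inputs

    print(overfit_risk(num_inputs=20, num_training_bars=100))  # True: only 5 bars per input
    print(overfit_risk(num_inputs=5, num_training_bars=500))   # False: 100 bars per input

Either cutting inputs or adding bars moves you to the safe side of the ratio; cutting inputs is the safer of the two, for the reason given above.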
Anything else I can do?
As a matter of fact, “noise” in financial markets is just movement we don’t have variables to explain. Our web page example shows that if you have the right variables, there is no noise and thus no over-fitting. So if you can get better inputs, that will help too.
How does walk-forward testing help in this?
As previously mentioned, if your evaluation sets in walk-forward testing look good, you’re probably ok. How good is good? That’s a judgment call, and it depends on several things: your expectations, the particular issue, how good the training set is, and so on. In the old days of neural nets, before the Trader made walk-forward testing so easy, people didn’t do it much, and they traded with nets that merely looked good on training. They got burned, and neural nets sometimes got a bad rap because of it.
How many walk-forward tests should I do, how long should they be, and do they all have to look good?
More judgment calls; there’s no science here. If you go back too far in time, you may be in a different market scenario that is irrelevant to predicting tomorrow. Therefore, you may decide it’s ok if old evaluation sets look poor as long as recent ones look good. You may decide to do only one walk-forward test covering 6 months. You may decide to do four or five and look for half to be good and the rest to be not unreasonably bad. It is really up to you.
How does the Trading strategy fit into all of this? Can it over-fit too?
Yes, you can over-fit that too. If you back-test only on a short period of time and then tediously search for the exact thresholds that maximize trading profits, you run the risk of over-fitting again. You might want to leave some recent period out of the back-test when deciding on good thresholds, and then see if a re-run of the back-test using all the data looks good too. Alternatively, don’t fit your thresholds too tightly. However, if you are using the out-of-sample net and you are setting thresholds over a long period, you may not have to worry too much about over-fitting the trading strategy.
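Here is a sketch of that hold-out idea for thresholds (backtest_profit is a hypothetical function standing in for your back-test, and the size of the held-out recent period is up to you):

    def choose_threshold(predictions, prices, candidates, holdout_bars, backtest_profit):
        # Pick the threshold using only the older portion of the back-test period...
        fit_preds = predictions[:-holdout_bars]
        fit_prices = prices[:-holdout_bars]
        best = max(candidates, key=lambda t: backtest_profit(fit_preds, fit_prices, t))
        # ...then confirm it still performs on the recent bars it never saw.
        recent_profit = backtest_profit(predictions[-holdout_bars:], prices[-holdout_bars:], best)
        return best, recent_profit

If the threshold chosen on the older bars still profits on the held-out recent bars, you have some evidence you haven’t merely tuned it to the past.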
All this seems fuzzy and subjective.
It is; don’t look for hard rules. These are fuzzy rules. The bottom line is that when we predict markets in the future, all we have to go on is how well our system did in the past. When you get right down to it, the past is not guaranteed to be prologue to the future, but it is all we have. You could have great evaluation periods and the net could still fail miserably, because the situation changes and your variables are no longer good tomorrow. Only YOU can decide when you are ready to risk your money on an automated system. But don’t look for perfection, or you’ll never start trading!