Short Training/Eval Periods and Differences Between the Prediction Wizard and Adaptive Turboprop 2

Short training periods are always a bad idea. Even with few inputs, short training periods are likely to cause “overfitting”, meaning the net does very well on the training data but is much less likely to generalize to future “out-of-sample” data.

If you decide to ignore this warning, then at least use the Adaptive Turboprop 2 (AT2) add-on. The reason is that AT2 handles training sets in a slightly different way than the Prediction Wizard (PW). If the lookahead period (how far ahead you are predicting) is N bars, then the most recent N bars can’t be in the most recent training set, since the actual answers for those bars aren’t known yet.
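As a concrete illustration, here is a minimal Python sketch of that constraint. It is not NeuroShell code; the make_targets helper and the NaN convention are assumptions made purely for illustration.

    import numpy as np

    def make_targets(prices, lookahead):
        """Build the target series for an N-bar-ahead prediction.

        The target for bar t is the value at bar t + lookahead, so the
        final `lookahead` bars have no known answer yet and stay NaN.
        """
        targets = np.full(len(prices), np.nan)
        targets[:-lookahead] = prices[lookahead:]
        return targets

    # Example: with 10 bars and a 3-bar lookahead, the last 3 targets are NaN,
    # so those 3 bars cannot appear in the most recent training set.
    print(make_targets(np.arange(10, dtype=float), 3))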

The PW leaves those N bars out of the most recent training set, but keeps them in the earlier walk-forward training sets. Doing that gives the earlier out-of-sample periods an advantage, especially if they are short, because the nets they use were trained with N bars of more recent data, whose inputs are therefore very close to those in the out-of-sample set. As a result, older nets may show slightly better results in the backtest than those same nets would get when retrained right up to the current day. This advantage is greater when the training set or the evaluation set is quite small, or when the lookahead period is large. It is almost negligible when the training set is a reasonable size (at least two years of daily data), the walk-forward evaluation set is also reasonable (at least 3 months of daily data), and the lookahead is reasonably short (2 to 10 bars ahead).

AT2 shifts all the training sets back N bars, so all training sets use exactly the same amount of data, and the earlier backtesting gets no particular advantage from more recent data. Therefore, AT2 may not look quite as good during the backtest as the PW, especially with short sets, but its training is probably more realistic.
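The indexing difference between the two schemes can be sketched in a few lines of Python. This is a hypothetical illustration of the logic described above, not NeuroShell source code; the walk_forward_ranges helper, its parameters, and the assumption of back-to-back evaluation windows are simplifications of my own.

    def walk_forward_ranges(total_bars, train_len, eval_len, lookahead, scheme):
        """Return (train_start, train_end, eval_start, eval_end) per window.

        scheme "PW":  historical sets train right up to the evaluation
                      boundary; only the final (live) set drops its last
                      `lookahead` bars, whose targets don't exist yet.
        scheme "AT2": every set is shifted back `lookahead` bars, so each
                      historical set is built exactly like the live one.
        """
        ranges = []
        eval_start = train_len + lookahead  # room for the first training set
        while eval_start < total_bars:
            eval_end = min(eval_start + eval_len, total_bars)
            live = eval_end == total_bars  # the most recent window
            if scheme == "AT2":
                # Shift the whole set back N bars: same length everywhere.
                train_start = eval_start - lookahead - train_len
                train_end = eval_start - lookahead
            elif live:
                # PW's live set simply loses its last N bars (targets unknown).
                train_start = eval_start - train_len
                train_end = eval_start - lookahead
            else:
                # PW's historical sets keep those N bars, using targets that
                # would not have been available in real time.
                train_start = eval_start - train_len
                train_end = eval_start
            ranges.append((train_start, train_end, eval_start, eval_end))
            eval_start = eval_end
        return ranges

    # With a 10-bar lookahead, PW's historical sets end 10 bars closer to the
    # evaluation window than AT2's, and only PW's live set is 10 bars shorter.
    for scheme in ("PW", "AT2"):
        r = walk_forward_ranges(1000, 500, 100, 10, scheme)
        print(scheme, "first:", r[0], "live:", r[-1])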

Furthermore, since AT2 can easily retrain as frequently as every bar, short training sets are a little less likely to cause overfitting in AT2.

Of course, AT2 is limited to 20 hidden neurons, but training with the full number of hidden neurons is also a bad idea for small training sets, since it makes overfitting even more likely.

By the way, the Adaptive Net Indicators add-on also shifts the training set by the lookahead amount.
