Learning to Appreciate Turboprop 2’s Ability to Extrapolate

Sometimes Turboprop 2 nets produce extreme prediction values that, on the surface, can look wild. At least one long-time, sophisticated financial user compared these results to results from an older version of NeuroShell that used the “backprop” algorithm. He was convinced backprop did better on his data. Eventually, he accepted the fact that it did not. Let us explain.

First of all, many algorithms (like GRNN) will not extrapolate at all, and others (like backprop) will extrapolate only a little, but Turboprop extrapolates as best it can. The user had input patterns unlike anything in his training set. Turboprop correctly extrapolated, and the answers looked “wild,” as he put it. The other methods do not extrapolate well, so they produced results within the range of the training set outputs. But Turboprop was the one producing the correct output for the input data (even if it was not the answer he wanted), because he did not have an appropriate training set. The backprop answers only looked better because they were more in line with the other outputs. When he put more representative data (like the patterns that went wild) into his training set, he found that the answers were not “wild” anymore.

In other words, the user was getting answers with Turboprop that were more correct (given the training set and inputs) but not what he wanted to see. The backprop answers, though wrong according to the training set and inputs, were closer to what he wanted to see. Thus, he drew the false conclusion that backprop was doing better.

To illustrate this, consider the function that takes one input and squares it. We trained on the following data using Turboprop 2 in the NeuroShell Predictor, GRNN in NeuroShell 2, and backprop in NeuroShell 2 (with and without a linear activation function in the output layer). In NeuroShell 2, we used the <<-1, 1>> input function instead of clipping the inputs. Here is the training set:

input   output
  1        1
  2        4
  3        9
  4       16
  5       25
  6       36
  8       64
  9       81
 10      100

Then we evaluated (completely out of sample) on the following:

input
 11
 12
 13

Here were the results:

input   correct   Turboprop 2   GRNN   Backprop   Backprop-linear
 11       121       122.23       100    112           117.47
 12       144       146.94       100    123.78        135.89
 13       169       174.42       100    133.27        153.63

As you can plainly see, only Turboprop 2 extrapolated well. It did overshoot a little, but not nearly as much as the others undershot. Not bad for a non-linear model.
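The same qualitative behavior is easy to reproduce with ordinary off-the-shelf tools. The following Python sketch is only an illustration under stand-in assumptions, not the NeuroShell algorithms: it uses scikit-learn's KNeighborsRegressor as a GRNN-like model that cannot leave the range of the training outputs, and a small tanh MLPRegressor as a backprop-style net. The exact numbers will differ from the table above, but the nearest-neighbor model will flatten out near 100 on inputs beyond the training range.

# Sketch: non-extrapolating vs. mildly extrapolating models on y = x^2.
# scikit-learn stand-ins, NOT the NeuroShell implementations.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor

# The training set from this article (note that 7 is absent).
X_train = np.array([[1], [2], [3], [4], [5], [6], [8], [9], [10]], dtype=float)
y_train = X_train.ravel() ** 2

# GRNN-like model: it averages nearby training outputs, so its
# predictions can never go above the largest training output (100).
knn = KNeighborsRegressor(n_neighbors=2).fit(X_train, y_train)

# Backprop-style net: extrapolates a little, then flattens as its
# squashing activations saturate outside the training range.
mlp = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                   solver="lbfgs", max_iter=5000,
                   random_state=0).fit(X_train, y_train)

X_test = np.array([[11], [12], [13]], dtype=float)
for x, k, m in zip(X_test.ravel(), knn.predict(X_test), mlp.predict(X_test)):
    print(f"input {x:4.0f}  correct {x**2:6.0f}  knn {k:7.2f}  mlp {m:7.2f}")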

But suppose the answers you were looking for were:

input   correct
 11       107
 12       115
 13       120

The answers you were looking for are clearly not supported by the training data. In fact, no answers are really supported by the training data, because no training input is greater than 10. Nevertheless, you see that GRNN and regular backprop didn't do all that badly, while the Turboprop 2 answers are “wild.” So you conclude that GRNN and backprop are doing better on your problem.

Now imagine that you have 30 inputs in a trained net, and a new pattern comes in to be evaluated. Suppose the pattern is quite far (in Euclidean distance) from the closest pattern in the training set. Turboprop 2 is going to do its best to extrapolate, and it is extrapolating in a 30-dimensional input space. The other models will stay flat. But you haven't done your job, because you are asking the net to predict something for which there are no close patterns. You will think the flat models are “predicting” better if they are closer to the answer you wanted.
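One practical way to detect this situation is to measure how far a new pattern is from its nearest neighbor in the training set. The Python sketch below is a generic check of our own devising, not a NeuroShell feature, and the 2x-median threshold is an arbitrary illustration:

# Sketch: flag evaluation patterns that lie far (in Euclidean distance)
# from every training pattern. A generic check, not a NeuroShell feature;
# the 2x-median threshold is arbitrary and chosen only for illustration.
import numpy as np

def nearest_train_distance(X_train, X_new):
    # Euclidean distance from each row of X_new to its closest training row.
    diffs = X_new[:, None, :] - X_train[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 30))                 # 30 inputs, as above
X_new = np.vstack([rng.normal(size=(1, 30)),         # a typical pattern
                   rng.normal(size=(1, 30)) + 5.0])  # a far-off pattern

# Typical nearest-neighbor spacing inside the training set (excluding self).
d_train = np.sqrt(((X_train[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2))
np.fill_diagonal(d_train, np.inf)
typical = np.median(d_train.min(axis=1))

for d in nearest_train_distance(X_train, X_new):
    status = "extrapolating -- do not trust" if d > 2 * typical else "ok"
    print(f"nearest training pattern at distance {d:.2f}: {status}")

If a pattern is flagged, any model's answer there is an extrapolation, whether it looks wild or comfortably flat.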

Obviously, at some point Turboprop 2 may be extrapolating more than it should, because you really can't extrapolate a non-linear model unless you actually know the formula. At that point, neither Turboprop 2 nor the other models will be giving accurate results, but the other models may still look better because they are flat.

SUMMARY: If Turboprop 2 in either the Trader or the Predictor produces extreme results, you have no doubt asked it to predict well beyond the range of its training set (in Euclidean distance). It is trying to tell you something important, which you should not write off as bad answers. You will usually be making a big mistake if you conclude that some other neural network paradigm is working better just because it has produced closer answers. It will be hard for some of you to swallow that answers farther from what you expected can mean a better model, but if you think about it, you will eventually understand.

Question from user:

I have a data series that is very volatile but tends to cluster for brief periods of time. The whole series is positive and can never be negative. When I run a prediction, there are times when the net delivers a negative prediction. How can I prevent a negative prediction from occurring? I have tried using the log of the number in the indicators, but I still sometimes get a negative prediction.

Answer:

Predictions are going negative because Turboprop 2 is extrapolating. Basically, as you go forward in time, it is seeing data unlike what it has seen (learned) before.

Here are several solutions:

1. View the negative numbers as flags that the net is extrapolating and therefore the predictions should not be trusted at those times.
2. Train the nets on data that shows more diversity in the values of the inputs.
3. Clip the prediction at 0. Use the If-Then rule roughly this way: IF prediction < 0 THEN 0 ELSE prediction (see the sketch after this list).
4. Use Adaptive Net Indicators, which NEVER extrapolate. ANI is one of our add-ons.
5. Neural Indicators (another add-on) don't extrapolate either; they would also be a good choice, although their outputs only range from -1 to 1, so they don't predict actual values.
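For solution 3, the clipping rule is a one-liner in most environments. The Python sketch below is a generic illustration of the same If-Then logic, not NeuroShell's rule syntax:

# Sketch: clip a negative prediction to zero (solution 3 above).
# Generic Python, not NeuroShell's If-Then rule syntax.
def clip_prediction(prediction):
    # IF prediction < 0 THEN 0 ELSE prediction
    return 0.0 if prediction < 0 else prediction

print(clip_prediction(12.7))   # 12.7 (unchanged)
print(clip_prediction(-3.4))   # 0.0 (clipped; also a flag that the net
                               # was extrapolating at this point)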
