Training Set

The training set is used to “teach” the neural network the patterns that it should recognize. NeuroShell Trader creates the training set by calculating and retrieving the inputs that you have selected in the prediction input list.

For more information: What is Training Data?

Paper Trading (calibrating your model so that it has the best chance of working in the future)

Paper trading has a special meaning in NeuroShell Trader Professional and DayTrader Professional. Paper trading is activated when you choose the option called “Save optimization which performs best on later paper trading”, either in the Prediction Wizard or the Trading Strategy Wizard.

Before we explain what it is, a little background is necessary about a modeling problem common to all model building with past data.

Any time you optimize, there is a possibility that you may “over-fit”, which means that you build a model that worked so well in the past that it fit the past noise too, so that it probably will not work well in the future. (Editor’s note: some people use the term “curve fitting” to describe the over-fitting phenomenon, but that is a misnomer because all modeling from past data is curve fitting. It is over curve fitting, i.e. over-fitting, that is the problem.)

Over-fitting can occur even if you do not optimize, because when you backtest different strategies to see which works best, you are in fact optimizing manually! However, machine optimizing increases the odds of over-fitting because it is so much more efficient. The possibility of over-fitting is reduced by not optimizing, by optimizing over plenty of historical data, and/or by optimizing as few parameters as possible.

One way that has been traditionally used to test if over-fitting has occurred is simply to see how the model holds up on new data. There are a couple of ways that can be done:

a. Of course you could simply watch your model for several weeks or months and “paper trade” it in the traditional sense to see if you would have made money. Few of us want to take the time to do that, however.

b. Another way to do this in NeuroShell is to use the check box called “Start trading before last chart date” in the Dates tab. That will allow you to select how much time to “hold out” of the optimization for evaluation of the model after it is optimized. Optimization will take place on an earlier period of the chart, and the final backtesting (evaluation) using the model will take place on the period immediately after that. The disadvantage of this “hold out” approach is that even if you are satisfied that the model held up during evaluation, you essentially have an old model, one that was not built on the latest data (although it was evaluated on the latest data). In order to use “Start trading before last chart date”, you must turn on the option that enables it from the menu Tools->Options->Advanced.
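The hold-out idea in b. can be sketched in a few lines of Python. This is only an illustration of the concept, not how NeuroShell Trader implements it; the function name split_holdout and the bar counts are hypothetical.

```python
def split_holdout(bars, holdout_bars):
    """Split a chronological list of bars into (optimization, evaluation).

    The last `holdout_bars` bars are held out of optimization and used
    only to evaluate the finished model.
    """
    if not 0 < holdout_bars < len(bars):
        raise ValueError("holdout_bars must be between 1 and len(bars) - 1")
    cut = len(bars) - holdout_bars
    return bars[:cut], bars[cut:]


# Hypothetical chart of 100 bars, numbered 0..99, with the last 20 held out.
bars = list(range(100))
optimization, evaluation = split_holdout(bars, 20)
```

Here the model would be built on bars 0 to 79 and evaluated on bars 80 to 99, which is why the resulting model is "old": it never saw the most recent 20 bars while being built.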

Consider the approach b. (above) further. Using the evaluation of new data (called out-of-sample data by statisticians), what you will wind up doing is repeatedly optimizing models until you find the one that works the best on the new data, i.e. the one that shows up best during the evaluation (out-of-sample period). This is sometimes called “data snooping” because your evaluation data is not really out-of-sample anymore. Nevertheless it is still the most effective method to arrive at a model which has the best likelihood of holding up as you trade with it in the future.

Therefore, we have automated this process of building models, then evaluating them, keeping the one that works best during evaluation. The automated method is invoked when you check the box called “Save optimization which performs best on later paper trading” in the Dates tab. (Editor’s note: Ward Systems Group invented the method for use with neural nets years ago. We included it in all of our old software, and the technique, there called “calibration”, has been copied and adopted by other vendors and academia.)
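The selection step of this automated process can be illustrated with a small Python sketch. It assumes that each candidate model produced during optimization has already been scored on both periods; the candidate names and profit figures are made-up toy values, and select_by_paper_trading is a hypothetical helper, not a NeuroShell function.

```python
# Toy candidates: each was optimized on the earlier period, then scored
# on the later paper trading period. All numbers are illustrative only.
candidates = [
    {"name": "A", "optimization_profit": 30.0, "paper_profit": -2.0},
    {"name": "B", "optimization_profit": 18.0, "paper_profit": 6.5},
    {"name": "C", "optimization_profit": 22.0, "paper_profit": 3.0},
]


def select_by_paper_trading(models):
    """Keep the model that performed best on the paper trading period."""
    return max(models, key=lambda m: m["paper_profit"])


best_in_sample = max(candidates, key=lambda m: m["optimization_profit"])
saved = select_by_paper_trading(candidates)
```

In this toy case model A looks best during optimization but loses money afterwards, so model B is the one that would be saved.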

Of course, if you data snoop, statisticians will say that you still haven’t properly evaluated your model with real out-of-sample data. Then again, statisticians usually assume normal distributions and a number of other conditions not present in market trading. However, if you want to build your model with paper trading and still satisfy the statistician in you, you can select both of the boxes:

“Save optimization which performs best on later paper trading”

“Start trading before last chart date”

That will enable saving the model which works best on paper trading, while still giving you a real out-of-sample period as described in b. above. The disadvantage is that you have an old model, at least as old as the out-of-sample trading period. Given that the market is frequently changing, and we suspect the number of statisticians who got rich in the market is quite small, we suggest that you consider using the paper trading feature without the added out-of-sample period.
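As a rough sketch, selecting both boxes divides the chart into three consecutive periods. The Python below assumes the order is optimization, then paper trading, then out-of-sample trading; the function name and the bar counts are hypothetical.

```python
def split_periods(n_bars, paper_bars, out_of_sample_bars=0):
    """Return index ranges for (optimization, paper trading, out-of-sample).

    Assumed order: optimization first, paper trading next, and the
    out-of-sample trading period last (it may be empty).
    """
    opt_end = n_bars - paper_bars - out_of_sample_bars
    if paper_bars < 0 or out_of_sample_bars < 0 or opt_end <= 0:
        raise ValueError("periods do not fit inside the chart")
    paper_end = opt_end + paper_bars
    return (range(0, opt_end),
            range(opt_end, paper_end),
            range(paper_end, n_bars))


# Both boxes checked: the newest 50 bars are never used to build the model.
opt, paper, oos = split_periods(500, paper_bars=100, out_of_sample_bars=50)

# Paper trading only: the model is selected using the latest data.
opt2, paper2, oos2 = split_periods(500, paper_bars=100)
```

With the extra out-of-sample period the saved model is at least 50 bars old in this example; without it, paper trading runs right up to the last bar.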

Objective

The objective is the method used to train the net. NeuroShell Trader allows you to train the net using a variety of objective functions. The objective is the measure by which the prediction determines what the “best” neural net is during training and optimization. Please refer to Error Objective Functions and Trading Objective Functions for more information.

Minimizing Prediction Error is the default method of training a neural network and has historically been the method chosen to train neural networks. However, NeuroShell Trader offers a new and powerful set of ways to train neural networks, including a variety of ‘Training on Profit’ methods.

If you choose to train by Minimizing Prediction Error, or any of the other ‘non-Train on Profit’ objectives, you may specify Trading Rules to view how well the Neural Network would have traded. This can often be helpful in determining the usefulness of a prediction, because a prediction may have very little error but still not be good enough to trade with. Conversely, a prediction may have a large error but trade very well.
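A toy Python example shows how prediction error and trading results can disagree. The numbers are invented for illustration, and the trading rule (go long on a positive prediction, short on a negative one) is a simplifying assumption, not the rule set used by NeuroShell Trader.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def sign_following_profit(actual, predicted):
    """Profit from going long on positive predictions, short on negative."""
    profit = 0.0
    for a, p in zip(actual, predicted):
        position = 1 if p > 0 else -1 if p < 0 else 0
        profit += position * a
    return profit


# Illustrative percent changes and two hypothetical predictions of them.
actual = [0.1, -0.1, 0.1, -0.1]
small_error_wrong_sign = [-0.1, 0.1, -0.1, 0.1]
large_error_right_sign = [2.0, -2.0, 2.0, -2.0]

mse_small = mean_squared_error(actual, small_error_wrong_sign)
mse_large = mean_squared_error(actual, large_error_right_sign)
profit_small = sign_following_profit(actual, small_error_wrong_sign)
profit_large = sign_following_profit(actual, large_error_right_sign)
```

The first prediction has the smaller error but gets every direction wrong and loses money; the second is far off in magnitude but gets every direction right and is profitable.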

When selecting any of the ‘Training on Profit’ objectives, you must specify the associated trading rules (Both Long and Short, Long Positions Only, or Short Positions Only, as well as the trading rule thresholds). These rules can be as important to making a profitable network as selecting the proper inputs. For example, rules that are too strict might result in a trading strategy that never makes any trades and thus never makes any money.

Additionally, you may select Find the optimal trading rules, which will automatically find an optimal set of trading thresholds for you. This method is recommended because you may find a set of inputs that predict very well, but have the incorrect thresholds, and therefore the prediction performs poorly.

Note that if you are trading Both Long and Short positions and your thresholds overlap (e.g., Long Exit < 0 and Short Entry < 1), you will not enter into the opposite position until after you have exited the first position (e.g., you won’t go short until you exit your long position first).
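The overlap behavior can be sketched as a small state machine in Python. The Long Exit and Short Entry comparisons follow the example above; the directions assumed for Long Entry (prediction above the threshold) and Short Exit (prediction above the threshold), the one-action-per-bar rule, and all names and values are assumptions for illustration.

```python
def positions(predictions, long_entry, long_exit, short_entry, short_exit):
    """Return the position held after each bar: 1 long, -1 short, 0 flat.

    Assumed rules: enter long when p > long_entry, exit long when
    p < long_exit, enter short when p < short_entry, exit short when
    p > short_exit. At most one action is taken per bar.
    """
    state = 0
    held = []
    for p in predictions:
        if state == 1:
            if p < long_exit:
                state = 0          # exit long; short entry must wait
        elif state == -1:
            if p > short_exit:
                state = 0          # exit short; long entry must wait
        elif p > long_entry:
            state = 1
        elif p < short_entry:
            state = -1
        held.append(state)
    return held


# Overlapping thresholds as in the text: Long Exit < 0 and Short Entry < 1.
signal = [3.0, 0.5, -0.5, -0.5, 3.0]
held = positions(signal, long_entry=2, long_exit=0, short_entry=1, short_exit=2)
```

On the second bar the prediction (0.5) already satisfies Short Entry < 1, but the long position stays open because Long Exit < 0 has not fired. The short position is only opened after the long has been closed.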

The advanced tab can play a very important role in how optimization is accomplished. Since these are professional features, you have some flexibility in determining how the program works with these features. You also have a responsibility to experiment and find out what is right for you; we cannot do more than provide general guidelines.

Realize that during optimization, the genetic algorithm is choosing inputs, parameters, and thresholds (depending upon your selections) and then training and applying a network with each such choice. Hundreds or even thousands of networks might be trained.

Understand too that the Turboprop 2 network paradigm trains by adding hidden neurons one by one, up to a maximum of 80. Zero hidden neurons produce a linear model, and the model becomes more and more non-linear the more hidden neurons are added. Generally speaking, the more hidden neurons added, the “tighter” the fit becomes and the longer the training takes. Previous practitioners of older types of neural nets like the primitive “backpropagation” nets are cautioned not to try to equate Turboprop 2 performance with a certain number of hidden nodes with backpropagation performance using the same number. These are very different algorithms.

The sliding bar used to adjust the number of hidden neurons is labeled Number of hidden nodes during training.

Next, we will discuss the small check box marked Adjust training set for trending markets by evenly distributing training bars. Understand that neural networks are pattern recognition devices: they make predictions based upon what happened during training when they encountered similar patterns. If your training set covered a period of a strong and sustained bull or bear market, the majority of the patterns are likely to be all in one direction. Neural nets aren’t likely to predict the other direction well at all in such a case. Ideally, you’d like to have about half of your training patterns showing up trends, and the other half showing down trends, to varying degrees.

Turning on this option will cause your training sets to be automatically balanced in this way. If you are predicting some output that shows both positive and negative numbers (like percent change), the algorithm will produce a training set having roughly the same number of both signs. If your output is an indicator that always has the same sign (like an indicator that is always between 0 and 100), then the algorithm will select roughly equal numbers of values above and below the mean, based on the data’s distribution.
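One simple way to picture this balancing is to downsample the more common side. The Python sketch below is an assumption made for illustration: the actual balancing algorithm is not described here, and the function name, the center parameter, and the data are hypothetical.

```python
import random


def balance_training_set(rows, targets, center=0.0, seed=0):
    """Keep equal numbers of rows with targets above and below `center`.

    The larger side is randomly downsampled; rows whose target equals
    `center` are dropped. Chronological order is preserved.
    """
    rng = random.Random(seed)
    above = [i for i, t in enumerate(targets) if t > center]
    below = [i for i, t in enumerate(targets) if t < center]
    n = min(len(above), len(below))
    keep = sorted(rng.sample(above, n) + rng.sample(below, n))
    return [rows[i] for i in keep], [targets[i] for i in keep]


# Illustrative bull-market percent changes: six up bars, two down bars.
targets = [1.2, 0.8, 2.1, 0.5, -0.7, 1.9, -1.1, 0.3]
rows = list(range(len(targets)))
balanced_rows, balanced_targets = balance_training_set(rows, targets)
```

For an output that never changes sign, such as an indicator between 0 and 100, the same sketch could be called with center set to the mean of the targets.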

Selecting the Shortest Average Trade Span option causes the optimization to give preference to neural networks with an average trade span greater than or equal to the Shortest Average Trade Span over the back testing period. Use this option to decrease the number of trades if you find that optimization produces too many trades over the optimization period. It is recommended that you choose this option only if you are unable to achieve your goals using other methods.

Selecting the Longest Average Trade Span option causes the optimization to give preference to neural networks with an average trade span less than or equal to the Longest Average Trade Span over the back testing period. Use this option to increase the number of trades if you find that optimization produces too few trades over the optimization period. It is recommended that you choose this option only if you are unable to achieve your goals using other methods.

The average trade span is the average number of bars per trade, taken over all of the trades in the optimization period. A model with an acceptable average trade span may still produce undesirable trades, for example one very long trade and several short trades. Selecting the Shortest and Longest Average Trade Spans is therefore not necessarily the answer to your problems.
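A short Python calculation makes the caveat concrete. The trade lengths are invented for illustration, and average_trade_span is a hypothetical helper.

```python
def average_trade_span(trade_spans):
    """Average number of bars per trade over a list of trade lengths."""
    if not trade_spans:
        raise ValueError("no trades in the period")
    return sum(trade_spans) / len(trade_spans)


# Two hypothetical sets of five trades, with lengths in bars.
lopsided = [46, 1, 1, 1, 1]    # one very long trade, four very short ones
uniform = [10, 10, 10, 10, 10]

lopsided_avg = average_trade_span(lopsided)
uniform_avg = average_trade_span(uniform)
```

Both sets have an average trade span of 10 bars, so a span limit of 10 would treat them the same even though the first set contains four one-bar trades.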

Finally, if you have selected the Input Selection or Full Optimization mode of optimization, you may select the Maximum number of inputs that the GA is allowed to use. The GA may determine that fewer inputs work better. Remember that the more inputs in a network, the greater the chances of “over-fitting”.