January 12, 2015
Differences between Neural Nets and ChaosHunter Models
Unlike neural network trading models that predict an output such as percent change in open, ChaosHunter is NOT looking for a formula that tries to match some output in a trading model. It is looking for the formula that makes the most money when trading rules are applied to the values the formula produces. The “Output” that you choose is supposed to be a price time series that ChaosHunter will use to determine fill prices, i.e., the prices you get when you buy and sell.
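To make the distinction concrete, here is a minimal Python sketch of scoring a formula by trading profit rather than by how closely it matches a target. It is only an illustration: the function name `trading_profit`, the long-only rule, and the `buy_above` and `sell_below` thresholds are assumptions of ours, not ChaosHunter's actual rules or fill logic.

```python
def trading_profit(formula_values, fill_prices, buy_above=0.0, sell_below=0.0):
    """Score a formula by the profit its signals produce (hypothetical rule).

    formula_values: one value per bar, produced by the candidate formula.
    fill_prices:    the chosen "Output" price series, used only for fills.
    Rule assumed here: buy when the value rises above buy_above,
    sell when it falls below sell_below. A position still open at the
    end is ignored in this sketch.
    """
    if len(formula_values) != len(fill_prices):
        raise ValueError("formula_values and fill_prices must be the same length")

    entry_price = None   # None means we are flat (no open position)
    profit = 0.0
    round_trips = 0
    for value, price in zip(formula_values, fill_prices):
        if entry_price is None and value > buy_above:
            entry_price = price              # buy at this bar's fill price
        elif entry_price is not None and value < sell_below:
            profit += price - entry_price    # sell at this bar's fill price
            entry_price = None
            round_trips += 1
    return profit, round_trips


# Made-up numbers purely for illustration.
values = [-0.4, 0.7, 0.2, -0.3, 0.5]
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
result = trading_profit(values, prices)  # (2.0, 1): buy at 11.0, sell at 13.0
```

Note that the formula values are never compared with the prices. The prices only determine what each buy and sell is worth, which is the role the “Output” series plays in the description above.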
Once the inputs, an output, and functions are selected, ChaosHunter begins a selection process that results in an equation that generates trading signals. Because ChaosHunter produces equations that are more meaningful to humans, those equations sometimes do not match the existing data as closely as neural networks do. However, ChaosHunter often finds underlying relationships that extrapolate better into unseen “out-of-sample” data. In other words, its models are often more robust.
In NeuroShell Trader, one solution for overfitting is to reduce the number of “free variables” or inputs to the model, so that the model does not learn the “noise” instead of the underlying patterns. Neural nets typically try to use, in some way, all of the information they are given. ChaosHunter does not suffer anywhere near as much from the problem of free variables, because it naturally tends to keep only a few variables, even if you have presented it with too many inputs. So in ChaosHunter that is less of a worry, but in general we would give ChaosHunter only what we feel are the most pertinent indicators, so it doesn’t waste time sifting through the chaff to get to the wheat. In our opinion, ChaosHunter also does not suffer from the problem of overoptimization, provided that you have enough data as described below.
Selecting Data Sets
1. A “diverse set” is one that has many different types of patterns, so that the formula ChaosHunter produces will have considered many situations that could occur in the future. We would then call the formula “robust”. A large set will increase the probability of capturing diverse patterns, but will not guarantee it. So you should manually examine the price curves in your set to make sure they show plenty of rising, falling, and volatile patterns.
2. A “relevant set” is one that contains the patterns that are most likely to occur in the future, as opposed to ancient patterns no longer likely to recur. In the mid-nineties you could have purchased almost any Internet stock and made a fortune, but very few would advise that pattern of buying today.
If you are building intra-day models, make a similar analysis without going back so far in time. You might also want to consider building models with different bar sizes. A 10 or 15 minute bar may eliminate some of the noise found in a 5 minute bar, for example.
At the end of the day we cannot give you an exact cookbook for choosing your data periods because, like all of trading, it is more an art than a science. Experiment and come to your own conclusions about what is best for the stocks or other issues you are dealing with. Keep in mind that not every issue is always predictable (has repeating patterns). Sometimes stocks and markets change based on news, and totally new patterns will appear.
For more information on choosing data sets, look for the topic “Selecting in-sample and out-of-sample periods – Commentary by Steve Ward” that appears in the Changes in documentation section in both the NeuroShell Trader and ChaosHunter sections of www.ward.net.
Evaluate Your Model on Out-of-Sample Data
Because neural nets and ChaosHunter are sophisticated modeling tools, it is often easy to get excited when you see profits generated by your models.
However, before you order that new sports car, make sure your model holds up on out-of-sample data (data not included in building the model). In NeuroShell Trader, both the prediction and trading strategy wizards include a dates tab that allows you to separate your data into an optimization (training) set and a trading (out-of-sample) set. ChaosHunter gives you the same option to split data sets on the datagrid. Simply click on the “Select Ranges” button. You can either select row numbers or use a graph of the data to specify dates. During optimization, ChaosHunter lets you look at the trading signals graphed on both the training and out-of-sample data sets, so you can select a model that performs well on both. In either program, look for a model that produces buy/sell signals on peaks and valleys in the data and generates enough trading signals to give you confidence that the profit figures are not a fluke based on a few trades.
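The idea of holding back data and then checking both profit and trade count can be sketched in a few lines of Python. This is a rough illustration under our own assumptions: `split_by_row`, `holds_up`, and especially the `min_trades` default of 20 are hypothetical names and values, not settings from either program. How many trades are “enough” is a judgment call.

```python
def split_by_row(series, split_row):
    """Split a series into optimization rows and held-out rows.

    Rows before split_row are used to build the model; rows from
    split_row onward are kept aside as out-of-sample data.
    """
    if not 0 < split_row < len(series):
        raise ValueError("split_row must leave data on both sides of the split")
    return series[:split_row], series[split_row:]


def holds_up(in_sample_profit, out_of_sample_profit, out_of_sample_trades,
             min_trades=20):
    """Crude check that a model is profitable on both sets with enough trades.

    min_trades=20 is an arbitrary assumption for illustration only.
    """
    profitable_on_both = in_sample_profit > 0 and out_of_sample_profit > 0
    enough_trades = out_of_sample_trades >= min_trades
    return profitable_on_both and enough_trades


# Made-up closing prices; the last three rows are held out.
closes = [10.0, 10.5, 10.2, 11.0, 11.4, 11.1, 12.0, 12.3, 11.9, 12.8]
training, out_of_sample = split_by_row(closes, 7)
```

A model that shows a profit on the held-out rows from only two or three trades would fail this check, which mirrors the warning above about profit figures that rest on a few trades.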