What is Training Data?

The training data consists of many sets of input variables, each paired with a corresponding output variable. If you’re familiar with statistics, the inputs are often called independent variables and the output (prediction) is called the dependent variable. Each set of independent variables together with its dependent variable is called an observation, example, or case. In more general terms, when the neural network trains, it uses historical examples to “learn” the patterns in the input variables and how they relate to the output variable (prediction).

The range of historical examples (“training set”) used to train the network should include a representative set of the situations likely to be encountered in the real world. For example, if you want to predict the selling price of a stock, make sure your training set includes historical examples of when the price went up, when it went down, and when it stayed the same.
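One rough way to check that a training set covers all three cases is to count how often the price rose, fell, or stayed flat. The sketch below is only an illustration: the function name direction_counts, the tolerance parameter, and the sample prices are assumptions made for this example, not part of any product.

```python
from collections import Counter


def direction_counts(prices, tolerance=0.0):
    """Count up, down, and flat moves between consecutive prices.

    `tolerance` is a hypothetical threshold: changes whose absolute
    size does not exceed it are counted as "flat".
    """
    counts = Counter({"up": 0, "down": 0, "flat": 0})
    for previous, current in zip(prices, prices[1:]):
        change = current - previous
        if change > tolerance:
            counts["up"] += 1
        elif change < -tolerance:
            counts["down"] += 1
        else:
            counts["flat"] += 1
    return counts


# Made-up closing prices, for illustration only.
sample_counts = direction_counts([10, 11, 11, 9, 12])
```

If one of the three counts is zero or very small, the training set probably does not represent that kind of market behavior well.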

You will want to provide historical examples that are relevant to predicting the current market and avoid historical examples that do not represent current market behavior. The best way to do this is to limit the amount of past history upon which the neural network trains (“learns”). This means limiting the number of observations to between 300 and 2000, so that the neural network learns from data “relevant” to today’s market.
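As a minimal sketch of this idea, the function below keeps only the newest observations from a history that is ordered oldest first. The name most_recent and its parameters are hypothetical; the 300 and 2000 defaults simply mirror the range given above.

```python
def most_recent(observations, max_count=2000, min_count=300):
    """Return the newest `max_count` observations.

    Assumes `observations` is ordered oldest first. Raises ValueError
    when fewer than `min_count` observations are available.
    """
    if not 1 <= min_count <= max_count:
        raise ValueError("expected 1 <= min_count <= max_count")
    if len(observations) < min_count:
        raise ValueError(
            f"only {len(observations)} observations; need at least {min_count}"
        )
    # A negative slice start keeps the tail (the most recent entries).
    return observations[-max_count:]


# Stand-in history of 5000 observations, numbered oldest (0) to newest (4999).
history = list(range(5000))
training_window = most_recent(history)
```

Here training_window holds the last 2000 entries of history, so the older 3000 observations are left out of training.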

Additionally, for each observation, all the inputs must exist for the neural network to use that observation to “learn”. If you are trying to predict a U.S. security using an input that comes from a foreign market, and that market has a holiday on a day for which you expect an observation, then that observation will be missing data. The neural network will therefore be unable to use this observation for “learning” or for predicting its value.
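The effect of a missing input can be sketched as a simple filter that drops any observation with a gap. Everything in this example is assumed for illustration: the tuple layout (date, inputs, output), the function names, and the sample values, where None marks a foreign-market holiday.

```python
import math


def is_complete(inputs):
    """Return True only if every input value is present (not None or NaN)."""
    return all(
        value is not None
        and not (isinstance(value, float) and math.isnan(value))
        for value in inputs
    )


def usable_observations(observations):
    """Keep only observations whose inputs are all present."""
    return [obs for obs in observations if is_complete(obs[1])]


# Made-up observations: (date, [U.S. input, foreign input], output).
raw = [
    ("2024-01-02", [101.5, 33.2], 102.0),
    ("2024-01-03", [102.0, None], 101.2),  # foreign market closed
    ("2024-01-04", [101.2, 33.9], 103.1),
]
usable = usable_observations(raw)
```

In this sample, the observation for 2024-01-03 is dropped, leaving two usable observations.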

Finally, because of inflation and an overall rise in most markets, normalizing inputs and outputs is extremely important. For more information on normalizing data, refer to the Normalizing Variables topic in Neural Network – Output Discussion.
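To illustrate why normalization helps, the sketch below converts raw prices into fractional changes from one observation to the next, which removes the overall price level. This is one common approach chosen here as an assumption; it is not necessarily the method described in the Normalizing Variables topic, and the function name is hypothetical.

```python
def fractional_changes(prices):
    """Convert a price series into changes relative to the previous price.

    A result of 0.10 means a 10% rise. The output has one fewer element
    than the input, and every price used as a divisor must be non-zero.
    """
    return [
        (current - previous) / previous
        for previous, current in zip(prices, prices[1:])
    ]


# Two made-up series at very different price levels.
older_market = fractional_changes([100.0, 110.0, 99.0])
recent_market = fractional_changes([1000.0, 1100.0, 990.0])
```

Although the second series trades at ten times the price of the first, both produce the same sequence of changes (a 10% rise followed by a 10% fall), so the network sees comparable values regardless of how far the market has risen over time.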

The results that you achieve will only be as good as the training data (inputs and outputs) that you select. For more information on how to choose your training data refer to Neural Network – Output Discussion and Neural Network – Input Discussion.
 
