At least once a week, a Google Scholar Alert pops into my inbox featuring another bored academic who thought they’d try using their “expertise” to “predict the stock market”. Said paper tends to follow this structure:
- Starts with a paragraph about how humans are obsolete and that the market is being run by computers.
- Next comes the ‘novel’ machine learning technique, which is really just a well-known algorithm (SVM, neural net, logistic regression) with an adaptive learning rate.
- Some (usually 5) simple technical analysis indicators are used on unprocessed daily price data to create some features for the new super-algo.
- Said algorithm is applied to predict whether price(t+1) > price(t) or vice versa.
- Results report prediction accuracy of ~65% (and we know they tried 100 different stocks before settling on the 3 that they reported results for).
- There are usually no out-of-sample results at all!
Some will stop here, reporting that they’ve cracked it and that their technique beats all others. If we’re lucky, however, they’ll go on to assume that they can trade at the close price of each day, buying when they predict upward movement and selling before a fall with a constant size strategy, reporting annual returns of 40%.
Putting aside the lack of out-of-sample results and the nonexistent estimation of transaction costs, they’re really missing the point. Attempting to predict the price return over the next 24 hours is not the best way to go about using machine learning to create investment opportunities.
Predicting the direction of price movement is certainly an essential part of automated trading models. But it’s not the only part. A simple automated trading model is better based around event detection: bull, bear or neutral markets, insider trading or institutional trading. Detecting such events can easily be approached with either binary classification or hypothesis testing.
In binary classification, a feature matrix, X of size N×K, is used to predict a binary vector, Y of size N. In this case we would interpret each row as representing one day with K features that we hope relate in some way to the occurrence of the market event that we are trying to predict, Y. Once in this format, binary classification is a simple matter of applying one of the many machine learning algorithms to map each row, Xi, to Yi. If our out-of-sample predictions are consistent, we have a means of quickly identifying regimes and adjusting our trading strategy accordingly. However, given that this kind of binary classification is a standard machine learning problem, it suffers from all the usual pitfalls of overfitting.
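To make the setup concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data is synthetic, the toy "event" label is invented, logistic regression simply stands in for whichever classifier you prefer, and the helper names (`fit_logistic`, `predict`, `accuracy`) are mine. The part worth copying is the chronological split: fit on the earlier rows, score only on the later ones.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Return a 0/1 label per row, thresholding the predicted probability at 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic stand-in for N days of K features (assumption: not real market data).
rng = random.Random(0)
N, K = 400, 3
X = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(N)]
# Toy event label driven by the first two features plus noise.
y = [1 if 1.5 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.5) > 0 else 0 for x in X]

# Chronological split: never shuffle, the test rows must come after the training rows.
split = int(0.7 * N)
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

w, b = fit_logistic(X_train, y_train)
in_sample = accuracy(y_train, predict(X_train, w, b))
out_of_sample = accuracy(y_test, predict(X_test, w, b))
print(f"in-sample accuracy:     {in_sample:.3f}")
print(f"out-of-sample accuracy: {out_of_sample:.3f}")
```

A large gap between the two printed numbers is the overfitting warning sign. On real data you would expect the out-of-sample figure to be far less flattering than it is on this toy example.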
For hypothesis testing, we make an assumption about the distribution of an observable variable before applying a test for low-probability events. As an example, let’s say we’re into high-frequency volatility trading and we’re looking for unusual activity in a stock. We can describe the number of orders arriving per second with a Poisson distribution, X, and the individual order volume * price with a Gamma distribution, Y. Now we can fit X and Y through maximum likelihood estimation and calculate the probability, p, of the market acting “as normal” each second. Taking p<0.05 as a rare event, if we see more than 5 such events in a 100-second period we might be tempted to make a move. It’s important to bear in mind, however, that hypothesis testing assumes structure in the data and thus requires a stationary distribution for decent results.
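Here is a rough sketch of the order-arrival half of that idea, again in plain Python and again on synthetic data. The assumptions are mine: I only model the Poisson order counts (the Gamma fit for order value is left out because it has no closed-form maximum likelihood solution), I read "acting as normal" as an upper-tail probability P(X >= k), and the rates 4 and 12 orders per second are made up. The helper names are hypothetical too.

```python
import math
import random


def fit_poisson(counts):
    """The maximum likelihood estimate of a Poisson rate is the sample mean."""
    return sum(counts) / len(counts)


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def count_rare_seconds(window, lam, alpha=0.05):
    """Number of seconds whose order count has tail probability below alpha."""
    return sum(1 for k in window if poisson_upper_tail(k, lam) < alpha)


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small rates."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


rng = random.Random(1)

# Calibration period: 1000 "normal" seconds at an assumed 4 orders per second.
calibration = [sample_poisson(4.0, rng) for _ in range(1000)]
lam_hat = fit_poisson(calibration)

# 100-second window under test: 80 normal seconds, then a 20-second burst.
window = ([sample_poisson(4.0, rng) for _ in range(80)]
          + [sample_poisson(12.0, rng) for _ in range(20)])

rare = count_rare_seconds(window, lam_hat)
alert = rare > 5  # the "more than 5 rare events in 100 seconds" rule from the text
print(f"fitted rate: {lam_hat:.2f} orders/s, rare seconds: {rare}, alert: {alert}")
```

Note that the fitted rate comes from a separate calibration period. If the underlying rate drifts during the day, that estimate goes stale and the alert fires for the wrong reasons, which is exactly the stationarity caveat.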
These kinds of techniques produce far more stable predictions than forecasting daily returns and provide us with information upon which it is much easier to trade.