Deep Learning on AWS GPU instances with Python and TensorFlow – Part 1

This multipart tutorial will show you how to:

  • Launch a GPU instance on AWS, SSH into it and set up TensorFlow.

I will assume that you have an AWS user account with admin rights and have downloaded the accessKeys.csv file.

Setting up AWS on your local machine

First, install the AWS command line tool using Python’s pip installer (for other options, see here).

 sudo pip3 install awscli 

Now you should be able to use the aws command:

$ aws
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: too few arguments

In order to use the AWS account, we have to provide the right credentials. We can do so by running aws configure and entering the details from your accessKeys.csv file:

 $ aws configure
 AWS Access Key ID: <access_key_id>
 AWS Secret Access Key: <secret_access_key>
 Default region name [us-east-1]: us-east-1
 Default output format [None]: <ENTER>

Your config will be stored in ~/.aws
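As a quick sanity check, here is a minimal Python sketch that lists which profiles in a credentials file carry both key fields. It assumes the INI layout that aws configure writes to ~/.aws/credentials; the function name profiles_with_keys is my own and not part of any AWS tool.

```python
import configparser


def profiles_with_keys(credentials_text):
    """Return the profile names that define both an access key ID and a secret key.

    `credentials_text` is the content of an AWS credentials file (INI format).
    """
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    required = ("aws_access_key_id", "aws_secret_access_key")
    return [
        name
        for name in parser.sections()
        if all(parser.has_option(name, key) for key in required)
    ]
```

To check your own setup, read ~/.aws/credentials into a string and pass it in; a working default setup should give ['default'].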

You can test whether your user has admin rights by running:

$ aws ec2 describe-instances --output table
-------------------
|DescribeInstances|
+-----------------+

If successful, you should see the above. Now create a security group ‘my-sg’ and authorise inbound SSH access on port 22:

$ aws ec2 create-security-group --group-name my-sg --description "My security group"

$ aws ec2 authorize-security-group-ingress --group-name my-sg \
  --protocol tcp --port 22 --cidr 0.0.0.0/0

Next, create the SSH key pair and save the private key to ~/.aws/my_aws_key.pem:

$ aws ec2 create-key-pair --key-name my_aws_key \
  --query 'KeyMaterial' --output text > ~/.aws/my_aws_key.pem

  chmod 400 ~/.aws/my_aws_key.pem

Now we are ready to launch our EC2 instance 🙂

Launch Ubuntu 14.04 GPU instance

I like to use a basic Ubuntu 14.04 instance (image-id = ami-fce3c696)

$ aws ec2 run-instances --image-id ami-fce3c696 \
  --count 1 --instance-type g2.2xlarge \
  --key-name my_aws_key --security-groups my-sg

Assuming all has gone well, you should now have an instance up and running!

Increase the size of the file system

The first time I did this, I ran into issues with not having enough space on the file system to install everything I needed. Follow the steps below to resize the file system.

  1. From aws console, stop the instance
  2. From aws console, detach the volume (though note the mount point under attachment info, eg /dev/sda1)
  3. From aws console, take a snapshot of the volume
  4. From aws console, create a new volume using the snapshot (for my g2.2xlarge I went with 800GB)
  5. From aws console, attach the new volume to original mount point /dev/sda1
  6. From aws console, restart the instance

SSH in to the Instance

The command for getting the IP of all running instances is a little clunky, so it can be useful to create an alias in the ~/.bashrc file:

alias aws_get_ip='aws ec2 describe-instances --query "Reservations[*].Instances[*].PublicIpAddress" --output=text'

Now, we can ssh in to the instance like so:

ssh -i ~/.aws/my_aws_key.pem ubuntu@$(aws_get_ip)
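If you would rather script this step, the lookup performed by the aws_get_ip alias can be sketched in Python. This is only an illustration: it assumes you feed it the JSON printed by aws ec2 describe-instances --output json, and the helper names public_ips and ssh_command are hypothetical. The ubuntu login name is the default user on Ubuntu images.

```python
import json
import os


def public_ips(describe_instances_json):
    """Extract public IPs from the JSON text printed by `aws ec2 describe-instances`."""
    data = json.loads(describe_instances_json)
    return [
        instance["PublicIpAddress"]
        for reservation in data.get("Reservations", [])
        for instance in reservation.get("Instances", [])
        if "PublicIpAddress" in instance  # skip instances without a public address
    ]


def ssh_command(ip, key_path="~/.aws/my_aws_key.pem", user="ubuntu"):
    """Build the argument list for ssh; suitable for subprocess.run()."""
    return ["ssh", "-i", os.path.expanduser(key_path), f"{user}@{ip}"]
```

With several instances running you get several addresses back, so pick the one you want before building the ssh command.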

Install CUDA 7.5

sudo apt-get update && sudo apt-get -y upgrade
sudo apt-get -y install linux-headers-$(uname -r) linux-image-extra-`uname -r`
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
rm cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install -y cuda

You should now reboot your machine

sudo reboot

Install cuDNN v5.1

Register and download cuDNN from here (the commands below use the v5.1 archive for CUDA 7.5). You can then upload it to a file-sharing service such as Dropbox and fetch it on the instance via the shared link:

wget https://www.dropbox.com/s/.../cudnn-7.5-linux-x64-v5.1.tgz
tar xvzf cudnn-7.5-linux-x64-v5.1.tgz
rm cudnn-7.5-linux-x64-v5.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include # move library files to /usr/local/cuda
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
rm -rf ~/cuda

Install Anaconda with python 3.5

wget http://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
bash Anaconda3-4.2.0-Linux-x86_64.sh -b -p ~/bin/anaconda3
rm Anaconda3-4.2.0-Linux-x86_64.sh
echo 'export PATH="$HOME/bin/anaconda3/bin:$PATH"' >> ~/.bashrc

Set up TensorFlow

pip install tensorflow

Run an MNIST classifier and monitor the system usage

To finish the installation process, let’s run an MNIST classifier and monitor the system usage.

First, install the required packages:

sudo apt-get install htop
pip install gpustat
sudo nvidia-smi daemon  ## run daemon to make monitoring faster

Now start byobu (terminal multiplexer, similar to tmux or GNU screen):

byobu

Next, press Ctrl-F2 to split the window vertically and run htop:

htop

Press Shift-F2 to split the window horizontally and run the continuous GPU monitor gpustat:

watch --color -n1.0 gpustat -cp

With Shift-<left> move to the left panel, download the MNIST classification script and execute it:

wget https://raw.githubusercontent.com/tensorflow/models/master/tutorials/image/mnist/convolutional.py
python convolutional.py
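If you want to log GPU usage rather than just watch it, here is a small Python sketch. It assumes you collect the text printed by nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits (one line per GPU); the function name parse_gpu_stats and the dictionary keys are my own choices.

```python
def parse_gpu_stats(csv_text):
    """Parse 'util, mem_used, mem_total' lines (one per GPU) into dictionaries."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (float(field) for field in line.split(","))
        stats.append(
            {"util_pct": util, "mem_used_mib": used, "mem_total_mib": total}
        )
    return stats
```

You could feed it the output of subprocess.check_output(...) once a second while convolutional.py is training and write the result to a file.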

Predicting Stock Prices with Machine Learning – Part 1 – Introduction

This post is the first in a series that will explore the forecasting of stock prices using machine learning (ML) methods (for a quick intro to ML see my previous post). If you’ve spent any reasonable amount of time with me then you’ll know that I tend not to talk too kindly of papers that attempt to forecast equity prices using machine learning. However, here we are simply using equity price time-series as a great example of a non-stationary process.

Price prediction or, more generally, models for generating alpha are not the best use of ML in the quantitative trading process. Far from it. At some point I’ll dedicate a post solely to describing the architecture of a quantitative trading system but for now let’s just say that portfolio optimisation, the mixture of prediction models and algorithmic execution are tasks better suited to ML.

All the same, it is possible to forecast long term price movement with ML. This series of posts will be based on work for a paper that I published earlier this year called Automated trading with performance weighted random forests and seasonality where I demonstrated the power of the online generation of ML models and suggested a novel and highly successful way to combine the predictions of multiple models.

Before we get to the nitty-gritty of combining model outputs, we first need to cover some housekeeping essentials. Initially, we’ll look at the input data and how we turn this into useful features for our model. Without data, we’re nothing so this step is arguably our most important. Next we’ll go on to look at how to measure the performance of our prediction systems including a number of important and all-too-often forgotten metrics for understanding the long term success (or not) of our model. Finally we’ll get to the fun stuff and begin to train some ML models. We’ll start simple and add layers of complexity with associated justifications along the way.

I’ll leave it at that for now. Below is a list of the posts to come and I’ll hyperlink the items as I get them written:

  • Part 2 – Data and features
  • Part 3 – Performance Metrics
  • Part 4 – Standard Methods
  • Part 5 – Ensemble Methods
  • Part 6 – Incorporating “online” performance weighting
  • Part 7 – Summary

What’s in store…

Firstly, apologies. It’s been a long time since I last posted but it’s been a hectic few months. This summer saw me moving house, changing jobs, writing up my PhD thesis and getting burgled so it’s been an interesting one to say the least.

I can’t deny that getting the thesis submitted was a struggle. Consolidating four years of research and experimentation that spanned a veritable plethora of academic disciplines* into a single book that no one will read was a mind mangling task that required hard-to-muster motivation. Chaos aside, I came across a great number of things that I plan to post about in the coming weeks and that’s exactly the subject of this post.

The next series of posts will concern the use of machine learning methods for stock picking. I know, I know, you don’t have to tell me that’s a bad idea! You may have even read one of my previous posts that discouraged exactly this sort of tomfoolery. However, the objective of this work was not to build a Skynet type device that provides magical insight into the equity markets. As you will see, this research uses equity price data to explore the best ways to produce stable predictions in non-stationary time-series using only simple modifications to well-documented machine learning methodologies.

After exploring prediction of non-stationary processes we’ll get a little more application specific and explore the use of these techniques for forecasting price impact of large equity orders using depth-of-book data but more on this when the time comes. For now, thanks for bearing with me and I hope you enjoy what’s to come.

* Artificial intelligence, machine learning, mathematical finance, agent-based modelling and complexity theory were the major players.

What is machine learning? A simple introduction

Even amongst practitioners, there is no truly well accepted definition for machine learning. So, I’ll provide two:

  • Pioneer machine learning researcher Arthur Samuel defined machine learning as: “the field of study that gives computers the ability to learn without being explicitly programmed”. This definition is beautiful in its simplicity though lacks a little formality. So, with a little more structure…
  • Tom Mitchell states that “a computer program is said to learn from experience E, with respect to some task T, and some performance measure P, if its performance on T as measured by P improves with experience E”.

Let’s reinforce the definitions with an example. A classic practical application is the email spam filter. The email program watches which emails the user does or does not mark as spam and, based on that, learns how to better filter future spam automatically. In the parlance of Tom Mitchell’s definition, classifying the emails as spam or not spam is the task, T, watching the user label emails as spam or not spam is the experience, E, and the fraction of emails correctly classified could be the performance measure, P.
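To make the performance measure P concrete, here is a minimal Python sketch of "fraction of emails correctly classified". The function name accuracy and the example labels are illustrative only.

```python
def accuracy(predicted, actual):
    """Fraction of items whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one labelled example")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

For example, if the filter flags four emails as spam, not spam, spam, spam and the user's labels were spam, not spam, not spam, spam, then three of four agree and P is 0.75.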

There are a great number of machine learning algorithms and, as such, they are often divided into three main types: supervised, unsupervised and reinforcement learning algorithms.

Supervised Learning – machine learning with labels

Before providing a definition, let’s start with an example. Imagine you want to predict the stopping distance for cars given the speed that the car is travelling. The graph below shows some data from the “cars” dataset in R.

distance vs speed

A supervised learning algorithm would allow us to use the data available to make a general rule for making predictions about future distances for speeds that we have not yet witnessed. In our two-variable example, this is the well-known task of fitting a line to the data. Eyeballing the data, we could conceivably fit a linear model (red) or a polynomial model (blue) shown below.

distance vs speed (linear fit) / distance vs speed (polynomial fit)

Fitting these models is an example of a supervised learning algorithm. The term supervised learning refers to the fact that the algorithm requires a training dataset that contains the “right” answers. That is to say, in our example, for every datapoint on the car’s speed, we also had the corresponding data for the actual stopping distance.

The cars example is also a case of a regression problem, where we are predicting a continuous valued output (the distance).
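For the curious, fitting the linear model comes down to ordinary least squares. Below is a minimal pure-Python sketch; the function name fit_line is my own, and any numbers you try it with in this post's context are made up for illustration rather than taken from the cars dataset.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Once you have the slope and intercept, predicting the distance for an unseen speed is just slope * speed + intercept, which is exactly the "general rule" that supervised learning is after.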

Another type of supervised learning task is classification. Again, let’s set the scene with an example.

class

The figure above shows data from the well-known iris database. It shows a scatter plot of the sepal length vs. petal length of a number of iris plants. The points are coloured by species. Here, the machine learning task is to predict the species given new petal and sepal measurements. Which species would you label the new data-point in black? What makes this a classification task is that the variable to be predicted (species) is discrete valued.
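One of the simplest ways to answer that question is to copy the label of the closest known point (a nearest-neighbour classifier). Here is a minimal sketch of that idea; it is just one of many possible classifiers, the function name is hypothetical, and it needs Python 3.8+ for math.dist.

```python
import math


def nearest_neighbour(training_points, query):
    """Return the label of the training point closest to `query`.

    `training_points` is a list of ((sepal_length, petal_length), species) pairs.
    """
    _, label = min(training_points, key=lambda item: math.dist(item[0], query))
    return label
```

The output is one of a fixed set of species names rather than a number on a continuous scale, which is what separates classification from regression.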

Unsupervised Learning – machine learning without labels

With unsupervised learning, the data contain no labels and the machine learning algorithm is tasked with finding structure in the data.

One very common type of unsupervised learning is known as clustering and is used for the categorisation of stories on Google News. Each day, Google’s algorithms crawl the web for news stories and use clustering algorithms to group similar stories together. Other pertinent examples of clustering include: organising computer clusters, social network analysis and market segmentation.

There are a number of other unsupervised learning algorithms and indeed a number of other types of machine learning that we have not touched upon in this post. If you’ve found this page interesting and have been inspired to learn more, I recommend the following books:

Also, for some great advice on the practical application of machine learning methods as well as a detailed derivation of some common algorithms, I strongly recommend Andrew Ng’s Coursera course.

Streamlining your LaTeX workflow with latexmk

It’s been a *long* time since I last wrote a post and that’s mainly because I’ve been frantically writing my thesis.

For scientific writing I’m a big fan of LaTeX for a number of reasons: dealing with and formatting mathematical notation is a joy, the implicit handling of intra-document references and bibliography makes life a lot easier and the separation of content and formatting helps me better focus on my writing.

That said, I do like to preview changes that I make and I’ve always found the workflow a little clunky. Each time, I had to save the tex file, head to the terminal and enter:

pdflatex mytexfile
bibtex mytexfile
pdflatex mytexfile
pdflatex mytexfile

Only then could I open up a PDF viewer to take a look at the result. It’s tedious but fear not, I’ve since grown tired and (like a good little computer scientist) managed to automate the process using a tool called latexmk to dramatically improve my LaTeX workflow.

Latexmk

Effectively what latexmk does is watch your tex source file for changes and then run whatever it needs to in order to update your PDF automatically. Now, I simply open up a tex file and start latexmk in a Terminal window. Each time I save the source file, latexmk automatically runs in the background and opens (or simply updates) the PDF. This way, I never have to leave my tex editor nor manually run latex at all. I just save the file. On Mac OS X, I can even scroll through the PDF without removing focus from the tex editor.
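To give a feel for the "watch and rebuild" idea, here is a deliberately simplified Python sketch that decides whether a PDF is out of date by comparing file modification times. This is my own illustration of the concept, not how latexmk is implemented; its real change detection is more thorough than a timestamp check, and the function name needs_rebuild is hypothetical.

```python
import os


def needs_rebuild(source_path, output_path):
    """Return True if the output is missing or older than the source file."""
    if not os.path.exists(output_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(output_path)
```

A watcher would call this in a loop with a short sleep and kick off the pdflatex/bibtex sequence whenever it returns True.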

Getting set up

I’ve explained how I set things up below but beware that this is specific to Mac OS X.

  • Download and install latexmk (but be aware it may have been included with your Tex distribution – run ‘which latexmk’ in the terminal to check).
  • Next, create a latexmk config file in your home directory, ~/.latexmkrc, and add the following lines:
    $pdf_previewer = "open -a /Applications/Skim.app";
    $clean_ext = "paux lox pdfsync out";
    

    Obviously, you can use any PDF viewer but I strongly recommend skim.

  • If, like me,  you mostly use latexmk with pdf files, you can add the following to your ~/.bash_profile:
    alias latexmk='latexmk.pl -pdf -pvc'
    

    You can always run latexmk.pl with other options if you need to.

Once you’re done, just ‘cd’ to the directory containing your latex source file, and run “latexmk myfile.tex”. Now you just leave it running while you work and it’ll take care of things automatically! Job done.

 

Limit Order Books – An introduction

This is an extract from a draft version of my PhD thesis:

“…For many years, the majority of the world’s financial markets have been driven by a style of auction, very similar to the basic process of haggling, known as the continuous double auction (CDA). In a CDA a seller may announce an offer or accept a bid at any time and a buyer may announce a bid or accept an offer at any time. This continuous and asynchronous process does away with any need for a centralised auctioneer, but does need a system for recording bids and offers and clearing trades. In modern financial markets, this function is performed by a uniform trading protocol known as the limit order book (LOB), whose universal adoption was a major factor in the transformation of financial exchanges.

The most common type of order submitted to a LOB is the limit order – an instruction to buy or sell a given quantity of an asset, that specifies a limit (worst acceptable) price which cannot be surpassed. Upon receiving a limit order, the exchange’s matching engine compares the order’s price and quantity with opposing orders from the book. If there is a book order that matches the incoming order then a trade is executed. The new order is termed aggressive (or marketable) because it initiated the trade, while the existing order from the book is deemed passive. If, on the other hand, there are no matches for the incoming order it is placed in the book along with the other unmatched orders, waiting for an opposing aggressive order to arrive (or until it is cancelled). A visualisation of the structure and mechanism of a LOB is given in Figure 1.1.

limit order book

Figure 1.1: An illustration of LOB structure and dynamics.

The details of order matching vary across exchanges and asset classes. However, most modern equity markets operate using a price-time priority protocol. That is, the lowest offers and highest bids are considered first, while orders of the same price are differentiated by the time they arrive (with priority given to orders that arrive first). Thus, limit orders with identical prices form first-in first-out (FIFO) queues.
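The matching and price-time priority rules described above can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions (limit orders only, no cancellations, trades execute at the resting order's price); the class and method names are hypothetical and do not correspond to any exchange's API.

```python
import heapq
from itertools import count


class LimitOrderBook:
    """Toy limit order book with price-time priority."""

    def __init__(self):
        self._bids = []  # heap entries: (-price, arrival, price, qty)
        self._asks = []  # heap entries: (price, arrival, price, qty)
        self._arrival = count()  # increasing counter gives time priority

    def submit(self, side, price, qty):
        """Match an incoming limit order; return a list of (price, qty) trades."""
        is_buy = side == "buy"
        own, opposing = (self._bids, self._asks) if is_buy else (self._asks, self._bids)
        trades = []
        while qty > 0 and opposing:
            key, arrival, rest_price, rest_qty = opposing[0]  # best opposing order
            crosses = price >= rest_price if is_buy else price <= rest_price
            if not crosses:
                break
            fill = min(qty, rest_qty)
            trades.append((rest_price, fill))
            qty -= fill
            if fill == rest_qty:
                heapq.heappop(opposing)  # passive order fully filled
            else:
                # Partial fill: same key and arrival, so heap order is unchanged.
                opposing[0] = (key, arrival, rest_price, rest_qty - fill)
        if qty > 0:
            # Unmatched remainder rests in the book as a passive order.
            key = -price if is_buy else price
            heapq.heappush(own, (key, next(self._arrival), price, qty))
        return trades
```

As a usage example: after resting sell orders of 10 at 101 and 5 at 100, an incoming buy for 8 at 101 trades 5 at 100 first (best price) and then 3 at 101, leaving 7 on offer at 101.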

Most LOB-driven exchanges offer many more order types than the simple limit order. Another particularly common order type is the market order, which ensures a trade executes immediately at the best available price for a given quantity. As a result, market orders demand liquidity and risk uncertainty. Many more order types are available that allow control over whether an order may be partially filled, when an order should become active and how visible the order is. Such order types include: conditional orders, hybrid orders, iceberg orders, stop orders and pegged orders, but the intricacies of these order types are beyond the scope of this report.

Traders interact with a screen-based LOB that summarises all of the “live” (outstanding) bids and offers that have not yet been cancelled or matched. The LOB has two sides: the ask book and the bid book. The ask book contains the prices of all outstanding asks, along with the quantity available at each price level, in ascending order. The bid book, on the other hand, shows the corresponding information for bids but in descending order; this way traders see the “best” prices at the top of both books. A simplified example of what a trader may see when looking at a LOB is given below.

lob_example

A simplified example of how a trader would view the LOB shown in Figure 1.1.

The amount of information available about the LOB at any given time depends on the needs and resources of the traders. Usually the only information that is publicly available (in real time) is the last traded price or the mid-price (the point between the current best prices). Professional traders may choose to subscribe to receive information on the price and size for the best prices, along with the price and size of the last recorded transaction, of an asset of interest; this is known as “level 1” market data. The most detailed data, “level 2” or “market depth” data, includes the complete contents of the book (except for certain types of hidden orders) but this comes at a premium. For individual subscribers, the current cost of receiving real time level 2 data for equities from just the New York Stock Exchange (NYSE) is $5000/month.
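As an illustration of the two-sided view and the mid-price described above, here is a minimal Python sketch that aggregates outstanding orders into price levels. The function names and the (side, price, quantity) order format are assumptions made for this example.

```python
from collections import defaultdict


def book_view(orders):
    """Aggregate (side, price, qty) orders into sorted price levels.

    Returns (bids, asks): bids in descending price order, asks in ascending
    order, so the best price sits at the top of each list.
    """
    levels = {"bid": defaultdict(int), "ask": defaultdict(int)}
    for side, price, qty in orders:
        levels[side][price] += qty
    bids = sorted(levels["bid"].items(), reverse=True)
    asks = sorted(levels["ask"].items())
    return bids, asks


def mid_price(bids, asks):
    """Midpoint between the best bid and the best ask."""
    return (bids[0][0] + asks[0][0]) / 2
```

The first entry of each list corresponds to the best bid and ask, and the full lists to the market depth on each side.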

At first glance, the rules of limit order trading seem simple but trading in a LOB is a highly complex optimisation problem. Traders may submit buy and/or sell orders at different times, prices, quantities and – in today’s highly fragmented markets – often to multiple order books. Orders may also be modified or cancelled at any time. The complexity of LOB strategies presents significant challenges for those attempting to model, understand and predict behaviours. Nonetheless, the well-defined framework and the vast volumes of data generated by the use of LOBs present an exciting and valuable opportunity for computational modeling…”

What happens inside a quantitative hedge fund?

quantitative hedge fund

Those of us working in academia have a tendency to try and do everything ourselves, often reinventing the wheel in the process. That said, due to the inherent secrecy of the quantitative hedge fund industry, academic researchers in automated trading often do need to do everything themselves. That is: modelling, algorithm design, optimisation, programming, backtesting and simulation.

But how does it work in a quantitative hedge fund? Well, from my (somewhat limited) experience with such funds I’ve found the following general structure to be fairly common.

At the beginning of the investment process you have the quant strategists/researchers. These guys and girls tend to have PhDs in maths or physics and have a very strong understanding of probability theory and statistics. Their job is to generate models and ideas that systematically capture investment opportunities and to develop algorithms based on these ideas. These algorithms are usually backtested by the quants and/or a research team.

Once an algorithm is deemed viable, it is passed on to the quant. developers. These developers are highly skilled (often PhD) programmers that focus on optimising the code for speed and robustness before passing it on to the programmers. The programmers’ job lies in interfacing the code with the trading platform and connectivity with the exchanges.

Next in line are the traders who work on unleashing the algorithms into the market at specific times dependent on prevailing market conditions and instruction from the portfolio manager(s). In quantitative funds this can be quite a low touch job with traders simply launching and babysitting algorithms. However, there are often hundreds of algorithms running at once with traders in charge of allocating capital between them (according to some framework laid out by the portfolio managers).

Naturally, this tends to be more of a circular process. Given that the traders are in direct contact with the markets, they will often come up with trading ideas and will talk with the quants who will explore, refine and generate algorithms from the ideas as the process starts again.

So you want to predict the stock market…

At least once a week, a Google Scholar Alert pops into my inbox featuring another bored academic that thought they’d try using their “expertise” to “predict the stock market”. Said paper tends to follow this structure:

  • Starts with a paragraph about how humans are obsolete and that the market is being run by computers.
  • Next comes the ‘novel’ machine learning technique, which is really just a well known algo. (SVM, neural net., logistic regression) with an adaptive learning rate.
  • Some (usually 5) simple technical analysis indicators are used on unprocessed daily price data to create some features for the new super-algo.
  • Said algorithm is applied to predict whether price(t+1) > price(t) or vice versa.
  • Results report prediction accuracy of ~65% (And we know they tried 100 different stocks before settling on the 3 that they reported results for).
  • There are usually no out of sample results at all!

Some will stop here, reporting that they’ve cracked it and that their technique beats all others. If we’re lucky, however, they’ll go on to assume that they can trade at the close price of each day, buying when they predict upward movement and selling before a fall with a constant size strategy, reporting annual returns of 40%.

Sigh…

Putting aside the lack of out of sample results and non-existent estimation of transaction costs, they’re really missing the point. Attempting to predict the price return over the next 24 hours is not the best way to go about using machine learning to create investment opportunities.

Predicting the direction of price movement is certainly an essential part of automated trading models. But it’s not the only part. A simple automated trading model is better based around event detection: bull, bear or neutral markets, insider trading or institutional trading. Detecting such events can easily be approached with either binary classification or hypothesis testing.

In binary classification, a feature matrix, X of size (N*K), is used to predict a binary vector, Y of size N. In this case we would interpret each row as representing one day with K features that we hope relate in some way to the occurrence of the market event that we are trying to predict, Y. Once in this format, binary classification is a simple matter of applying one of the many machine learning algorithms to map each row, Xi to Yi. If our out of sample predictions are consistent, we have a means of quickly identifying regimes and adjusting our trading strategy accordingly. However, given that this kind of binary classification is a standard machine learning problem, it suffers all the pitfalls of overfitting.

For hypothesis testing, we make an assumption about the distribution of an observable variable before applying a test for low probability events. As an example, let’s say we’re into high-frequency volatility trading and we’re looking for unusual activity in a stock. We can describe the number of orders arriving per second with a Poisson distribution, X, and the individual order volume * price with a Gamma distribution, Y. Now we can fit X and Y through maximum likelihood estimation and calculate the probability, p, of the market acting “as normal” each second. Taking p<0.05 as a rare event, if we see more than 5 such events in a 100 second period we might be tempted to make a move. It’s important to bear in mind, however, that hypothesis testing assumes structure in data and thus requires a stationary distribution for decent results.
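Here is a minimal Python sketch of the Poisson half of that example. It makes several assumptions of my own: it looks only at order arrival counts (the Gamma model for order value is left out), it treats an unusually high count as the rare event by using the upper-tail probability, and the function names are hypothetical.

```python
import math


def fit_poisson(counts):
    """Maximum likelihood estimate of the Poisson rate: the sample mean."""
    return sum(counts) / len(counts)


def poisson_tail(k, rate):
    """P(X >= k) for X ~ Poisson(rate)."""
    below = sum(math.exp(-rate) * rate**i / math.factorial(i) for i in range(k))
    return 1.0 - below


def rare_seconds(counts, rate, alpha=0.05):
    """Indices of the seconds whose order count has tail probability below alpha."""
    return [t for t, c in enumerate(counts) if poisson_tail(c, rate) < alpha]
```

You would fit the rate on a calm reference period, then count how many flagged seconds turn up in each 100 second window of live data and compare that with the threshold of 5.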

These kinds of techniques produce far more stable predictions than forecasting daily returns and provide us with information upon which it is much easier to trade.

Food For Thought

Like many people, I’m sure, when I get home after a long day’s work I often find it hard to switch off. It’s hardly surprising that after nine hours or so of programming, pondering and problem solving we find it hard to relax.

So what can we do? Well, should I look up at the calendar and find it to be a Friday, I’d say it’s time to grab the wetsuit and catch a train south-westerly for a bit of spearfishing, surfing or sun-seeking.  Alas, it rarely is. For the rest of the week I’ve turned to cooking to blow away the post-work stress.

For me, cooking is the perfect remedy for an over-active mind. It provides just enough stimulation to distract me from any thought of work without being overly taxing (most of the time). The concentration required to keep timings in mind; remember a couple of techniques; and manage not to set the kitchen on fire is just enough to clear my mind.

More than that, it’s great to be able to sit down to a tasty plate of freshly prepared food of an evening. Something increasingly easy to miss out on in this fast-paced, fast-food world. Here’s a sample of my culinary endeavours from this week…


Trip to Mupe Bay

This is an incredibly belated post but a friend recently posted some pictures of the trip and it brought back such great memories I felt I had to write about it!

So back in September, along with a few good friends, I took a trip down to Mupe Bay to escape work and get a final session of spearfishing in before the season ended. And by gosh it was beautiful.

In case you don’t know, Mupe Bay is about a half hour walk east from Lulworth Cove (which is about 8km SW of Poole).

It’s a beautiful sheltered cove, perfect for an overnight camp or mooring your boat.

We arrived just before sunset, set up camp and got the fire going. After a long night of jollity, laughs and tales of the sea, you can’t beat a sunrise swim in the bay.

That morning, a couple of us grabbed the spearguns and tried to catch some breakfast. What a great spot, underwater rock faces at about 5m not far along the cove.

Unfortunately, the wind picked up and the visibility was a little poor so no tasty bass for us but on a better day I’m sure they’d be about. All in all a fantastic trip, will definitely be back to deplete the fish stocks when the weather improves. Stunning.