By Derrick Mwiti, Data Analyst
Editor’s note: This tutorial illustrates how to get started forecasting time series with LSTM models. Stock market data is a great choice for this because it’s quite regular and widely available to everyone. Please don’t take this as financial advice or use it to make any trades of your own.
In this tutorial, we’ll build a Python deep learning model that will predict the future behavior of stock prices. We assume that the reader is familiar with the concepts of deep learning in Python, especially Long Short-Term Memory.
While predicting the actual price of a stock is an uphill climb, we can build a model that will predict whether the price will go up or down. The data and notebook used for this tutorial can be found here. It’s important to note that there are always other factors that affect the prices of stocks, such as the political atmosphere and the market. However, we won’t focus on those factors for this tutorial.
LSTMs are very powerful in sequence prediction problems because they’re able to store past information. This is important in our case because the previous price of a stock is crucial in predicting its future price.
We’ll kick of by importing NumPy for scientific computation, Matplotlib for plotting graphs, and Pandas to aide in loading and manipulating our datasets.
Loading the Dataset
The next step is to load in our training dataset and select the
Highcolumns that we’ll use in our modeling.
We check the head of our dataset to give us a glimpse into the kind of dataset we’re working with.
Open column is the starting price while the
Close column is the final price of a stock on a particular trading day. The
Low columns represent the highest and lowest prices for a certain day.
From previous experience with deep learning models, we know that we have to scale our data for optimal performance. In our case, we’ll use Scikit- Learn’s
MinMaxScaler and scale our dataset to numbers between zero and one.
Creating Data with Timesteps
LSTMs expect our data to be in a specific format, usually a 3D array. We start by creating data in 60 timesteps and converting it into an array using NumPy. Next, we convert the data into a 3D dimension array with
X_train samples, 60 timestamps, and one feature at each step.
Building the LSTM
In order to build the LSTM, we need to import a couple of modules from Keras:
Sequentialfor initializing the neural network
Densefor adding a densely connected neural network layer
LSTMfor adding the Long Short-Term Memory layer
Dropoutfor adding dropout layers that prevent overfitting
We add the LSTM layer and later add a few
Dropout layers to prevent overfitting. We add the LSTM layer with the following arguments:
- 50 units which is the dimensionality of the output space
return_sequences=Truewhich determines whether to return the last output in the output sequence, or the full sequence
input_shapeas the shape of our training set.
When defining the
Dropout layers, we specify 0.2, meaning that 20% of the layers will be dropped. Thereafter, we add the
Dense layer that specifies the output of 1 unit. After this, we compile our model using the popular adam optimizer and set the loss as the
mean_squarred_error. This will compute the mean of the squared errors. Next, we fit the model to run on 100 epochs with a batch size of 32. Keep in mind that, depending on the specs of your computer, this might take a few minutes to finish running.
Predicting Future Stock using the Test Set
First we need to import the test set that we’ll use to make our predictions on.
In order to predict future stock prices we need to do a couple of things after loading in the test set:
- Merge the training set and the test set on the 0 axis.
- Set the time step as 60 (as seen previously)
MinMaxScalerto transform the new dataset
- Reshape the dataset as done previously
After making the predictions we use
inverse_transform to get back the stock prices in normal readable format.
Plotting the Results
Finally, we use Matplotlib to visualize the result of the predicted stock price and the real stock price.
From the plot we can see that the real stock price went up while our model also predicted that the price of the stock will go up. This clearly shows how powerful LSTMs are for analyzing time series and sequential data.
There are a couple of other techniques of predicting stock prices such as moving averages, linear regression, K-Nearest Neighbours, ARIMA and Prophet. These are techniques that one can test on their own and compare their performance with the Keras LSTM. If you wish to learn more about Keras and deep learning you can find my articles on that here and here.
Discuss this post on Reddit and Hacker News.
Bio: Derrick Mwiti is a data analyst, a writer, and a mentor. He is driven by delivering great results in every task, and is a mentor at Lapid Leaders Africa.
Original. Reposted with permission.
- Introduction to Deep Learning with Keras
- Introduction to PyTorch for Deep Learning
- The Keras 4 Step Workflow