Time series forecasting is a difficult type of predictive modelling problem. Unlike ordinary regression, a time series adds the complexity of sequence dependence among the input variables. Recurrent neural networks are a powerful type of neural network designed to process sequences. The long short-term memory (LSTM) network is a type of recurrent neural network used in deep learning.

Here we will develop an LSTM neural network for a standard time series prediction problem. The example will help you develop your own LSTM networks for time series forecasting tasks. The task we consider is forecasting the number of international airline passengers. The data set is freely available at https://datamarket.com/data/set/22u3/international-airline-passengers-monthly-totals-in-thousands-jan-49-dec-60#!ds=22u3&display=line under the file name “international-airline-passengers.csv”. The data runs from January 1949 to December 1960, that is, 12 years with 144 monthly observations.

We can load this dataset using the standard Pandas library. We are not interested in the date, because the observations are all separated by the same interval of one month, so we can exclude the first column when loading. The skipfooter=3 argument skips the last three lines of the file. After loading, we can easily plot the entire data set. The code for loading and plotting the dataset is shown below.

import pandas
import matplotlib.pyplot as plt
dataset = pandas.read_csv('international-airline-passengers.csv', usecols=[1], engine='python', skipfooter=3)
plt.plot(dataset)
plt.show()

You can see an upward trend in the dataset and some periodicity, which probably corresponds to the vacation period in the northern hemisphere.

We can formulate this as a regression problem: given the number of passengers (in thousands) for this month, predict the number of passengers next month. We can write a simple function to convert our single-column dataset into a two-column dataset, where the first column contains the number of passengers this month and the second column contains the number of passengers next month.
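As a minimal sketch of this framing, the snippet below pairs each value of a short series with the value that follows it. The helper name to_supervised and the sample numbers are illustrative assumptions, not part of the tutorial's code.

```python
def to_supervised(series):
    """Pair each observation (t) with the next one (t + 1)."""
    return list(zip(series[:-1], series[1:]))

# Illustrative monthly passenger counts, in thousands
series = [112, 118, 132, 129, 121]
pairs = to_supervised(series)
for this_month, next_month in pairs:
    print(this_month, "->", next_month)
```

Each pair is one training example: the first element is the input and the second is the value to predict.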

Before we begin, let’s first import all functions and classes that we intend to use.

import numpy
import matplotlib.pyplot as plt
from pandas import read_csv
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

Before we do anything else, we seed the random number generator to help make our results reproducible.

numpy.random.seed(7)

Use the code from the previous section to load the dataset as a Pandas dataframe. Then we extract the NumPy array from the dataframe and convert the integer values to floating point values, which are more suitable for working with neural networks.

df = read_csv('international-airline-passengers.csv', usecols=[1], engine='python', skipfooter=3)
ds = df.values
ds = ds.astype('float32')

LSTMs are sensitive to the scale of the input data, especially when sigmoid or tanh activation functions are used (by default, a Keras LSTM layer uses tanh for the cell activation and a sigmoid-type function for the gates). It is therefore good practice to rescale the data to the range from 0 to 1, also called normalization. We can easily normalize the dataset by using the MinMaxScaler preprocessing class from the scikit-learn library.

scaler = MinMaxScaler(feature_range=(0, 1))
ds = scaler.fit_transform(ds)
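To make the transformation concrete, here is a small pure-Python sketch of the arithmetic that min-max scaling to the range (0, 1) performs on a single column, together with its inverse. The function names and sample values are assumptions made for illustration; in the tutorial itself the work is done by MinMaxScaler.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled, lo, hi

def min_max_invert(scaled, lo, hi):
    """Undo min_max_scale, recovering the original units."""
    return [s * (hi - lo) + lo for s in scaled]

values = [100.0, 150.0, 200.0, 300.0]
scaled, lo, hi = min_max_scale(values)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
restored = min_max_invert(scaled, lo, hi)
print(restored)  # [100.0, 150.0, 200.0, 300.0]
```

The inverse step matters later, when predictions have to be converted back to thousands of passengers before the error is computed.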

After we fit our model and evaluate it on the training dataset, we need to know how accurate its forecasts are on data that the network has not seen. For an ordinary classification or regression task, we would do this using cross-validation. With time series, however, the order of the values is important, so we should not shuffle the observations. A simple method is to split the ordered dataset into training and test sets. The code below puts the first 67% of the observations into the training set, which we use to train our model, and leaves the remaining 33% for testing the model.

tr_size = int(len(ds) * 0.67)
tst_size = len(ds) - tr_size
tr, tst = ds[0:tr_size,:], ds[tr_size:len(ds),:]
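As a quick check on the sizes this produces, the sketch below applies the same split rule to a stand-in list of 144 items. The function name chronological_split is a hypothetical helper, and the list of indices only stands in for the real observations.

```python
def chronological_split(observations, train_fraction=0.67):
    """Split an ordered sequence without shuffling: earliest part for training."""
    train_size = int(len(observations) * train_fraction)
    return observations[:train_size], observations[train_size:]

# Stand-in for the 144 monthly observations (values here are just indices)
observations = list(range(144))
train, test = chronological_split(observations)
print(len(train), len(test))  # 96 48
```

Every training observation comes before every test observation, so the model is always evaluated on months later than the ones it was fitted on.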

Now we define a function to create a new dataset, as described above. The function takes two arguments: dataset, a NumPy array that we want to convert, and look_back, a parameter giving the number of previous time steps to use as input variables to predict the next time period (the default value is 1). With this default, the function creates a dataset where X is the number of passengers at a given time (t) and Y is the number of passengers at the next time (t + 1).

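def prepare_data(dataset, look_back=1):
    # build (input window, next value) pairs from a single-column array
    dX, dY = [], []
    for i in range(len(dataset)-look_back-1):
        # look_back consecutive values form the input
        a = dataset[i:(i+look_back), 0]
        dX.append(a)
        # the value right after the window is the target
        dY.append(dataset[i + look_back, 0])
    return numpy.array(dX), numpy.array(dY)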
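To see what the windowing produces for a larger look_back, here is a sketch that mirrors the same loop on a plain Python list, including the "- 1" in the range, which leaves the last possible pair unused. The name make_windows and the toy values are assumptions for illustration only.

```python
def make_windows(values, look_back=1):
    """List-based mirror of the windowing loop: inputs are look_back
    consecutive values, the target is the value right after them."""
    inputs, targets = [], []
    for i in range(len(values) - look_back - 1):
        inputs.append(values[i:i + look_back])
        targets.append(values[i + look_back])
    return inputs, targets

values = [10, 20, 30, 40, 50, 60]
X, Y = make_windows(values, look_back=3)
print(X)  # [[10, 20, 30], [20, 30, 40]]
print(Y)  # [40, 50]
```

With this loop bound, a split of n rows yields n - look_back - 1 samples, so 96 training rows and 48 test rows with look_back=1 give 94 and 46 samples respectively.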

Next, we use this function to prepare the training and test datasets, and then reshape the inputs into the structure expected by the LSTM layer: [samples, time steps, features].

# reshape into X=t and Y=t+1
look_back = 1
trX, trY = prepare_data(tr, look_back)
tstX, tstY = prepare_data(tst, look_back)
# reshape input to be [samples, time steps, features]
trX = numpy.reshape(trX, (trX.shape[0], 1, trX.shape[1]))
tstX = numpy.reshape(tstX, (tstX.shape[0], 1, tstX.shape[1]))
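The effect of this reshape is easiest to see on a tiny array. The sketch below uses made-up values and assumes only that NumPy is installed; it shows a two-dimensional array of samples gaining a time-step axis of length 1.

```python
import numpy

# Three samples, each with a single look-back value: shape (3, 1)
X = numpy.array([[0.1], [0.2], [0.3]])
# Insert a time-step axis of length 1: (samples, time steps, features)
X3d = numpy.reshape(X, (X.shape[0], 1, X.shape[1]))
print(X.shape, "->", X3d.shape)  # (3, 1) -> (3, 1, 1)
```

In this framing each sample is a single time step whose features are the look_back previous values, which is why the model below is given an input shape of (1, look_back).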

Next, we define and configure our LSTM network to solve this problem. The network has a hidden layer with 4 LSTM blocks (neurons) and an output layer that produces a single value. The LSTM layer uses its Keras defaults (tanh activation, with a sigmoid-type function for the gates), and the network is trained for 100 epochs with a batch size of 1.

modelPred = Sequential()
modelPred.add(LSTM(4, input_shape=(1, look_back)))
modelPred.add(Dense(1))
modelPred.compile(loss='mean_squared_error', optimizer='adam')
modelPred.fit(trX, trY, epochs=100, batch_size=1, verbose=2)
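As a rough check on how small this network is, its trainable parameters can be counted by hand. The sketch below assumes the standard LSTM layout of four weight sets, each with input weights, recurrent weights, and a bias; the helper names are hypothetical.

```python
def lstm_param_count(units, input_dim):
    """Four weight sets (input, forget, cell, output), each with
    input weights, recurrent weights, and one bias per unit."""
    return 4 * (units * input_dim + units * units + units)

def dense_param_count(inputs, outputs):
    """One weight per input-output pair plus one bias per output."""
    return inputs * outputs + outputs

lstm_params = lstm_param_count(units=4, input_dim=1)
dense_params = dense_param_count(inputs=4, outputs=1)
print(lstm_params, dense_params, lstm_params + dense_params)  # 96 5 101
```

If the assumption holds for your Keras version, these figures should match what modelPred.summary() reports.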

After training the model, we can assess its quality on the training and test datasets. Note that we invert the scaling of the predictions before calculating the errors, so that the results are reported in the same units as the original data (thousands of passengers per month).

# make predictions
trPredict = modelPred.predict(trX)
tstPredict = modelPred.predict(tstX)
# invert predictions
trPredict = scaler.inverse_transform(trPredict)
trY = scaler.inverse_transform([trY])
tstPredict = scaler.inverse_transform(tstPredict)
tstY = scaler.inverse_transform([tstY])
# calculate root mean squared error
trScore = math.sqrt(mean_squared_error(trY[0], trPredict[:,0]))
print('Train Score: %.2f RMSE' % (trScore))
tstScore = math.sqrt(mean_squared_error(tstY[0], tstPredict[:,0]))
print('Test Score: %.2f RMSE' % (tstScore))
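For reference, the root mean squared error reported here can be written out in a few lines of plain Python. This is a sketch with made-up numbers and a hypothetical helper name rmse, not a replacement for the scikit-learn call above.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared difference."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

actual = [100.0, 110.0, 120.0]
predicted = [110.0, 100.0, 130.0]
print(rmse(actual, predicted))  # 10.0
```

Because the score is in the units of the data, an RMSE of 10 here would mean the forecasts are typically off by about 10 thousand passengers.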

Finally, we can plot the predictions for the training and test data to get a visual impression of the quality of the model. Because of the way the dataset was prepared, we have to shift the predictions so that they align on the x axis with the original dataset.

# shift train predictions for plotting
trPredictPlot = numpy.empty_like(ds)
trPredictPlot[:, :] = numpy.nan
trPredictPlot[look_back:len(trPredict)+look_back, :] = trPredict
# shift test predictions for plotting
tstPredictPlot = numpy.empty_like(ds)
tstPredictPlot[:, :] = numpy.nan
tstPredictPlot[len(trPredict)+(look_back*2)+1:len(ds)-1, :] = tstPredict
# plot baseline and predictions
plt.plot(scaler.inverse_transform(ds))
plt.plot(trPredictPlot)
plt.plot(tstPredictPlot)
plt.show()
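The index arithmetic in the shifting code is easy to get wrong, so here is a small sketch that works it out for this dataset. The helper name plot_slots is hypothetical; it only reproduces the slice bounds used above and the sample counts produced by the windowing function.

```python
def plot_slots(n_total, n_train, look_back=1):
    """Return the (start, stop) index ranges used to place train and test
    predictions, following the slicing in the plotting code."""
    n_test = n_total - n_train
    # The windowing function yields len - look_back - 1 samples per split
    n_train_pred = n_train - look_back - 1
    n_test_pred = n_test - look_back - 1
    train_slot = (look_back, n_train_pred + look_back)
    test_slot = (n_train_pred + look_back * 2 + 1, n_total - 1)
    return train_slot, test_slot, n_train_pred, n_test_pred

train_slot, test_slot, n_tr, n_tst = plot_slots(n_total=144, n_train=96)
print(train_slot, test_slot)  # (1, 95) (97, 143)
```

The training slot holds 94 values and the test slot holds 46, which matches the number of predictions for each split, so the assignments into the NaN-filled arrays succeed.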

The initial dataset is displayed in blue, the predictions for the training set in green, and the predictions for the test set in red. We see that the model follows the series closely on both the training and the test datasets.

The program printed the following scores to the console (RMSE in thousands of passengers per month):

Train Score: 22.93 RMSE
Test Score: 47.53 RMSE

The complete source code for this tutorial: https://github.com/fgafarov/learn-neural-networks/blob/master/time_series_prediction_LSTM.py