Time series forecasting is a complex type of predictive modelling problem. In contrast to regression predictive modelling, time series add the complexity of sequence order among the input variables. Recurrent neural networks are a powerful type of neural network designed to process sequences. A long short-term memory (LSTM) network is a type of recurrent neural network used in deep learning.
Here we will develop LSTM neural networks for a standard time series prediction problem. These examples will help you develop your own LSTM networks for time series forecasting tasks. The task we will consider is forecasting passenger traffic: predicting the number of passengers of international airlines. The data set is available for free from the address https://datamarket.com/data/set/22u3/international-airline-passengers-monthly-totals-in-thousands-jan-49-dec-60#!ds=22u3&display=line with the file name “international-airline-passengers.csv”. The data range from January 1949 to December 1960, i.e. 12 years, with 144 observations. We can load this dataset using the standard Pandas library. We are not interested in the date column because each observation is separated by the same interval of one month; therefore, when we load the dataset, we can exclude the first column. After loading, we can easily plot the entire dataset. The code for loading and plotting the dataset is shown below.
import pandas
import matplotlib.pyplot as plt

dataset = pandas.read_csv('international-airline-passengers.csv', usecols=[1], engine='python', skipfooter=3)
plt.plot(dataset)
plt.show()
You can see an upward trend in the dataset and some periodicity, which probably corresponds to the vacation period in the northern hemisphere.
We can formulate this problem as a regression problem: given the number of passengers (in thousands) for the current month, the task is to predict the number of passengers for the next month. We can write a simple function to convert our single-column dataset into a two-column dataset: the first column contains the number of passengers in the current month, and the second column contains the number of passengers in the next month.
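As a minimal sketch of this conversion (using the first five observations of the series as toy data), each input value is paired with the value that follows it:

```python
# Illustrative only: turn a one-column series into (input, target) pairs,
# where each input is this month's value and the target is next month's.
series = [112, 118, 132, 129, 121]  # first five observations, in thousands

pairs = [(series[i], series[i + 1]) for i in range(len(series) - 1)]
print(pairs)  # [(112, 118), (118, 132), (132, 129), (129, 121)]
```

The function we define later generalizes this idea to windows of more than one previous time step.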
Let’s begin by importing all of the functions and classes that we intend to use.
import numpy
import matplotlib.pyplot as plt
from pandas import read_csv
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
Before we do anything, we initialize a random number generator to ensure that our results are reproducible.
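A minimal sketch of fixing the seed is shown below; the seed value 7 is an arbitrary choice, and the two draws only demonstrate that re-seeding restarts the same sequence:

```python
import numpy

# Fix the random seed so repeated runs give reproducible results.
# The seed value 7 is arbitrary.
numpy.random.seed(7)
first_draw = numpy.random.rand(3)

numpy.random.seed(7)          # re-seeding restarts the same sequence
second_draw = numpy.random.rand(3)
print(numpy.array_equal(first_draw, second_draw))  # True
```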
Use the code from the previous section to load the dataset as a Pandas DataFrame. Then we can extract the NumPy array from the DataFrame and convert the integer values to floating-point values, which are more suitable for working with neural networks.
df = read_csv('international-airline-passengers.csv', usecols=[1], engine='python', skipfooter=3)
ds = df.values
ds = ds.astype('float32')
LSTMs are sensitive to the input data scale, especially when sigmoid (default) or tanh activation functions are used. Therefore, it is necessary to scale the data to a range from 0 to 1, also called normalization. We can easily normalize the dataset by using the MinMaxScaler preprocessing class from the scikit-learn library.
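The formula MinMaxScaler applies to a single column is x_scaled = (x − min) / (max − min). A hand-rolled sketch on a few toy values shows both the forward and the inverse transform:

```python
# Hand-rolled sketch of min-max normalization on toy values.
values = [104.0, 118.0, 132.0, 148.0]

lo, hi = min(values), max(values)
scaled = [(v - lo) / (hi - lo) for v in values]   # min maps to 0.0, max to 1.0
print(scaled)

# The inverse transform recovers the original units,
# which is what scaler.inverse_transform does later in the tutorial.
restored = [s * (hi - lo) + lo for s in scaled]
```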
scaler = MinMaxScaler(feature_range=(0, 1))
ds = scaler.fit_transform(ds)
After we fit our model and evaluate its quality on the training data set, we need to know how accurate its forecasts are on data the network has not seen. For an ordinary classification or regression task we would do this using cross-validation, but with time series the order of the values matters. A simple method we can use instead is to split the ordered data set into training and test sets. The code below assigns the first 67% of the observations to the training set, which we use to fit the model, leaving the last 33% for testing it.
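The split can be sketched on a toy list of ten ordered observations; note there is no shuffling, so the test set is the chronological tail of the series:

```python
# Ordered 67/33 split on stand-in data: no shuffling,
# the test set is the chronological tail of the series.
data = list(range(10))                # stand-in for 10 ordered observations
train_size = int(len(data) * 0.67)    # = 6
train_part, test_part = data[:train_size], data[train_size:]
print(train_part, test_part)  # [0, 1, 2, 3, 4, 5] [6, 7, 8, 9]
```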
tr_size = int(len(ds) * 0.67)
tst_size = len(ds) - tr_size
tr, tst = ds[0:tr_size,:], ds[tr_size:len(ds),:]
Now we define a function to create a new dataset, as described above. The function takes two arguments: dataset, the NumPy array we want to convert, and look_back, a parameter giving the number of previous time steps to use as input variables for predicting the next time period; its default value is 1. This default creates a data set where X is the number of passengers at a given time (t) and Y is the number of passengers at the next time step (t + 1).
def prepare_data(dataset, look_back=1):
    dX, dY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dX.append(a)
        dY.append(dataset[i + look_back, 0])
    return numpy.array(dX), numpy.array(dY)
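To see what the function produces, here it is applied to a toy five-row column (the function is repeated so the snippet runs on its own). Note that the loop bound leaves the final (input, target) pair of the series unused:

```python
import numpy

def prepare_data(dataset, look_back=1):
    # same function as above, repeated so this snippet runs on its own
    dX, dY = [], []
    for i in range(len(dataset) - look_back - 1):
        dX.append(dataset[i:(i + look_back), 0])
        dY.append(dataset[i + look_back, 0])
    return numpy.array(dX), numpy.array(dY)

toy = numpy.array([[112.], [118.], [132.], [129.], [121.]])
X, Y = prepare_data(toy, look_back=1)
print(X.ravel())  # [112. 118. 132.]
print(Y)          # [118. 132. 129.]
```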
Next, we use this function to prepare the training and test datasets and reshape the data into the structure expected by the neural network's input layer.
# reshape into X=t and Y=t+1
look_back = 1
trX, trY = prepare_data(tr, look_back)
tstX, tstY = prepare_data(tst, look_back)
# reshape input to be [samples, time steps, features]
trX = numpy.reshape(trX, (trX.shape[0], 1, trX.shape[1]))
tstX = numpy.reshape(tstX, (tstX.shape[0], 1, tstX.shape[1]))
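The reshape itself can be sketched on a toy matrix: for look_back = 1, a (samples, look_back) matrix of windows becomes a (samples, time steps, features) tensor of shape (samples, 1, 1):

```python
import numpy

# Sketch of the [samples, time steps, features] reshape for look_back = 1.
X = numpy.array([[112.], [118.], [132.]])          # shape (samples, look_back)
X3 = numpy.reshape(X, (X.shape[0], 1, X.shape[1]))  # one time step, one feature
print(X3.shape)  # (3, 1, 1)
```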
Next, we will develop and configure our LSTM network to solve this problem. The network contains a hidden layer with 4 LSTM units and an output layer that produces a single value. By default, Keras LSTM units use the tanh activation for the cell output and a sigmoid activation for the gates. The network is trained for 100 epochs.
modelPred = Sequential()
modelPred.add(LSTM(4, input_shape=(1, look_back)))
modelPred.add(Dense(1))
modelPred.compile(loss='mean_squared_error', optimizer='adam')
modelPred.fit(trX, trY, epochs=100, batch_size=1, verbose=2)
After training the model, we can assess its quality on both the training and the test data sets. Note that we invert the predictions before calculating errors, so that the results are reported in the same units as the original data (thousands of passengers per month).
# make predictions
trPredict = modelPred.predict(trX)
tstPredict = modelPred.predict(tstX)
# invert predictions
trPredict = scaler.inverse_transform(trPredict)
trY = scaler.inverse_transform([trY])
tstPredict = scaler.inverse_transform(tstPredict)
tstY = scaler.inverse_transform([tstY])
# calculate root mean squared error
trScore = math.sqrt(mean_squared_error(trY[0], trPredict[:,0]))
print('Train Score: %.2f RMSE' % (trScore))
tstScore = math.sqrt(mean_squared_error(tstY[0], tstPredict[:,0]))
print('Test Score: %.2f RMSE' % (tstScore))
Finally, we can generate predictions on both the training and test data to get a visual impression of the model's quality. Because of the way the dataset was prepared, we have to shift the predictions so that they align on the x-axis with the original dataset.
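The alignment trick can be sketched on dummy data: the plot array is filled with NaNs (which Matplotlib simply skips) and the predictions are written in at an offset of look_back, so a prediction made from the window ending at time t lands at position t + look_back:

```python
import numpy

# Sketch of the alignment trick with dummy predictions:
# NaN-pad a full-length array and write predictions in at offset look_back.
look_back = 1
series_len = 6
predictions = numpy.array([[1.0], [2.0], [3.0]])   # dummy prediction values

plot = numpy.empty((series_len, 1))
plot[:, :] = numpy.nan
plot[look_back:len(predictions) + look_back, :] = predictions
print(plot.ravel())  # NaN at index 0, predictions at indices 1..3, NaN after
```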
# shift train predictions for plotting
trPredictPlot = numpy.empty_like(ds)
trPredictPlot[:, :] = numpy.nan
trPredictPlot[look_back:len(trPredict)+look_back, :] = trPredict
# shift test predictions for plotting
tstPredictPlot = numpy.empty_like(ds)
tstPredictPlot[:, :] = numpy.nan
tstPredictPlot[len(trPredict)+(look_back*2)+1:len(ds)-1, :] = tstPredict
# plot baseline and predictions
plt.plot(scaler.inverse_transform(ds))
plt.plot(trPredictPlot)
plt.plot(tstPredictPlot)
plt.show()
The initial dataset is displayed in blue, the predictions on the training set in green, and the predictions on the test set in red. We can see that the model has coped well with the forecast on both the training and the test data sets.
In the console, the program printed the following information about the training results:
Train Score: 22.93 RMSE
Test Score: 47.53 RMSE