This tutorial considers a sequence classification problem and solves it with a long short-term memory (LSTM) network in Keras. Sequence classification is a predictive modelling problem in which you are given a sequence of inputs and must predict a category for the whole sequence. The problem is difficult because sequences can vary in length, can draw on a large vocabulary of input symbols, and may require the model to learn long-term context or dependencies between symbols in the input sequence. In this section, we will develop an LSTM recurrent neural network model for a sequence classification problem.
Here we will use an LSTM network to classify IMDB film reviews. The IMDB dataset contains 25,000 highly polar film reviews (good or bad) for training and the same number for testing. Keras provides the imdb.load_data() function, which loads the dataset in a format that is ready for use in a neural network. In the loaded dataset, words have been replaced with integers that indicate the frequency rank of each word in the dataset, so each review is a sequence of integers.
We will map each word to a 32-dimensional real-valued vector and limit the vocabulary used in modelling to the 5,000 most frequently used words. Since the length of the sequence (the number of words) varies from review to review, we will fix each review at 500 words, truncating longer reviews and padding shorter ones with zeros.
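To make the integer encoding concrete, here is a minimal sketch of frequency-rank encoding on a toy corpus. The helpers `build_rank_index` and `encode` and the toy reviews are hypothetical and exist only for illustration; the real dataset arrives already encoded, and Keras additionally reserves a few low indices for special tokens, which this sketch ignores.

```python
from collections import Counter

def build_rank_index(texts):
    """Map each word to its frequency rank (1 = most frequent)."""
    counts = Counter(word for text in texts for word in text.split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: rank for rank, (word, _) in enumerate(ordered, start=1)}

def encode(text, index):
    """Replace every word in a text with its integer rank."""
    return [index[word] for word in text.split()]

# Toy data, not taken from the IMDB dataset.
toy_reviews = ["a good film", "a bad film", "a good good story"]
rank_index = build_rank_index(toy_reviews)
encoded = [encode(review, rank_index) for review in toy_reviews]
print(rank_index)
print(encoded)
```

Frequent words receive small integers, which is what makes it easy to keep only the most common words by applying a simple cut-off to the indices.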
As usual, we begin by importing the classes and functions required for this model and initializing the random number generator so that we can reproduce the results.
import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Embedding
from keras.preprocessing import sequence

# fix the random seed for reproducibility
numpy.random.seed(7)
Next, we need to load the IMDB dataset. We limit the vocabulary to the 5,000 most frequent words. The dataset comes already split into equal training (50%) and test (50%) sets.
# load the dataset but only keep the top n words; rarer words become the out-of-vocabulary index
top_w = 5000
(X_tr, y_tr), (X_tst, y_tst) = imdb.load_data(num_words=top_w)
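The effect of the vocabulary cap can be sketched on plain Python lists. This is only an illustration, assuming that rarer words are replaced by a single out-of-vocabulary index (the `oov_char` argument of `imdb.load_data`, which defaults to 2); `cap_vocabulary` is a hypothetical helper, not part of Keras.

```python
def cap_vocabulary(sequences, num_words, oov_index=2):
    """Replace every word index >= num_words with oov_index."""
    return [[w if w < num_words else oov_index for w in seq]
            for seq in sequences]

# Toy encoded reviews; 7000 and 5001 fall outside a 5,000-word vocabulary.
toy = [[1, 14, 22, 7000, 43], [1, 5001, 4999, 12]]
capped = cap_vocabulary(toy, num_words=5000)
print(capped)
```

After capping, every index is smaller than `num_words`, so the embedding table only needs that many rows.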
Then we need to truncate or zero-pad the input sequences so that they all have the same length, which the network requires for training.
# truncate and pad input sequences
max_review_length = 500
X_tr = sequence.pad_sequences(X_tr, maxlen=max_review_length)
X_tst = sequence.pad_sequences(X_tst, maxlen=max_review_length)
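What padding and truncation do to the data can be sketched in plain NumPy. The hypothetical helper `pad_or_truncate` below pads on the left and drops items from the front of long sequences, which is how we understand the default `padding='pre'` and `truncating='pre'` settings of `pad_sequences`; it is a sketch for intuition, not a replacement for the Keras function.

```python
import numpy as np

def pad_or_truncate(sequences, maxlen, value=0):
    """Left-pad short sequences with `value`; keep the last `maxlen` items of long ones."""
    out = np.full((len(sequences), maxlen), value, dtype="int32")
    for row, seq in enumerate(sequences):
        if len(seq) == 0:
            continue                          # nothing to copy, row stays all padding
        trimmed = seq[-maxlen:]               # drop items from the front if too long
        out[row, -len(trimmed):] = trimmed    # right-align; padding stays on the left
    return out

# Toy sequences: one shorter and one longer than maxlen.
toy = [[5, 8, 2], [9, 4, 7, 1, 3, 6]]
padded = pad_or_truncate(toy, maxlen=4)
print(padded)
```

Every row now has the same length, so the whole batch can be stored as one rectangular integer array.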
Next, we can create, compile and train our LSTM model. The first layer is the Embedding layer, which represents each word as a 32-component vector. The next layer is an LSTM layer with 100 memory units. Finally, since this is a classification problem, we use a fully connected (Dense) output layer with one neuron and the sigmoid activation function. It outputs a value between 0 and 1 that is interpreted as the probability of the positive class, and thresholding that value gives the two classes (good and bad).
Since this is a binary classification problem, we use the log loss function (binary_crossentropy) and the Adam optimization algorithm. The model is trained for 3 epochs with a batch size of 64 reviews.
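To show how data flows through these three layers, here is a minimal NumPy sketch of the forward pass. It uses the textbook LSTM equations with random, untrained weights; the variable names, the gate ordering and the weight initialisation are our own assumptions, not Keras internals. The sketch only illustrates the tensor shapes and the number of trainable parameters for the layer sizes described above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
vocab_size, emb_dim, units, seq_len, batch = 5000, 32, 100, 500, 2

# Random weights stand in for trained ones (assumption for illustration).
E = rng.normal(scale=0.05, size=(vocab_size, emb_dim))   # embedding table
W = rng.normal(scale=0.05, size=(emb_dim, 4 * units))    # input -> gates
U = rng.normal(scale=0.05, size=(units, 4 * units))      # hidden -> gates
b = np.zeros(4 * units)                                  # gate biases
w_out = rng.normal(scale=0.05, size=(units, 1))          # dense weights
b_out = np.zeros(1)                                      # dense bias

x = rng.integers(0, vocab_size, size=(batch, seq_len))   # fake padded reviews
embedded = E[x]                                          # (batch, seq_len, emb_dim)

h = np.zeros((batch, units))   # hidden state
c = np.zeros((batch, units))   # cell state
for t in range(seq_len):
    z = embedded[:, t, :] @ W + h @ U + b
    i, f, g, o = np.split(z, 4, axis=1)        # input, forget, candidate, output
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)

prob = sigmoid(h @ w_out + b_out)   # (batch, 1), each value strictly between 0 and 1

n_params = E.size + W.size + U.size + b.size + w_out.size + b_out.size
print(embedded.shape, prob.shape, n_params)
```

The embedding is a plain table lookup, the LSTM consumes one word vector per time step, and only the final hidden state reaches the output neuron. The parameter count breaks down as 5000 × 32 for the embedding, 4 × ((32 + 100) × 100 + 100) for the LSTM and 100 + 1 for the output layer.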
# create the model
emb_vector_length = 32
modelClass = Sequential()
modelClass.add(Embedding(top_w, emb_vector_length, input_length=max_review_length))
modelClass.add(LSTM(100))
modelClass.add(Dense(1, activation='sigmoid'))
modelClass.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# summary() prints the layer table itself and returns None
modelClass.summary()
# train the model
modelClass.fit(X_tr, y_tr, epochs=3, batch_size=64)
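The loss minimised during training can be written out in a few lines of NumPy. This is a sketch of the standard binary cross-entropy formula; the function below and its `eps` clipping constant are our own illustration and are not claimed to match the Keras implementation detail for detail.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))

# Toy labels and predicted probabilities (illustrative values only).
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

Confident correct predictions give a loss close to zero, while confident wrong predictions are penalised heavily, which is why the loss value printed after each epoch falls as the model improves.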
After training the network, we evaluate the model on the test dataset.
# Final evaluation of the model
# evaluate() returns [loss, accuracy] because the model was compiled with metrics=['accuracy']
scores = modelClass.evaluate(X_tst, y_tst, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
Running this example produces the following output.
Epoch 1/3
16750/16750 [===========] - 107s - loss: 0.5570 - acc: 0.7149
Epoch 2/3
16750/16750 [==========] - 107s - loss: 0.3530 - acc: 0.8577
Epoch 3/3
16750/16750 [==========] - 107s - loss: 0.2559 - acc: 0.9019
Accuracy: 86.79%
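The accuracy figure is the fraction of reviews whose predicted class matches the label. As a minimal sketch, assuming the sigmoid outputs are turned into classes with a 0.5 threshold, it can be computed by hand as follows; `accuracy_from_probabilities` and the toy values are hypothetical and are not taken from the run above.

```python
import numpy as np

def accuracy_from_probabilities(y_true, y_prob, threshold=0.5):
    """Threshold probabilities into 0/1 classes and compare them with the labels."""
    y_pred = (np.asarray(y_prob).ravel() > threshold).astype(int)
    return float(np.mean(y_pred == np.asarray(y_true).ravel()))

# Toy labels and sigmoid outputs shaped like a (n, 1) prediction array.
toy_labels = [1, 0, 1, 1]
toy_probs = [[0.91], [0.20], [0.35], [0.77]]
toy_accuracy = accuracy_from_probabilities(toy_labels, toy_probs)
print("Accuracy: %.2f%%" % (toy_accuracy * 100))
```

Here three of the four toy predictions are correct, so the sketch reports 75.00%.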