A simple convolutional neural network

In a previous post we learned how to load the MNIST dataset and build a simple multilayer perceptron model; now it is time to develop a more complex convolutional neural network. In this tutorial we will create a simple convolutional neural network for MNIST that demonstrates how to use all aspects of the current CNN implementation.

The first step is to import the necessary classes and functions.

import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
K.set_image_dim_ordering('th')
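
The call K.set_image_dim_ordering('th') tells Keras to expect channels-first image arrays, i.e. (channels, width, height), which matches the reshape we do below. This function belongs to older Keras releases; as a rough, assumed equivalent on newer Keras 2 versions one would write:

# channels-first image format in newer Keras 2 releases (assumed equivalent, not part of the original script)
K.set_image_data_format('channels_first')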

Next, we initialize the random number generator to a constant initial value for reproducible results.

seed = 7
numpy.random.seed(seed)

Then we need to load the MNIST dataset and modify it to be suitable for CNN training.

# load data
(X_tr, y_tr), (X_tst, y_tst) = mnist.load_data()
# reshape to be [samples][channels][width][height]
X_tr = X_tr.reshape(X_tr.shape[0], 1, 28, 28).astype('float32')
X_tst = X_tst.reshape(X_tst.shape[0], 1, 28, 28).astype('float32')
# normalize inputs from 0-255 to 0-1
X_tr = X_tr / 255
X_tst = X_tst / 255
# one hot encode outputs
y_tr = np_utils.to_categorical(y_tr)
y_tst = np_utils.to_categorical(y_tst)
num_classes = y_tst.shape[1]
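
To confirm that the data is now in the shape the network expects, the array shapes can be printed. This is a quick sanity check rather than part of the original script; with channels-first ordering the output should look as follows:

# expected shapes after reshaping and one-hot encoding
print(X_tr.shape)    # (60000, 1, 28, 28)
print(X_tst.shape)   # (10000, 1, 28, 28)
print(y_tst.shape)   # (10000, 10)
print(num_classes)   # 10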

Then we define the neural network model. Convolutional neural networks are more complex than standard multilayer perceptrons, so we will start by using a simple structure.

Let’s use the following network architecture:

  1. The first hidden layer is a convolutional layer, Conv2D. It has 32 feature maps with a 5 × 5 kernel and the relu activation function.
  2. Then we define a MaxPooling2D pooling layer with a pool size of 2 × 2, which takes the maximum value over each window.
  3. The next layer is a regularization layer, Dropout. It is configured to randomly exclude 20% of the neurons in the layer in order to reduce overfitting.
  4. Next is a Flatten layer, which converts the two-dimensional feature maps into a one-dimensional vector so that the output can be processed by standard fully connected layers.
  5. Then comes a fully connected layer with 128 neurons and the relu activation function.
  6. Finally, the output layer has 10 neurons for the 10 classes and a softmax activation function that outputs a probability estimate for each class.

The function below defines this model.
def create_model():
	# create model
	m = Sequential()
	m.add(Conv2D(32, (5, 5), input_shape=(1, 28, 28), activation='relu'))
	m.add(MaxPooling2D(pool_size=(2, 2)))
	m.add(Dropout(0.2))
	m.add(Flatten())
	m.add(Dense(128, activation='relu'))
	m.add(Dense(num_classes, activation='softmax'))
	# Compile model
	m.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
	return m
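
Before training, it can be useful to inspect the layer stack and the number of trainable parameters. This optional check, not part of the original script, uses the standard summary() method of a Keras model:

# print the layer-by-layer structure and parameter counts
create_model().summary()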

As in the multilayer perceptron example, the model is trained for 10 epochs, and each weight update uses a batch of 200 images.

# build the model
modelConv = create_model()
# Fit the model
modelConv.fit(X_tr, y_tr, validation_data=(X_tst, y_tst), epochs=10, batch_size=200, verbose=2)
# Final evaluation of the model
scores = modelConv.evaluate(X_tst, y_tst, verbose=0)
print("CNN Error: %.2f%%" % (100-scores[1]*100))

The classification accuracy of the model is printed after each training epoch, and the final classification error is printed at the end.

Train on 60000 samples, validate on 10000 samples.

Epoch 1/10 - 276s - loss: 0.2224 - acc: 0.9366 - val_loss: 0.0783 - val_acc: 0.9754
Epoch 2/10 - 279s - loss: 0.0710 - acc: 0.9789 - val_loss: 0.0454 - val_acc: 0.9846
Epoch 3/10 - 449s - loss: 0.0510 - acc: 0.9841 - val_loss: 0.0444 - val_acc: 0.9854
Epoch 4/10 - 267s - loss: 0.0389 - acc: 0.9881 - val_loss: 0.0403 - val_acc: 0.9876
Epoch 5/10 - 269s - loss: 0.0325 - acc: 0.9898 - val_loss: 0.0349 - val_acc: 0.9883
Epoch 6/10 - 313s - loss: 0.0267 - acc: 0.9919 - val_loss: 0.0321 - val_acc: 0.9896
Epoch 7/10 - 255s - loss: 0.0220 - acc: 0.9930 - val_loss: 0.0339 - val_acc: 0.9888
Epoch 8/10 - 271s - loss: 0.0192 - acc: 0.9939 - val_loss: 0.0329 - val_acc: 0.9896
Epoch 9/10 - 266s - loss: 0.0157 - acc: 0.9951 - val_loss: 0.0323 - val_acc: 0.9891
Epoch 10/10 - 279s - loss: 0.0145 - acc: 0.9956 - val_loss: 0.0333 - val_acc: 0.9889
CNN Error: 1.11%

Training a convolutional neural network takes more time than training a simple perceptron. However, the error drops to 1.11%, which is significantly lower than the perceptron's error.
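
Once trained, the model can be used directly for classification. The following sketch is not part of the original script; it predicts the classes of the first five test images with the standard predict() method and compares them with the true labels:

# predict class probabilities for the first five test images
probs = modelConv.predict(X_tst[:5])
# the predicted class is the index with the highest probability
predicted = numpy.argmax(probs, axis=1)
# y_tst is one-hot encoded, so recover the true labels the same way
true_labels = numpy.argmax(y_tst[:5], axis=1)
print(predicted)
print(true_labels)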

The complete source code for this example: https://github.com/fgafarov/learn-neural-networks/blob/master/MNIST_convolutional_simple.py

A larger convolutional neural network

Now that we have seen how to create a simple convolutional neural network, let's create a model that comes closer to state-of-the-art results. At the beginning of the program we import the classes and functions, then load and prepare the data in the same way as in the previous CNN example.

import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
K.set_image_dim_ordering('th')
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load data
(X_tr, y_tr), (X_tst, y_tst) = mnist.load_data()
# reshape to be [samples][channels][width][height]
X_tr = X_tr.reshape(X_tr.shape[0], 1, 28, 28).astype('float32')
X_tst = X_tst.reshape(X_tst.shape[0], 1, 28, 28).astype('float32')
# normalize inputs from 0-255 to 0-1
X_tr = X_tr / 255
X_tst = X_tst / 255
# one hot encode outputs
y_tr = np_utils.to_categorical(y_tr)
y_tst = np_utils.to_categorical(y_tst)
num_classes = y_tst.shape[1]

Now we will create a larger convolutional neural network architecture with additional convolutional layers, pooling layers, and fully connected layers. The network topology can be summarized as follows:

  1. A Conv2D convolutional layer with 30 feature maps of size 5 × 5.
  2. A MaxPooling2D max pooling layer with a pool size of 2 × 2.
  3. A Conv2D convolutional layer with 15 feature maps of size 3 × 3.
  4. A MaxPooling2D max pooling layer with a pool size of 2 × 2.
  5. A Dropout layer with a rate of 20%.
  6. A Flatten layer.
  7. A Dense layer with 128 neurons and the relu activation function.
  8. A Dense layer with 50 neurons and the relu activation function.
  9. A Dense output layer with 10 neurons and the softmax activation function.

# define the larger model
def create_model():
	# create model
	m = Sequential()
	m.add(Conv2D(30, (5, 5), input_shape=(1, 28, 28), activation='relu'))
	m.add(MaxPooling2D(pool_size=(2, 2)))
	m.add(Conv2D(15, (3, 3), activation='relu'))
	m.add(MaxPooling2D(pool_size=(2, 2)))
	m.add(Dropout(0.2))
	m.add(Flatten())
	m.add(Dense(128, activation='relu'))
	m.add(Dense(50, activation='relu'))
	m.add(Dense(num_classes, activation='softmax'))
	# Compile model
	m.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
	return m
# build the model
modelConvBig = create_model()
# Fit the model
modelConvBig.fit(X_tr, y_tr, validation_data=(X_tst, y_tst), epochs=10, batch_size=200, verbose=2)
# Final evaluation of the model
scores = modelConvBig.evaluate(X_tst, y_tst, verbose=0)
print("Large CNN Error: %.2f%%" % (100-scores[1]*100))

This model reaches an error rate of 0.89%.
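
A model that takes this long to train is worth saving for later reuse. As a small sketch, not part of the original script and assuming the h5py package is installed, the standard Keras save() and load_model() calls look like this:

# save the trained model (architecture + weights) to an HDF5 file (hypothetical filename)
modelConvBig.save('mnist_cnn_large.h5')
# later, restore it and evaluate without retraining
from keras.models import load_model
restored = load_model('mnist_cnn_large.h5')
scores = restored.evaluate(X_tst, y_tst, verbose=0)
print("Large CNN Error: %.2f%%" % (100 - scores[1] * 100))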

The complete source code for this example: https://github.com/fgafarov/learn-neural-networks/blob/master/MNIST_convolutional_large.py