In this tutorial we discuss the mathematical basis of training methods for single-layer neural networks.

Gradient descent method

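Gradient descent is a method of finding a local minimum of a function by moving in the direction opposite to its gradient (moving along the gradient instead leads to a local maximum). In neural network training, the function being minimized is the error function.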

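According to the gradient descent method, the weights and thresholds of the neurons are calculated by the formulas:

$$w_{ij}(t+1) = w_{ij}(t) - \alpha \frac{\partial E}{\partial w_{ij}(t)} \tag{1}$$

$$T_j(t+1) = T_j(t) - \alpha \frac{\partial E}{\partial T_j(t)} \tag{2}$$

Here, $E$ is the error function, $\alpha$ is the learning rate of the training algorithm, $w_{ij}$ is the weight from input $i$ to neuron $j$, $T_j$ is the threshold of neuron $j$, and $t$ is the iteration number.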
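The update rule of gradient descent can be sketched in a few lines of Python. This is only a minimal illustration: the quadratic error function, the starting point, and the names `gradient_descent` and `error_gradient` are assumptions made for the example and do not come from the tutorial.

```python
def gradient_descent(grad, w0, alpha=0.1, steps=100):
    """Repeat the update w <- w - alpha * grad(w) for a fixed number of steps."""
    w = w0
    for _ in range(steps):
        w = w - alpha * grad(w)
    return w


# Hypothetical error function E(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
def error_gradient(w):
    return 2.0 * (w - 3.0)


w_min = gradient_descent(error_gradient, w0=0.0)
print(w_min)  # a value very close to the minimum at w = 3
```

Because each step moves against the gradient, the parameter approaches the minimum of the error function as long as the learning rate is small enough.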

Delta rule

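The delta rule, also called the Widrow-Hoff learning rule, was introduced by Bernard Widrow and Marcian Hoff to minimize the error over all training patterns. It minimizes the root-mean-square error of the neural network, taken here as half the sum of squared errors: $E = \frac{1}{2}\sum_{k=1}^{L}(y^k - d^k)^2$, where $y^k$ is the output of the network for the $k$-th training pattern, $d^k$ is the target value, and $L$ is the number of training patterns.

Each neuron calculates a weighted sum of its inputs according to the formula: $y = \sum_{i=1}^{n} w_i x_i - T$. If the linear activation function is used, the error function is equal to:

$$E = \frac{1}{2}\sum_{k=1}^{L}\left(\sum_{i=1}^{n} w_i x_i^k - T - d^k\right)^2 \tag{3}$$

For a single pattern $k$, with $E(k) = \frac{1}{2}(y^k - d^k)^2$, the derivatives of the error function with respect to the weights and the threshold are expressed as:

$$\frac{\partial E(k)}{\partial y^k} = y^k - d^k \tag{4}$$

$$\frac{\partial E(k)}{\partial w_i} = \frac{\partial E(k)}{\partial y^k}\frac{\partial y^k}{\partial w_i} = (y^k - d^k)\,x_i^k \tag{5}$$

$$\frac{\partial E(k)}{\partial T} = \frac{\partial E(k)}{\partial y^k}\frac{\partial y^k}{\partial T} = -(y^k - d^k) \tag{6}$$

During the training process, the weights and the threshold of the neuron are calculated by the formulas:

$$w_i(t+1) = w_i(t) - \alpha\,(y^k - d^k)\,x_i^k \tag{7}$$

$$T(t+1) = T(t) + \alpha\,(y^k - d^k) \tag{8}$$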
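A single delta-rule update for one linear neuron can be sketched as follows. The sketch assumes that the threshold is subtracted from the weighted sum (y = Σ w_i x_i − T), so the threshold moves in the opposite direction to the weights. The function names and the sample numbers are hypothetical.

```python
def neuron_output(weights, threshold, x):
    """Linear neuron: weighted sum of the inputs minus the threshold (assumed convention)."""
    return sum(w * xi for w, xi in zip(weights, x)) - threshold


def delta_rule_step(weights, threshold, x, d, alpha):
    """One delta-rule update for a single training pattern (x, d)."""
    delta = neuron_output(weights, threshold, x) - d       # y - d
    new_weights = [w - alpha * delta * xi for w, xi in zip(weights, x)]
    new_threshold = threshold + alpha * delta              # plus sign because dy/dT = -1
    return new_weights, new_threshold


# Hypothetical pattern: two inputs, target value 1.0, parameters start at zero.
weights, threshold = [0.0, 0.0], 0.0
x, d = [1.0, 2.0], 1.0

error_before = 0.5 * (neuron_output(weights, threshold, x) - d) ** 2
weights, threshold = delta_rule_step(weights, threshold, x, d, alpha=0.1)
error_after = 0.5 * (neuron_output(weights, threshold, x) - d) ** 2
print(error_before, error_after)  # the error on this pattern decreases after the step
```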

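Consider a neural network consisting of one layer with three neurons.

Here $x = (x_1, \ldots, x_n)$ is the input vector and $d = (d_1, d_2, d_3)$ is the target vector.

In this case, the error function is equal to $E = \frac{1}{2}\sum_{j=1}^{3}(y_j - d_j)^2$, where $y_j = \sum_{i=1}^{n} w_{ij} x_i - T_j$ is the output of neuron $j$, and the weights and thresholds of the neurons are calculated by the formulas:

$$w_{ij}(t+1) = w_{ij}(t) - \alpha\,(y_j - d_j)\,x_i \tag{9}$$

$$T_j(t+1) = T_j(t) + \alpha\,(y_j - d_j) \tag{10}$$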
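For a layer of three neurons, the same update is applied to each neuron independently. The sketch below is an illustration under assumptions: the threshold is subtracted from the weighted sum, the layer has two inputs, and weights are stored as `weights[j][i]` (neuron first, input second). None of these details come from the tutorial.

```python
def layer_output(weights, thresholds, x):
    """Outputs of a linear layer; weights[j][i] connects input i to neuron j."""
    return [sum(w * xi for w, xi in zip(row, x)) - t
            for row, t in zip(weights, thresholds)]


def layer_error(y, d):
    """Half the sum of squared differences between outputs and targets."""
    return 0.5 * sum((yj - dj) ** 2 for yj, dj in zip(y, d))


def layer_delta_step(weights, thresholds, x, d, alpha):
    """Apply the delta rule to every neuron of the layer for one pattern."""
    y = layer_output(weights, thresholds, x)
    deltas = [yj - dj for yj, dj in zip(y, d)]
    new_weights = [[w - alpha * delta * xi for w, xi in zip(row, x)]
                   for row, delta in zip(weights, deltas)]
    new_thresholds = [t + alpha * delta for t, delta in zip(thresholds, deltas)]
    return new_weights, new_thresholds


# Hypothetical example: two inputs, three neurons, all parameters start at zero.
weights = [[0.0, 0.0] for _ in range(3)]
thresholds = [0.0, 0.0, 0.0]
x, d = [1.0, -1.0], [1.0, 0.0, -1.0]

before = layer_error(layer_output(weights, thresholds, x), d)
weights, thresholds = layer_delta_step(weights, thresholds, x, d, alpha=0.1)
after = layer_error(layer_output(weights, thresholds, x), d)
print(before, after)  # the layer error decreases after one update
```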

Single-layer neural network training algorithm

A single-layer neural network is trained with the Widrow-Hoff learning rule in the following steps:

1. Specify the learning step α (0 < α < 1) and the desired root-mean-square error of the network $E_{min}$.
2. Initialize the weights and thresholds of the neurons with random numbers.
3. Feed the vectors of the training sample to the input of the neural network and calculate the output values of the neurons.
4. Update the weights and thresholds of the neurons according to formulas (9) and (10).
5. Calculate the total error of the neural network, $E = \frac{1}{2}\sum_{k=1}^{L}\sum_{j}(y_j^k - d_j^k)^2$, where $L$ is the number of training patterns.
6. If $E > E_{min}$, go to step 3; otherwise stop the algorithm.
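The steps above can be sketched as a short training loop. This is a minimal illustration, not a definitive implementation. It assumes a linear activation, a threshold subtracted from the weighted sum, and updates applied after every pattern. The function name `train_layer`, the initialization range, the `max_epochs` safeguard, and the training data are all hypothetical.

```python
import random


def train_layer(samples, n_inputs, n_neurons, alpha=0.1, e_min=1e-4,
                max_epochs=10_000, seed=0):
    """Train a linear single-layer network with the Widrow-Hoff rule.

    samples is a list of (x, d) pairs; returns (weights, thresholds, error).
    """
    rng = random.Random(seed)
    # Step 2: random initial weights and thresholds (range is an assumption).
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    thresholds = [rng.uniform(-0.5, 0.5) for _ in range(n_neurons)]

    def forward(x):
        return [sum(w * xi for w, xi in zip(row, x)) - t
                for row, t in zip(weights, thresholds)]

    total_error = float("inf")
    for _ in range(max_epochs):  # safeguard against endless loops (assumption)
        # Steps 3-4: present each pattern and update the parameters.
        for x, d in samples:
            y = forward(x)
            for j in range(n_neurons):
                delta = y[j] - d[j]
                for i in range(n_inputs):
                    weights[j][i] -= alpha * delta * x[i]
                thresholds[j] += alpha * delta
        # Step 5: total error over the whole training sample.
        total_error = 0.5 * sum((yj - dj) ** 2
                                for x, d in samples
                                for yj, dj in zip(forward(x), d))
        # Step 6: stop once the error is no larger than the desired value.
        if total_error <= e_min:
            break
    return weights, thresholds, total_error


# Hypothetical training sample: the linear target d = x1 + x2, one neuron.
samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [2.0])]
weights, thresholds, error = train_layer(samples, n_inputs=2, n_neurons=1)
print(weights, thresholds, error)
```

Since the target in this example is itself linear, the learned weights should end up close to 1 and the threshold close to 0. For targets that a linear layer cannot represent exactly, the error may never fall below `e_min`, which is why the sketch caps the number of epochs.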