Introduction
Neural networks are a buzzword in the industry nowadays, and they sit at the heart of numerous AI applications: face recognition, object detection, language translation, speech recognition, and countless other areas.
The inspiration for the neural network (or, more precisely, the artificial neural network) came from animal brains, especially the human brain. The human brain is a very complex system, and we still know very little about it, so it is not possible for us to mimic the human brain completely, nor is it required. The building block of a neural network is the neuron, so scientists started building artificial neurons. All these artificial neurons together form an Artificial Neural Network (ANN). Over time, ANNs have become quite different from their biological counterparts.
I know all this sounds complicated, but trust me, it is not so complex. So keep reading.
Biological Neuron
Our brain is composed of neurons. A neuron takes input through its dendrites, which connect to the synapses of previous neurons. The nucleus processes the signals accumulated from the dendrites, and if the combined signal crosses a threshold, the neuron fires. The output of the neuron flows through the axon and becomes available at the synapses, from which the dendrites of the next neurons pick it up. The signals are electrical impulses, and they are what let us learn and build understanding.
Fig 1: Biological Neuron
Artificial Neuron
McCulloch and Pitts were the first to create a mathematical model of an artificial neuron. This model solves Boolean decision problems. In this model, a neuron takes multiple binary inputs, accumulates them, and then produces an output.
If the sum of the inputs is greater than some threshold (a), the neuron outputs 1; otherwise it outputs 0.
Fig 2: McCulloch-Pitts Neuron
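To make the idea concrete, here is a minimal sketch of a McCulloch-Pitts unit in Python (the AND-gate example and its threshold are illustrative choices, not from the original figure):

    def mp_neuron(inputs, threshold):
        # Accumulate the binary inputs and fire (output 1)
        # only if the sum is greater than the threshold
        return 1 if sum(inputs) > threshold else 0

    # An AND gate: with threshold 1, the unit fires only when both inputs are 1
    print(mp_neuron([1, 1], 1))  # 1
    print(mp_neuron([1, 0], 1))  # 0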
Below are some advancements made to this model:
1. Each input is multiplied by a weight, which gives the model more flexibility.
2. The threshold itself can be treated as just another weight (w0).
This improved model is called the Perceptron.
Single Layer Perceptron
Fig 3: Perceptron Output
Fig 4: Perceptron
Steps inside the perceptron:
1. Each input is multiplied by its weight.
2. All the weighted inputs are added together, along with a bias b.
3. The sum is passed through a function f, called the activation function.
If the input to the activation function crosses a threshold, the output is 1; otherwise it is 0. Since this perceptron has only a single layer, it is called a Single Layer Perceptron. The weights decide the strength of each input, while the bias lets you shift the activation function curve up or down. A perceptron splits the data into two parts, so it is a linear binary classifier; the three steps are sketched in code below. People even came up with a training algorithm for the perceptron.
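Here is a minimal sketch of those three steps in Python (the input, weight, and bias values are illustrative, not taken from the figures):

    import numpy as np

    def perceptron(x, w, b):
        # Steps 1 & 2: weighted sum of the inputs plus the bias
        z = np.dot(w, x) + b
        # Step 3: step activation function -- output 1 if the sum crosses 0
        return 1 if z > 0 else 0

    x = np.array([1.0, 0.5])   # inputs
    w = np.array([0.4, -0.2])  # weights decide the strength of each input
    b = 0.1                    # bias shifts the activation curve up or down
    print(perceptron(x, w, b))  # 1, since 0.4 - 0.1 + 0.1 = 0.4 > 0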
Perceptron Training Algorithm
· Initialize the weights randomly
· Pass the training data through the perceptron and calculate the difference between the actual value and the computed value; this error acts as the gradient.
Fig 8: Error Calculation
· Calculate the new weights by simply incrementing the original weights with the computed gradient multiplied by the learning rate.
Fig 9: Weight Update
· Training is stopped when the error made by the model falls to a low level or no longer improves, or when a maximum number of epochs has been performed (an epoch is one pass of this update process over every row of the training dataset). The whole procedure is sketched in code below.
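A from-scratch sketch of this training loop (the AND-gate data, learning rate, epoch count, and random seed are all illustrative assumptions):

    import numpy as np

    def train_perceptron(X, y, lr=0.1, epochs=20):
        rng = np.random.default_rng(0)
        w = rng.normal(size=X.shape[1])  # initialize the weights randomly
        b = 0.0
        for _ in range(epochs):          # one epoch = one pass over the dataset
            for xi, yi in zip(X, y):
                y_hat = 1 if np.dot(w, xi) + b > 0 else 0
                error = yi - y_hat       # actual value minus computed value
                w += lr * error * xi     # increment weights by gradient * learning rate
                b += lr * error
        return w, b

    # An AND gate is linearly separable, so the perceptron can learn it
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 0, 0, 1])
    w, b = train_perceptron(X, y)
    print([1 if np.dot(w, xi) + b > 0 else 0 for xi in X])  # [0, 0, 0, 1]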
The Perceptron algorithm is available in scikit-learn, the Python machine learning library. The class lets you configure the learning rate (eta0), which defaults to 1.0, and the total number of training epochs (max_iter), which defaults to 1,000.
Import the library:
Fig 10: Library
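In code, the import is a single line:

    from sklearn.linear_model import Perceptron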
Define the model with all default values:
Fig 11: Model
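And in code, the model definition with everything left at its default:

    model = Perceptron()  # same as Perceptron(eta0=1.0, max_iter=1000)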
XOR Problem
Fig 12: XOR Truth Table
A single layer perceptron is not able to solve even this simple XOR problem, because the problem is not linearly separable: there is no straight-line decision boundary that separates Class 0 from Class 1.
Fig 13: Decision Boundary for XOR
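You can check this yourself with the scikit-learn Perceptron from above (a quick illustrative sketch; the data simply encodes the XOR truth table):

    import numpy as np
    from sklearn.linear_model import Perceptron

    # XOR: the output is 1 only when the two inputs differ
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 1, 1, 0])

    model = Perceptron()
    model.fit(X, y)
    print(model.score(X, y))  # never reaches 1.0 -- no line separates the classes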
The inability to solve such a simple problem was a huge setback for the perceptron. After this limitation surfaced, development in neural networks stalled for many years, a period known as the AI Winter.
Multi-layer perceptron (MLP)
Since a single layer of perceptrons cannot solve non-linear functions, people started using multiple layers of computational units, usually interconnected in a feed-forward way: each neuron in one layer has directed connections to the neurons of the subsequent layer. In many applications the units of these networks apply a sigmoid function as the activation function. Such a network can solve XOR, as the sketch below shows.
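A minimal sketch using scikit-learn's MLPClassifier (the hidden layer size and random seed are illustrative choices; a different seed may be needed for convergence):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 1, 1, 0])

    # One hidden layer of sigmoid ("logistic") units is enough for XOR
    mlp = MLPClassifier(hidden_layer_sizes=(4,), activation='logistic',
                        solver='lbfgs', max_iter=1000, random_state=1)
    mlp.fit(X, y)
    print(mlp.predict(X))  # expected [0 1 1 0] -- the hidden layer makes XOR learnable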
Conclusion
The MLP brought a lot of hope back to the AI community. The backpropagation algorithm came into the picture and made it possible to learn the weights automatically, and people started using many different and better activation functions such as tanh and ReLU. In my upcoming blogs I will go into more detail on multi-layer neural networks and how they work.