Artificial neural networks have been a critical driving force behind deep learning.
But what is an artificial neural network? And how can you start implementing them yourself using Python?
Fear not! All will become clear soon.
But before we get stuck into artificial neural networks, let’s first talk about the brain and how it has inspired this type of deep learning.
What is a neural network?
Now, this is going to be a very stripped back version of what I can remember from A Level Biology, so bear with me.
The brain uses a network of neurons to pass messages around the body. These neurons react to stimuli and help the body to understand what is going on around you.
This network, coupled with the brain itself as a processor, teaches you about the world.
The neural network allows you to learn.
Now that you understand a very (very) basic area of neuroscience, you can begin to see some of what an artificial neural network might do.
What is an Artificial Neural Network?
The simplest possible way of defining an artificial neural network is as a system of calculations and feedback loops designed to allow a computer to mimic some of the processes used by the human brain to learn.
But what does that mean in practical terms for you, the machine learning engineer?
Well, let me help out here.
In an artificial neural network, you are creating a network (well duh…) of different calculations that stimulate different ‘neurons’ within a hidden layer.
Ok ok, I’m getting a bit technical here. Let me break the structure down for you.
How is an Artificial Neural Network structured?
So there are three main parts of an artificial neural network:
- Input layer
- Hidden Layer (where the neurons live)
- Output layer
The input layer is formed of your input vector: the features that describe each of your input variables.
The output layer is the result, or output, of your model.
Then we have the exciting part, the hidden layer. This is where the neurons live.
Understanding how the hidden layer in an ANN works
The input layer sends a signal to the neuron: each input value, multiplied by the weight on its connection.
When the signal reaches the neuron, the neuron adds up the weighted inputs and passes the total through an activation function. In the simplest case (a step function), this generates a binary output.
The neuron will either be activated or not.
This activation depends upon the weights on the neuron's inputs.
If the neuron is activated, it will then pass a signal onto the next layer within your neural network. When an artificial neural network has more than one hidden layer, it counts as deep learning.
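To make that concrete, here is a minimal sketch of a single neuron in plain Python. It assumes a simple step activation (the "activated or not" case), and the input values, weights and bias are made up purely for illustration. Real networks usually use smoother activation functions.

```python
def step(z):
    """Step activation: the neuron fires (1) when z is positive, otherwise it stays off (0)."""
    return 1 if z > 0 else 0


def neuron_output(inputs, weights, bias):
    """Add up the weighted inputs, add the bias, and apply the activation function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(z)


# Illustrative values only (assumptions, not taken from a real data set)
inputs = [0.5, 0.2, 0.9]
weights = [0.4, -0.6, 0.3]
bias = -0.1

print(neuron_output(inputs, weights, bias))  # weighted sum is about 0.25, so this prints 1
```

Change the weights or the bias and the same inputs can leave the neuron switched off, which is exactly what training takes advantage of.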
Read more about the difference between machine learning and deep learning here.
After the signal has passed through all of the different hidden layers, you then get an output.
Activating the neural network
This output depends on which neurons within the network were activated.
Whether a neuron is activated depends on its activation function and on the weights assigned to the connections within the network.
I’m not going to go into any more depth on the way that weights are calculated or all of the different functions that go into the neural network now.
Once you have your output, you then need to understand whether or not the model made the correct prediction.
If it didn’t make the correct prediction, you need to find a way of feeding this back into your neural network so that it can learn from it.
Enter the cost function!
Calculating the cost of your mistakes
The cost function is a calculation that tells you how wrong the prediction was.
It does this by measuring the difference between the predicted output value from the model and the correct output.
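As a rough illustration, here is one common cost function, the mean squared error, written in plain Python. The choice of mean squared error is an assumption on my part (there are plenty of other cost functions), and the example values are made up.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between the correct and predicted outputs."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only
y_true = [1.0, 0.0, 1.0, 1.0]   # correct outputs
y_pred = [0.9, 0.2, 0.6, 1.0]   # what the model predicted

cost = mean_squared_error(y_true, y_pred)
print(cost)  # a small positive number; 0 would mean every prediction was perfect
```

The bigger the gap between prediction and correct output, the bigger the cost.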
When you know how wrong the prediction was, you can feed this information back into the neural network. When you feed this data back into the neural network, it can then adjust the weights on the neurons and try again.
One of the more common ways to minimize the cost function, and get a better prediction from your model, is gradient descent.
Understanding the Theory of Gradient Descent
The job of gradient descent is to minimize the cost function.
Let’s imagine the cost function as a graph.
The further away you are from the correct output value, the steeper the gradient at the point where you sit.
When the model gives a predicted value, the cost function is calculated.
If the predicted value is not the same as the correct value, you feed this back into the model, and the model will adjust its parameters and try again.
When the predicted output is updated, you also get a newly calculated cost function.
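Now you have two points on your cost function graph. You can calculate the gradient between these two points. (In practice, the gradient is calculated as the derivative of the cost function with respect to each weight, but the two-point picture gives the right intuition.)
This is a vital piece of information to know.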
Why gradient descent is so important
If you know the gradient between the two points, you know whether or not the adjustments you made to the model before the second prediction moved you towards a better answer or not.
Awesome!
The information from the gradient also gets fed into the network, and it adjusts the weights as appropriate.
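Here is a minimal sketch of gradient descent on the smallest model I could think of: a single weight w and the prediction w * x. Everything in it is an assumption for illustration (the data, the learning rate of 0.05, the 100 steps, the mean squared error cost). It also uses the derivative of the cost with respect to the weight, which is how the gradient is normally calculated, rather than literally comparing two points.

```python
def cost(w, xs, ys):
    """Mean squared error of the predictions w * x against the correct values."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w, xs, ys):
    """Derivative of the cost with respect to the weight w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Illustrative data where the correct weight is 2 (y = 2 * x)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0               # starting guess
learning_rate = 0.05  # how big a step to take each time (assumed value)
costs = [cost(w, xs, ys)]

for _ in range(100):
    w -= learning_rate * gradient(w, xs, ys)  # step downhill, against the gradient
    costs.append(cost(w, xs, ys))

print(w)          # ends up very close to 2.0
print(costs[-1])  # much smaller than costs[0]
```

Each pass moves the weight in the direction that reduces the cost, and the steps shrink as the gradient flattens out.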
You can read more about cost functions and gradient descent here.
Improving the predictions of your artificial neural network
Slowly but surely the model will learn which neurons are the most relevant and the model will get better at predicting the right output.
As the predictions improve, the output of the cost function reduces. As the cost function reduces, so does the gradient between predictions, until the model is able to predict accurately.
Once you have a model that correctly predicts the output, then you have officially trained your artificial neural network.
Now you are ready to test it on new data.
How do you set up an ANN?
Ok, so that was a lot of information and functions to take on.
I bet you think that creating a neural network for deep learning is going to be really complicated.
Well, you would be wrong!
I am happy to report it is actually pretty simple to implement an artificial neural network using Python.
What are the different steps involved in creating an artificial neural network?
There are 4 steps to create an artificial neural network using Keras in Python.
Step 1: Data preprocessing
The first and most crucial step is data preprocessing.
If you want a model that is able to make predictions accurately in the real world, you have to clean your data set!
Remove any outliers and any missing or inaccurate values.
To make sure the algorithm isn’t biased towards any one feature of your data set, you also need to apply feature scaling.
Finally, if you have any categorical input values, you need to encode them.
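To show what feature scaling and encoding actually do, here is a small hand-rolled sketch in plain Python. The helper names standardize and one_hot and the example columns are hypothetical, and standardization plus one-hot encoding are just one reasonable pair of choices. In a real project you would more likely reach for scikit-learn's StandardScaler and OneHotEncoder.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale a numeric column to mean 0 and standard deviation 1.

    Assumes the column is not constant (otherwise the standard deviation is 0).
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def one_hot(categories):
    """Turn a categorical column into one 0/1 column per distinct label."""
    labels = sorted(set(categories))
    return [[1 if c == label else 0 for label in labels] for c in categories]


# Illustrative columns only
ages = [22, 35, 58, 41]
countries = ["France", "Spain", "France", "Germany"]

scaled_ages = standardize(ages)
encoded_countries = one_hot(countries)

print(scaled_ages)
print(encoded_countries)  # columns are France, Germany, Spain (alphabetical order)
```

After this, every numeric feature sits on a comparable scale and every category is a number the network can work with.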
Step 2: Import the functions
Step 3: Create the layers
Step 4: Fitting the model to the data
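Steps 2 to 4 can be sketched together as follows. This is not the exact code from any particular course, just a minimal example assuming TensorFlow's bundled Keras is installed. The synthetic data, the two hidden layers of 6 neurons, the adam optimizer, the batch size of 10 and the 5 epochs are all assumptions you would tune for your own data set.

```python
# Step 2: import the functions
import numpy as np
from tensorflow import keras

# Synthetic stand-in for a preprocessed data set: 200 rows, 4 scaled features,
# and a binary label (assumed purely for illustration)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4)).astype("float32")
y_train = (X_train.sum(axis=1) > 0).astype("float32")

# Step 3: create the layers
model = keras.Sequential([
    keras.Input(shape=(4,)),                      # input layer: one value per feature
    keras.layers.Dense(6, activation="relu"),     # first hidden layer
    keras.layers.Dense(6, activation="relu"),     # second hidden layer
    keras.layers.Dense(1, activation="sigmoid"),  # output layer: probability of class 1
])

# Choose the optimizer (a gradient descent variant) and the cost function
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Step 4: fit the model to the data
history = model.fit(X_train, y_train, batch_size=10, epochs=5, verbose=0)

# Predicted probabilities, one per row
predictions = model.predict(X_train, verbose=0)
print(predictions[:5])
```

The history object keeps the cost after each epoch, so you can check that it is actually going down as the model trains.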
Evaluating the performance of your ANN
Once you have your neural network, you then need to fit it to your dataset and test it.
You can evaluate the performance of your neural network using techniques you may already be familiar with, such as the confusion matrix.
This post has details of how to implement the confusion matrix and what it does.
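As a quick taste, here is a minimal sketch of a confusion matrix for a binary classifier in plain Python. The labels, the predicted probabilities and the 0.5 threshold are made-up assumptions, and the helper name confusion_matrix_binary is hypothetical. The layout mirrors the one scikit-learn's confusion_matrix uses: rows are the correct classes, columns are the predicted classes.

```python
def confusion_matrix_binary(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for labels that are 0 or 1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1  # correctly predicted 1
        elif t == 0 and p == 0:
            tn += 1  # correctly predicted 0
        elif t == 0 and p == 1:
            fp += 1  # predicted 1, but the correct value was 0
        else:
            fn += 1  # predicted 0, but the correct value was 1
    return [[tn, fp], [fn, tp]]


# Illustrative values only
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
probabilities = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.7, 0.3]
y_pred = [1 if p > 0.5 else 0 for p in probabilities]  # assumed 0.5 threshold

matrix = confusion_matrix_binary(y_true, y_pred)
accuracy = (matrix[0][0] + matrix[1][1]) / len(y_true)

print(matrix)    # [[3, 1], [1, 3]]
print(accuracy)  # 0.75
```

The diagonal holds the correct predictions, so the more of your test set that lands there, the better your model is doing.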
There you have it! You just created your first ANN!
Congratulations!
Advertising Disclosure: I am an affiliate of Udemy and may be compensated when you click on the links posted on this website. I only advertise courses I have found valuable and think will help you too.
If you have found this content helpful, I recommend the course linked below which gave me a baseline understanding of the materials and python code shared here.
Deep Learning A-Z™: Hands-On Artificial Neural Networks