The Complete Beginner's Guide to Deep Learning
Conquering neural networks for newbies, novices, and neophytes
This article first appeared in Towards Data Science
We live in a world where, for better and for worse, we are constantly surrounded by deep learning algorithms. From social network filtering to driverless cars to movie recommendations, and from financial fraud detection to drug discovery to medical image processing (…is that bump cancer?), the field of deep learning influences our lives and our decisions every single day.
In fact, you’re probably reading this article right now because a deep learning algorithm thinks you should see it.
Photo by tookapic on Pixabay
If you’re looking for the basics of deep learning, artificial neural networks, convolutional neural networks, (neural networks in general…), backpropagation, gradient descent, and more, you’ve come to the right place. In this series of articles, I’m going to explain these concepts as simply and comprehensibly as I can.
There will also be cats.
Learning is so much easier when it’s sprinkled with a little silly.
Photo by skeeze on Pixabay
If you get into deep learning, there’s an incredible amount of really in-depth information out there. I’ll make sure to provide additional resources along the way for anyone who wants to swim a little deeper into these waters. (For example, you might want to check out Efficient BackProp by Yann LeCun, et al., which is written by one of the most important figures in deep learning. This paper looks specifically at backpropagation, but also discusses some of the most important topics in deep learning, like gradient descent, stochastic learning, batch learning, and so on. It’s all here if you want to take a look!)
For now, let’s jump right in!
Photo by Laurine Bailly on Unsplash
What is deep learning?
Really, it’s just learning from examples. That’s pretty much the deal.
At a very basic level, deep learning is a machine learning technique that teaches a computer to filter inputs (observations in the form of images, text, or sound) through layers in order to learn how to predict and classify information.
Deep learning is inspired by the way that the human brain filters information!
Photo by Christopher Campbell on Unsplash
Essentially, deep learning is a part of the machine learning family that's based on learning data representations (rather than task-specific algorithms). Deep learning is actually closely related to a class of theories about brain development proposed by cognitive neuroscientists in the early '90s. Just like in the brain (or, more accurately, in the theories and models that researchers put together in the '90s regarding the development of the human neocortex), neural networks use a hierarchy of layered filters in which each layer learns from the previous layer and then passes its output to the next.
Deep learning attempts to mimic the activity in layers of neurons in the neocortex.
In the human brain, there are roughly 100 billion neurons, and each neuron can be connected to as many as 100,000 of its neighbors. Essentially, that's what we're trying to create, but in a way and at a level that works for machines.
Image by GDJ on Pixabay
The purpose of deep learning is to mimic how the human brain works in order to create some real magic.
What does this mean in terms of neurons, axons, dendrites, and so on? Well, the neuron has a body, dendrites, and an axon. The signal from one neuron travels down the axon and is transferred to the dendrites of the next neuron. That connection (not an actual physical connection, but a connection nonetheless) where the signal is passed is called a synapse.
Image by mohamed_hassan on Pixabay
Neurons by themselves are kind of useless, but when you have lots of them, they work together to create some serious magic. That’s the idea behind a deep learning algorithm! You get input from observation, you put your input into one layer that creates an output which in turn becomes the input for the next layer, and so on. This happens over and over until your final output signal!
So the neuron (or node) gets a signal or signals (input values), which pass through the neuron, and that delivers the output signal. Think of the input layer as your senses: the things you see, smell, feel, etc. These are independent variables for one single observation. This information is broken down into numbers and the bits of binary data that a computer can use. (You will need to either standardize or normalize these variables so that they’re within the same range.)
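As a minimal sketch of that last step (the observations and variable names here are made up for illustration), standardizing or normalizing input variables with NumPy might look like this:

```python
import numpy as np

# Three hypothetical observations of three input variables:
# age, salary, years of education
X = np.array([
    [25.0, 40_000.0, 12.0],
    [47.0, 90_000.0, 16.0],
    [33.0, 55_000.0, 14.0],
])

# Standardization: rescale each column to mean 0 and standard deviation 1
X_standardized = (X - X.mean(axis=0)) / X.std(axis=0)

# Normalization: rescale each column into the [0, 1] range
X_normalized = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
```

Either way, the point is that a huge number like a salary no longer drowns out a small number like an age when the network adds everything up.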
What can our output value be? It can be continuous (for example, a price), binary (yes or no), or categorical (cat, dog, moose, hedgehog, sloth, etc.). If it's categorical, remember that your output won't be just one variable, but several output variables — one per category.
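One common way to turn a categorical output into several output variables is one-hot encoding: each category gets its own variable, which is 1 for the observation's category and 0 everywhere else. A quick sketch (the animal labels are just for illustration):

```python
import numpy as np

categories = ["cat", "dog", "moose", "hedgehog", "sloth"]
observations = ["dog", "cat", "sloth"]

# One row per observation, one column (output variable) per category
one_hot = np.array([[1.0 if cat == obs else 0.0 for cat in categories]
                    for obs in observations])
# "dog" becomes [0, 1, 0, 0, 0], "cat" becomes [1, 0, 0, 0, 0], and so on
```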
Photo by Hanna Listek on Unsplash
Also, keep in mind that your output value will always be related to the same single observation from the input values. If your input values were, for example, an observation of the age, salary, and vehicle of one person, your output value would also relate to the same observation of the same person. This sounds pretty basic, but it’s important to keep in mind.
What about synapses? Each of the synapses gets assigned weights, which are crucial to Artificial Neural Networks (ANNs). Weights are how ANNs learn. By adjusting the weights, the ANN decides to what extent signals get passed along. When you’re training your network, you’re deciding how the weights are adjusted.
What happens inside the neuron? First, all of the input values it receives are added up (the weighted sum is calculated). Next, the neuron applies an activation function to that sum. Based on the result, the neuron decides whether or not to pass a signal along.
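Those two steps — weighted sum, then activation — fit in a few lines of code. Here's a single artificial neuron, using the sigmoid activation function as one common choice (the input values, weights, and bias below are made-up numbers):

```python
import numpy as np

def sigmoid(z):
    # Squashes the weighted sum into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    # Step 1: the weighted sum of the incoming signals
    z = np.dot(inputs, weights) + bias
    # Step 2: the activation function decides how strong a signal to pass on
    return sigmoid(z)

output = neuron(np.array([0.5, -1.2, 3.0]),   # input signals
                np.array([0.8, 0.1, -0.4]),   # one weight per synapse
                bias=0.2)
```

Training the network means nudging those weights (and the bias) until the outputs match what we want — that's the adjustment described above.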
This is repeated thousands or even hundreds of thousands of times!
Photo by Geralt on Pixabay
We create an artificial neural net where we have nodes for input values (what we already know) and output values (our predictions), and in between those, we have a hidden layer (or layers) where the information travels before it hits the output. This is analogous to the way that the information you see through your eyes is filtered into your understanding, rather than being shot straight into your brain.
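Stacking the neurons from above into layers gives us that input → hidden → output flow. Here's a tiny sketch of a forward pass through one hidden layer (the layer sizes and random weights are arbitrary — a real network would learn them during training):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 3 input values -> 4 hidden neurons -> 1 output value
W1 = rng.normal(size=(3, 4))   # weights: input layer -> hidden layer
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))   # weights: hidden layer -> output layer
b2 = np.zeros(1)

def forward(x):
    hidden = sigmoid(x @ W1 + b1)      # information is filtered by the hidden layer...
    return sigmoid(hidden @ W2 + b2)   # ...before it reaches the output

prediction = forward(np.array([0.2, -0.5, 1.0]))
```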
Image by Geralt on Pixabay
Deep learning models can be supervised, semi-supervised, or unsupervised.