Quick Introduction to Deep Learning
First things first. Let's define AI, machine learning and deep learning, since these terms are often used interchangeably. While there is no standard definition for these terms, here is the general essence.
Artificial intelligence is a field of study in which machines are made to demonstrate human-level intelligence.
Wikipedia says: Artificial intelligence (AI) is the ability of a computer program or a machine to think and learn. It is also a field of study which tries to make computers "smart".
Machine learning is a field of study in which computer programs use statistical models to predict an outcome based on underlying patterns in the input data.
Wikipedia says: Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to effectively perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence.
Deep learning is a subset of machine learning in which learning is based on data representations and is inspired by the neural structure of the human brain.
Wikipedia says: Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms.
So, in a nutshell, deep learning is a subset of machine learning, which is a subset of AI.
Now let's get to deep learning.
Most modern deep learning models are based on the Artificial Neural Network (ANN), a computational model that consists of a number of interconnected nodes, loosely resembling the human brain. Much like a human brain processes information, an ANN receives, processes and outputs information. An ANN has at least 3 layers (input, hidden and output), and we can add any number of hidden layers. Each layer can also have any number of nodes. Below is a simple neural network with an input layer (2 input nodes: X1, X2), a hidden layer (3 hidden nodes: H11, H12, H13) and an output layer with 1 output node (Y_Pred), plus one bias unit each for the input layer (B1) and the hidden layer (B2).
We first feed the input data to the hidden layer, where every connection carries a weight that starts out random. Each node in the first hidden layer then computes a weighted sum of its inputs and applies an activation function to calculate its output. The outputs of the first hidden layer act as inputs for the next layer. In our case, since we only have one hidden layer, we use these outputs to predict the final output (Y_Pred in the picture above). This process is called forward propagation, or feedforward. We then compare the predicted outcome to the actual outcome to see how well our model performed. But the idea behind an ANN is not just to predict an output; it is to predict one as close to the actual outcome as possible. For that, we follow the iterative process below.
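The forward pass described above can be sketched in plain Python for the 2-3-1 network. This is only an illustration: the sigmoid activation, the weight range, the sample input and all names (`forward`, `w_hidden`, `b_out`, and so on) are assumptions, since the post does not specify them.

```python
import math
import random


def sigmoid(z):
    # Assumed activation function; squashes any number into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward propagation through one hidden layer and one output node."""
    # Each hidden node: weighted sum of the inputs plus its bias weight,
    # passed through the activation function
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    # The output node does the same with the hidden outputs as its inputs
    y_pred = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, y_pred


random.seed(0)  # fixed seed so the random starting weights are repeatable

# 3 hidden nodes (H11, H12, H13), each with 2 weights (one per input X1, X2)
w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
b_hidden = [0.0, 0.0, 0.0]  # contribution of bias unit B1 to each hidden node
w_out = [random.uniform(-1, 1) for _ in range(3)]  # hidden -> Y_Pred weights
b_out = 0.0  # contribution of bias unit B2 to the output node

hidden, y_pred = forward([0.5, -1.2], w_hidden, b_hidden, w_out, b_out)
print("hidden outputs:", hidden)
print("Y_Pred:", y_pred)
```

With random weights the prediction is essentially a guess, which is why the weights need to be adjusted afterwards.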
As mentioned above, the whole idea is to minimize the difference between the predicted and actual outcomes. To do that, we first calculate the difference (the error). Next, we perform backpropagation, which uses the chain rule to work out how much each weight contributed to the error. Finally, we update the weights and rerun the whole process until the predictions are close enough to the actual outcomes.
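The loop of forward pass, error calculation, backpropagation and weight update might look like the sketch below for the same 2-3-1 network. Everything specific here is an assumption made for illustration and does not come from the post: the sigmoid activation, the squared-error measure, the learning rate, the per-example updates, the toy dataset (logical OR) and the helper name `train`.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, lr=0.5, seed=0):
    """Train a 2-3-1 network; returns per-epoch mean error and a predictor."""
    rng = random.Random(seed)
    w_h = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
    b_h = [0.0] * 3
    w_o = [rng.uniform(-1, 1) for _ in range(3)]
    b_o = 0.0

    def predict(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
             for ws, b in zip(w_h, b_h)]
        return h, sigmoid(sum(w * hi for w, hi in zip(w_o, h)) + b_o)

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            # 1. Forward propagation
            h, y_pred = predict(x)
            # 2. Error between predicted and actual outcome (squared error)
            total += 0.5 * (y_pred - y) ** 2
            # 3. Backpropagation: chain rule through the output sigmoid,
            #    then through each hidden node's sigmoid
            d_out = (y_pred - y) * y_pred * (1 - y_pred)
            d_h = [d_out * w_o[j] * h[j] * (1 - h[j]) for j in range(3)]
            # 4. Weight update: step against the gradient
            for j in range(3):
                w_o[j] -= lr * d_out * h[j]
                b_h[j] -= lr * d_h[j]
                for i in range(2):
                    w_h[j][i] -= lr * d_h[j] * x[i]
            b_o -= lr * d_out
        losses.append(total / len(data))
    return losses, lambda x: predict(x)[1]


# Toy dataset (assumed): logical OR of the two inputs
or_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
losses, model = train(or_data)
print("mean error, first epoch:", losses[0])
print("mean error, last epoch: ", losses[-1])
```

The error in the last epoch should be well below the error in the first, which is the "model gets better" part of the process in action.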
So why use deep learning models when we already have traditional machine learning models?
While there are many reasons to pick deep learning over traditional machine learning models, one main reason is the amount of data. Prof. Andrew Ng says that as the amount of data increases, the performance of neural networks keeps improving, while traditional models tend to level off, as shown below.
Types of Deep Learning Networks
Depending on the type of data and the outcome we are looking for, we can build many different types of neural networks. Here is a comprehensive list of networks and related information, written by Fjodor van Veen of the Asimov Institute.
Use Cases
There are many deep learning use cases. To name a few:
- Sentiment Analysis
- Speech Recognition
- Time Series Prediction
- Natural Language Processing
- Gesture Recognition
- Image Recognition
- Data Compression / Decompression
- Image De-noising
- Image Transformation
- Generating realistic image data
- Interactive image generation
Recap
- Deep learning is a subset of machine learning, which is a subset of AI
- Its structure is loosely modeled on the human nervous system, hence the name "Artificial Neural Network"
- We pass input data through highly interconnected hidden layers/nodes to predict the outcome
- Training an ANN is an iterative process
- The central idea of training an ANN is to minimize the error and maximize the accuracy
- We can apply deep learning models to a wide variety of data
- We have different types of networks to pick from, based on the input data and expected outcome
- There is a wide variety of use cases