Deep Learning

What is deep learning?


Deep learning is a type of machine learning and artificial intelligence (AI) that imitates the way humans gain certain types of knowledge. Deep learning is an important element of data science, which includes statistics and predictive modeling. It is extremely beneficial to data scientists who are tasked with collecting, analyzing and interpreting large amounts of data; deep learning makes this process faster and easier.

At its simplest, deep learning can be thought of as a way to automate predictive analytics. While traditional machine learning algorithms are largely linear and shallow, deep learning algorithms are stacked in a hierarchy of increasing complexity and abstraction, with each layer building on the output of the one below it.

To understand deep learning, imagine a toddler whose first word is dog. The toddler learns what a dog is -- and is not -- by pointing to objects and saying the word dog. The parent says, "Yes, that is a dog," or, "No, that is not a dog." As the toddler continues to point to objects, he becomes more aware of the features that all dogs possess. What the toddler does, without knowing it, is clarify a complex abstraction -- the concept of dog -- by building a hierarchy in which each level of abstraction is created with knowledge that was gained from the preceding layer of the hierarchy.

How deep learning works


Computer programs that use deep learning go through much the same process as the toddler learning to identify the dog. Each algorithm in the hierarchy applies a nonlinear transformation to its input and uses what it learns to create a statistical model as output. Iterations continue until the output has reached an acceptable level of accuracy. The number of processing layers through which data must pass is what inspired the label deep.
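
To make the layered structure concrete, here is a minimal sketch of a stack of nonlinear transformations in PyTorch; the framework choice, layer sizes and activation functions are illustrative assumptions, not details from the article:

```python
import torch
import torch.nn as nn

# Each Linear + ReLU pair is one layer of the hierarchy: a weighted
# transformation followed by a nonlinearity, whose output feeds the
# next, more abstract layer.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # low-level patterns (e.g., edges)
    nn.Linear(256, 64), nn.ReLU(),    # mid-level patterns (e.g., shapes)
    nn.Linear(64, 1), nn.Sigmoid(),   # final statistical model: P(dog)
)

x = torch.randn(1, 784)     # one flattened 28x28 image as raw pixel data
probability_dog = model(x)  # the output of the deepest layer
```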

In traditional machine learning, the learning process is supervised, and the programmer has to be extremely specific when telling the computer what types of things it should look for to decide whether an image contains a dog. This is a laborious process called feature extraction, and the computer's success rate depends entirely on the programmer's ability to accurately define a feature set for dog. The advantage of deep learning is that the program builds the feature set by itself, without supervision. This unsupervised feature learning is not only faster, but often more accurate.

Initially, the computer program might be provided with training data -- a set of images for which a human has labeled each image dog or not dog with metatags. The program uses the information it receives from the training data to create a feature set for dog and build a predictive model. In this case, the model the computer first creates might predict that anything in an image that has four legs and a tail should be labeled dog. Of course, the program is not aware of the labels four legs or tail. It will simply look for patterns of pixels in the digital data. With each iteration, the predictive model becomes more complex and more accurate.
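
A minimal sketch of this iterative training process, assuming PyTorch, the model sketched above, and a hypothetical `loader` that yields batches of labeled images (1.0 for dog, 0.0 for not dog); none of these names come from the article itself:

```python
import torch
import torch.nn as nn

loss_fn = nn.BCELoss()                                     # error estimate
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(10):              # each iteration refines the model
    for images, labels in loader:    # human-labeled training data
        predictions = model(images)  # patterns of pixels -> P(dog)
        loss = loss_fn(predictions, labels)
        optimizer.zero_grad()
        loss.backward()              # measure how to adjust each weight
        optimizer.step()             # update the learned feature set
```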

Unlike the toddler, who will take weeks or even months to understand the concept of dog, a computer program that uses deep learning algorithms can be shown a training set and sort through millions of images, accurately identifying which images have dogs in them within a few minutes.

Deep learning methods


Various methods can be used to create strong deep learning models. These techniques include learning rate decay, transfer learning, training from scratch and dropout.

Learning rate decay. The learning rate is a hyperparameter -- a setting chosen before the learning process begins that configures how the system operates -- that controls how much the model changes in response to the estimated error each time the model weights are updated. Learning rates that are too high may result in unstable training processes or the learning of a suboptimal set of weights. Learning rates that are too small may produce a lengthy training process that has the potential to get stuck.

The learning rate decay method -- also called learning rate annealing or adaptive learning rates -- is the process of adapting the learning rate to improve performance and reduce training time. The easiest and most common adaptation is to gradually reduce the learning rate over the course of training.
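
For example, a common reduce-over-time schedule can be sketched with PyTorch's built-in step scheduler, reusing the model from the earlier sketch; the initial rate, step size and decay factor are illustrative assumptions:

```python
import torch

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Halve the learning rate every 10 epochs: large early steps for fast
# progress, smaller later steps for stable convergence.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(30):
    train_one_epoch(model, optimizer)  # hypothetical per-epoch training step
    scheduler.step()                   # apply the decay schedule
```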

Transfer learning. This process involves fine-tuning a previously trained model; it requires an interface to the internals of a preexisting network. First, users feed the existing network new data containing previously unknown classifications. Once adjustments are made to the network, new tasks can be performed with more specific categorizing abilities. This method has the advantage of requiring much less data than training from scratch, reducing computation time to minutes or hours.
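
A sketch of this workflow in PyTorch, assuming a torchvision model pretrained on ImageNet and a new ten-class task (both illustrative choices, not from the article):

```python
import torch.nn as nn
from torchvision import models

# Start from a previously trained network.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the existing internals so their learned features are preserved.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer so the network can learn the new, more
# specific categories from comparatively little new data.
model.fc = nn.Linear(model.fc.in_features, 10)  # 10 new classes
```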

Training from scratch. This method requires a developer to collect a large labeled data set and configure a network architecture that can learn the features and model. This technique is especially useful for new applications, as well as applications with a large number of output categories. However, overall, it is a less common approach, as it requires inordinate amounts of data, causing training to take days or weeks.
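
Dropout. This method attempts to solve the problem of overfitting in networks with large numbers of parameters by randomly dropping units and their connections out of the neural network during training. Because a different subset of units is active on each pass, the network cannot come to rely on any single unit, which tends to improve generalization. A minimal sketch in PyTorch, where the 0.5 drop probability is a common illustrative default rather than a prescribed value:

```python
import torch.nn as nn

# During training, nn.Dropout zeroes each activation with probability 0.5;
# at evaluation time (model.eval()), dropout is disabled automatically.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(256, 1),
    nn.Sigmoid(),
)
```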
