From the course: Deep Learning with Python: Hands-On Introduction to Deep Learning Models

What are artificial neural networks?

- [Instructor] Let's start our discussion of deep learning by talking about what an artificial neural network is at a high level. An artificial neural network is a computational model loosely inspired by the structure and functioning of biological neurons in the brain. The intricate web of interconnected neurons in the human brain forms the basis for the complex information processing, communication, and cognitive functions that underlie human intelligence and behavior. A biological neuron receives incoming signals in the form of electrochemical impulses from other neurons via structures called dendrites. The dendrites serve as the input channels of the neuron, where these signals are integrated and processed. If the cumulative input exceeds a certain threshold, the neuron fires an electrical impulse, also known as an action potential. This electrical impulse travels down the axon, which is the output channel of the neuron. The axon transmits the signal to other neurons through synapses at the ends of the axon called axon terminals, where the signal can propagate further through the neural network.

In an artificial neural network, neurons, or nodes, are simplified computational units that use mathematical functions and numerical values to simulate the flow of information. These nodes are typically organized into layers that process data in a structured manner. The input layer consists of input nodes that receive the raw data; each input node represents one feature of the data. For example, in an image classification task, each pixel of an image could be passed to an input node. The output layer contains neurons that produce the network's final predictions. In that same image classification problem, the values of the output neurons might correspond to the label of an image, such as whether the image is that of a cat or a dog. Between the input and output layers are one or more hidden layers.
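The layered structure described above can be sketched in plain Python, without any deep learning library. This is a minimal illustration, and the layer sizes (4 input features, 3 hidden nodes, 2 output nodes) are arbitrary choices for the example, not values from the course:

```python
import random

random.seed(0)  # make the illustrative weights reproducible

def make_layer(n_inputs, n_nodes):
    # Each node holds one weight per incoming connection plus a bias term.
    return [{"weights": [random.uniform(-1, 1) for _ in range(n_inputs)],
             "bias": 0.0}
            for _ in range(n_nodes)]

input_size = 4                       # e.g. four pixel values from an image
hidden_layer = make_layer(input_size, 3)  # one hidden layer of 3 nodes
output_layer = make_layer(3, 2)      # e.g. scores for "cat" vs. "dog"

print(len(hidden_layer), len(output_layer))  # 3 2
```

Each dictionary here stands in for one node; a real framework would store the weights of a whole layer as a matrix instead, but the layout is the same: every node in a layer is connected to every node in the preceding layer.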
Hidden layers are the computational heart of a neural network; they enable the network to learn hierarchical representations of the input data. The first hidden layer might detect simple patterns in the data, such as edges in an image or basic grammatical structures in text. Subsequent hidden layers build upon these earlier detections to recognize more complex features, for example, combining edges to detect shapes or combining words to understand phrases. Each hidden layer captures a higher level of abstraction, moving from simple to more complex representations.

Each node in an artificial neural network receives input in the form of numerical values, either directly from the raw data, as is the case for nodes in the input layer, or from the nodes in preceding layers, as is the case for nodes in hidden and output layers. As a node receives input, each input is multiplied by a corresponding weight value. The weight represents the learned importance of that input in determining the neuron's output. During training, the network adjusts these weights to improve its performance, learning to prioritize inputs that are more relevant to the task at hand. The node then calculates the weighted sum of its inputs, often adding what is known as a bias term to shift the weighted sum up or down. The weighted sum is then passed through an activation function, which determines what value to output, or pass on, to nodes in subsequent layers. Activation functions introduce non-linearity into the network, enabling it to model complex patterns in the data. We discuss activation functions in more detail in the course video titled Activation Functions in Neural Networks.

By mimicking the hierarchical processing of information in the human brain, artificial neural networks can learn from vast amounts of data to perform tasks such as image recognition, speech processing, natural language understanding, and predictive analytics with remarkable accuracy.
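The per-node computation described above, multiply each input by its weight, sum, add a bias, then apply an activation function, can be sketched in a few lines of Python. The sigmoid activation and the specific input, weight, and bias values below are illustrative assumptions for the example, not values from the course:

```python
import math

def sigmoid(z):
    # A common activation function: squashes any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def node_output(inputs, weights, bias):
    # Weighted sum of inputs, shifted by the bias term...
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ...then passed through the activation function.
    return sigmoid(weighted_sum)

inputs = [0.5, -1.2, 3.0]   # values arriving from the previous layer
weights = [0.8, 0.1, -0.4]  # learned importance of each input
bias = 0.2                  # shifts the weighted sum up or down

out = node_output(inputs, weights, bias)
print(round(out, 4))
```

During training, it is the values in `weights` and `bias` that the network adjusts; the structure of the computation itself stays fixed.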