Is Deep Learning a Brute Force Method?
Intelligence comes in degrees. An ANN is intelligent if it performs a cognitive function, no matter how simple or complicated. How intelligent is a cognitive function performed by a Deep Learning ANN?
Deep Learning ANNs have been created and trained to perform several cognitive functions well. They are used to recognize objects such as faces and people, and to translate from one language to another. They do not perform these tasks perfectly, but they perform them well enough to be quite useful.
So one might be tempted to think that the ANNs that perform these cognitive functions must have a high degree of intelligence. But what level of intelligence do they really have? To answer that, let’s look at how Deep Learning performs cognitive functions.
Deep Learning uses the backpropagation algorithm. The goal of any ANN that uses the backpropagation algorithm is to create a mapping of an object present in input data to a node that represents that object in the output layer. The purpose of nodes in the hidden layers is to guide values through the ANN in order to make that happen.
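To make the idea of a mapping concrete, here is a minimal sketch of a forward pass in plain Python. The layer sizes, the random weights and the helper names (dense, relu, softmax, forward, random_layer) are all assumptions chosen for illustration, not part of any particular library. The point is only that input values flow through a hidden layer and the output node with the highest score is taken as the answer.

```python
import math
import random

def dense(inputs, weights, biases):
    # Each output node sums its weighted inputs and adds its bias.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    # Subtract the maximum for numerical stability.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(inputs, layers):
    # Hidden layers pass values along; the last layer scores each output node.
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))

def random_layer(n_in, n_out, rng):
    # Hypothetical untrained layer: random weights, zero biases.
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
layers = [random_layer(4, 5, rng), random_layer(5, 3, rng)]  # 4 inputs, 5 hidden, 3 outputs
scores = forward([0.2, 0.9, 0.1, 0.5], layers)
predicted = max(range(len(scores)), key=scores.__getitem__)
print("output scores:", scores)
print("winning output node:", predicted)
```

With untrained random weights the winning node is arbitrary. Training is what makes it land on the node that represents the object.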
A Deep Learning ANN is trained on a dataset that contains a very large number of examples of an object in order to learn to recognize it. It must be trained on hundreds or thousands of examples before its performance in matching them to the correct outputs becomes satisfactory. What’s happening here? It is trained on so many examples of a particular type of object that it has matched almost every possible appearance of that object to an output. So it is able to recognize new objects of the same type after training because it has already seen almost every possible appearance of that object, and the network has been adjusted to match inputs to outputs for all of them.
How does it work? The ANN consists of nodes arranged in layers, with links connecting them from the input layer to the output layer. The purpose of the training function is to use the backpropagation algorithm to set the weights of the links and the biases of the nodes. Once they are set, when an object within the data is copied to the input layer, node values are guided from the input layer through the pathways of links in the hidden layers to the output node that matches that object.
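The training loop described above can be sketched in a few dozen lines of plain Python. This is a toy under stated assumptions, not how any production framework does it: a tiny network with one hidden layer, sigmoid activations, a squared-error loss, and the XOR function standing in for the "objects" to be learned. The names train_xor and predict, the learning rate and the layer size are all hypothetical choices.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(x, W1, b1, W2, b2):
    # Forward pass: inputs -> hidden layer -> single output node.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    out = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, out

def train_xor(epochs=5000, lr=0.5, n_hidden=3, seed=0):
    rng = random.Random(seed)
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
    W1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in data:
            hidden, out = predict(x, W1, b1, W2, b2)
            err = out - target
            total += 0.5 * err * err
            # Backward pass: push the error back through the network.
            delta_out = err * out * (1.0 - out)
            delta_hidden = [delta_out * W2[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            # Nudge every weight and bias a little against its gradient.
            for j in range(n_hidden):
                W2[j] -= lr * delta_out * hidden[j]
                b1[j] -= lr * delta_hidden[j]
                for i in range(2):
                    W1[j][i] -= lr * delta_hidden[j] * x[i]
            b2 -= lr * delta_out
        losses.append(total)
    return losses, (W1, b1, W2, b2)

losses, params = train_xor()
print("loss after first epoch:", losses[0])
print("loss after last epoch: ", losses[-1])
```

Nothing in the loop refers to what the inputs mean. Each pass simply measures the error at the output and adjusts the weights and biases slightly so that the error shrinks.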
Here is an analogy: Picture a labyrinth of millions of water pipes fitted with a large number of valves that control the flow of water through it.
In an ANN, the links are like the pipes. The weights on the links and the biases of the nodes are like the valves that adjust the amount of water that may pass through the pipes to the other valves. Provided there are enough layers and enough nodes per layer, there exists a certain perfect combination of settings for all of the weights and biases. That combination will route every combination of inputs that represents a particular object to the node in the output layer that represents it. The number of combinations required increases as more types of objects are to be learned and recognized, so the number of layers must also be increased to permit more combinations of weight and bias settings.
What this all boils down to is that training a backpropagation ANN is simply an iterative process that produces a routing from inputs to outputs. Although Deep Learning performs some useful cognitive functions, this mapping function is really only low-level intelligence. After all, the ANN doesn’t understand the objects it handles. Its objects don’t have meanings, and it can’t be said that the ANN actually knows anything.
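To get a feel for how many "valves" there are, here is a small sketch that counts the adjustable settings in a fully connected network: one weight per link and one bias per node outside the input layer. The function name count_parameters and the layer sizes are assumptions picked for illustration only.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of nodes per layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per link between the two layers
        total += n_out         # one bias per node in the receiving layer
    return total

# Hypothetical sizes: 784 inputs, two hidden layers, 10 output nodes.
small = count_parameters([784, 128, 64, 10])
# The same inputs and outputs with more and wider hidden layers.
deeper = count_parameters([784, 512, 512, 512, 10])
print("small network: ", small)   # 109386
print("deeper network:", deeper)
```

Even the small network has over a hundred thousand settings to tune, and adding layers pushes the count up quickly.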
Compare the basic approach used by Deep Learning to that used by Deep Blue, the chess-playing computer. Deep Blue had multiple microprocessors that enabled it to look ahead several moves, consider millions of positions, rank each one and then select the move that led to the position with the highest score. It was touted as intelligent because it performed a cognitive function better than humans, but it didn’t do any thinking. It just performed a computer algorithm that ran billions of calculations, and that’s all it could do.
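The look-ahead idea can be illustrated with a minimal minimax search. This is not Deep Blue's actual program; it is a sketch on a toy game chosen for brevity, in which two players take turns removing 1, 2 or 3 stones from a pile and whoever takes the last stone wins. The function names minimax and best_move and the depth limit are hypothetical.

```python
def minimax(stones, maximizing, depth):
    """Score a position from the maximizing player's point of view.

    Returns +1 for a win, -1 for a loss, and 0 if the depth limit is hit
    before the game ends.
    """
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1 if maximizing else 1
    if depth == 0:
        return 0
    scores = [minimax(stones - take, not maximizing, depth - 1)
              for take in (1, 2, 3) if take <= stones]
    return max(scores) if maximizing else min(scores)

def best_move(stones, depth=10):
    # Try every legal move, score the resulting position, keep the best.
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves, key=lambda take: minimax(stones - take, False, depth - 1))

print(best_move(5))  # 1: leaves 4 stones, a losing position for the opponent
print(best_move(7))  # 3: also leaves 4 stones
```

The search finds the winning move without any notion of strategy. It enumerates positions, scores them and picks the highest score.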
Does Deep Learning use a brute force approach? Well, it requires a Herculean effort to provide these mappings. Very large datasets are assembled and fed to the ANN. It runs an algorithm that iteratively makes billions of small adjustments to the weights and biases of millions of nodes. GPUs are used to reduce the amount of time it takes to train an ANN. There are no concepts, knowledge or understanding visible anywhere. So yes, it sure looks like brute force.