The Future of Artificial Intelligence

(This is a summary of Merck's white paper outlining our AI research strategy.)

Deep learning has propelled artificial intelligence (AI) to the top of everyone's mind. In fact, in 2016 I wrote a high-level guide to deep learning to introduce it to non-technical people, and since then awareness of AI has only grown, driven by success stories such as DeepMind's AlphaGo defeating Lee Sedol at the game of Go.

The current wave of interest in AI can be traced back to 2012, when Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton published their ImageNet contest results and brought convolutional neural networks to the world's attention. Since 2012, neural network architectures have become increasingly complex and applications have exploded. However, despite the success stories, deep learning is still severely limited.

Enormous amounts of labeled data are required to train the networks. Where a human child might learn to recognize an animal species or a class of objects by seeing only a few examples, a deep neural network typically needs tens of thousands of images to achieve similar accuracy.

[Figure: a panda image plus a small amount of imperceptible noise is confidently misclassified. Source: https://arxiv.org/pdf/1412.6572.pdf]

On top of that, today's algorithms are clearly far from grasping the essence of an entity or a class in the way humans do. Many examples show how even the most modern neural networks fail spectacularly in cases that seem trivial to humans. In the panda example above, a small amount of seemingly random noise is enough to change the predicted class of the image (source). And while using pre-trained networks is becoming increasingly popular, neural networks are still mainly trained for a single task and do not easily transfer to other tasks or domains.
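
The mechanism behind such adversarial examples is strikingly simple. The cited paper generates them with the fast gradient sign method (FGSM): nudge every pixel a tiny step in the direction that increases the classifier's loss. A minimal sketch in PyTorch (assuming `model` is any differentiable classifier that returns logits for a batched image tensor; the epsilon value is illustrative):

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.007):
    """Return a perturbed copy of `image` (shape N,C,H,W) that the model
    is likely to misclassify, following Goodfellow et al. (2014)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Move every pixel by +/- epsilon, whichever sign increases the loss.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```

The unsettling part is how small epsilon can be: in the panda example the perturbation is invisible to a human observer, yet the model switches to a wrong class with high confidence.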

In the AI community, an agent that is good at only one task is called a Weak AI. General intelligence, in contrast, also called Strong AI or Artificial General Intelligence (AGI), can be defined as follows:

Intelligence measures an agent’s ability to achieve goals in a wide range of environments. ~ Legg and Hutter (2007)
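
For the mathematically inclined, Legg and Hutter also make this definition formal. Roughly sketched in their notation (see the 2007 paper for the precise construction):

```latex
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi
```

Here \Upsilon(\pi) is the universal intelligence of agent \pi, E is the set of computable environments, K(\mu) is the Kolmogorov complexity of environment \mu (so simple environments carry more weight than contrived ones), and V_\mu^\pi is the expected cumulative reward the agent achieves in environment \mu.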

Our central hypothesis, which drives our research strategy, is that current algorithms, such as deep learning, are fundamentally too limited to reach Strong AI. The fact that deep learning works so well today should not blind us to the limitations described above. We also believe that overcoming them will require new algorithms rather than incrementally engineering better deep learning algorithms and architectures.

As you might know, the conceptual foundations of deep learning are actually rather old; neural networks were already being intensively studied in the 1950s and 1960s, inspired by the brain's anatomy as it was understood at the time. Today, however, we require a new generation of algorithms, one inspired by modern neuroscience and, to some extent, by advances in our understanding of the brain which are yet to come.

A couple of implications follow from this central hypothesis of a new generation of brain-inspired algorithms. First, we should focus on what the brain can do easily but is still hard for modern deep learning. Second, we need to understand what neuroscience has learned since the 1950s and where the limits of our understanding lie. And third, we should focus on fundamental theoretical advances rather than incremental engineering.

If we look at what separates brains from modern deep learning, we immediately observe that the brain learns mostly unsupervised. Sometimes there is a reward signal (as in reinforcement learning) and sometimes there is some level of supervision (as in supervised learning), but reward and supervision are sparse. Most of our fundamental understanding of how the world works seems to be acquired in an unsupervised way. To facilitate that, our brain is able to extract invariant representations of entities from sensory input.
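
A simple computational analogue of such label-free learning is the autoencoder, a network that learns a compact internal representation purely from reconstructing its own input. The minimal sketch below (PyTorch; the layer sizes are arbitrary illustrative choices, not an architecture from our research) shows that no label appears anywhere in the training signal:

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, code_dim=32):
        super().__init__()
        # The encoder compresses the input into a small code ...
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                     nn.Linear(128, code_dim))
        # ... and the decoder tries to reconstruct the input from it.
        self.decoder = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)  # stand-in batch, e.g. flattened images

optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), x)  # the target is the input itself
loss.backward()
optimizer.step()
```

Whether simple reconstruction objectives like this can yield truly invariant representations remains an open question.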

The brain seems to learn, store, and perceive representations of entities (e.g. songs, animals, tastes) in a way that is invariant to most transformations. You can rotate the coffee cup on the table and it will still be a coffee cup; you can play a song in a different key and it is still the same song; and you can eat a steak well done, medium, or rare and it will still taste like a steak (even though you might prefer one over the other).

Invariant Representation

These invariant representations seem to be the building blocks of all higher cognitive functions; common sense largely rests on the ability to extrapolate ideas and concepts in a flexible way. Abstract thinking might also be based largely on invariant representations. Even when thinking about something as abstract as a mathematical formula, we tend to have visual or auditory impressions in our minds, be it what the formula looks like when written in a textbook, the sound of pronouncing it, or some visualization of a concept that the formula describes.

While invariant representations seem to come naturally to brains, they are not the default in deep learning. In fact, even modern convolutional neural networks, which should at least be invariant to translations, sometimes lose this property; shifting an image by as little as one pixel can change the output, as the small demonstration below shows. Hence, we will focus our research efforts on the challenge of extracting invariant representations in an unsupervised way.
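
This fragility is easy to demonstrate. The toy sketch below (an untrained network with random weights and a random input, purely illustrative) shifts an image by a single pixel and measures how much the output changes; the strided pooling layer aliases the signal, so the two outputs generally differ:

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),  # stride-2 downsampling is what breaks exact shift invariance
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, 10),
)

x = torch.rand(1, 1, 32, 32)
x_shifted = torch.roll(x, shifts=1, dims=-1)  # circular one-pixel horizontal shift

with torch.no_grad():
    diff = (net(x) - net(x_shifted)).abs().max().item()
print(f"max output change after a 1-pixel shift: {diff:.4f}")
```

A truly shift-invariant system would report a difference of zero here.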

Our envisioned path forward is an interdisciplinary research effort at the intersection of mathematics, computer science, and neuroscience, enabling us to tackle this challenge from three different angles:

  1. A mathematical theory of invariant representations and how to learn them in an unsupervised way
  2. Prototypical implementations of unsupervised representation learning algorithms
  3. Inspiration and validation of our algorithms by comparison to biological brains

While this is a monumental task, we believe it is worth attempting. In 1978, Mountcastle postulated that there is a single cortical algorithm flexible enough to handle all tasks in the cortex. Under this assumption, we believe it is possible to invent algorithms with similar properties, providing us not only with new tools but perhaps also with a new understanding of how the brain works.

If you want to learn more about the AI research team at Merck, visit us at https://www.merck.ai/ (from the US or Canada, visit https://www.emdgroup.com/en/research/ai-research.html). If you want to learn even more about our research, you can watch the talk below by our Global Head of Data Science, Helmut Linde, or check out the white paper.

We obviously cannot do all of this alone and are open to your help, be it as a team member, through a research collaboration, or by providing general feedback on the white paper's ideas in the comments below.
