AI or a tale of Artificial Knowledge
Every connected person on the planet has heard of Artificial Intelligence (AI) and seen it in action. It drives Tesla cars, gives Cortana forty languages to translate to and from, recognizes faces on iPhones, finds open pharmacies for Google Home owners, and provides recommendations to Amazon shoppers. And this is just the tip of the iceberg visible on the consumer market. In corporations too, data scientists are using AI to create innovative business applications, automate decision-making, and speed up inefficient processes. This revolution is profoundly reshaping the way we live and work.
Since all those tasks were previously incumbent on humans, people readily ascribe human abilities to computers, and widely accept the phrase ‘Artificial Intelligence’ as a valid description, hinting at the idea that there is a brain in the machine. But this could not be further from the truth. When asked about AI in a recent interview [1], Cédric Villani [2] admitted that "a better phrase should be found". This is because AI is not intelligent in the way humans are, nor do we have any certainty that any of the existing endeavours in AI, like deep learning, Artificial General Intelligence or Quantum Computing, will lead to an intelligence capable of matching or surpassing the human brain anytime soon.
The phrase ‘Artificial Knowledge’, inspired by Peter Drucker’s ‘knowledge worker’ [3], is more in line with what AI is and does today, and is less prone to misinterpretation and to the popular misbelief that machines are on the brink of world domination. We are witnessing a definite shift towards the digital world, but, for better or for worse, man is fully in control.
A large part of the AI that tackles human-like tasks today is based on ‘machine learning’ and its two dominant techniques: ‘supervised learning’ and ‘unsupervised learning’. ‘Supervised learning’ learns from past examples to predict outcomes. ‘Unsupervised learning’ looks for similarities or patterns in datasets, to identify categories, anomalies, or outliers, for example.
To illustrate ‘supervised learning’, let’s suppose we want to create a website that provides a second-hand car pricing engine. People wanting to sell their car enter the make, model, year and mileage of their car on the website, and receive a price tag in return. To implement this service the ‘supervised learning’ way, we first collect as much historical data on past car sales as we can get our hands on. We need to make sure the examples we collect contain the right type of information. In our case, we need the make, model, year and mileage (called the features), as well as the price the car sold for (called the label).
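To make this concrete, here is a minimal sketch of what such a historical dataset could look like, using Python and pandas; the makes, models, mileages and prices are invented purely for illustration:

```python
import pandas as pd

# A tiny, illustrative sample of historical transactions.
# Each row is a past sale: the features (make, model, year, mileage)
# plus the label (the price the car actually sold for).
sales = pd.DataFrame({
    "make":    ["BMW", "Ford", "BMW", "Renault"],
    "model":   ["320d", "Fiesta", "X5", "Clio"],
    "year":    [2014, 2011, 2016, 2009],
    "mileage": [85000, 120000, 40000, 150000],  # kilometres
    "price":   [18500, 4200, 39000, 2800],      # the label, in euros
})

features = sales[["make", "model", "year", "mileage"]]
label = sales["price"]
```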
We then feed this data into an algorithm, which uses mathematics to model how the price behaves relative to the features (make, model…). For example, a luxury make will push the price upward, while an older model year will pull it downward. Once the mathematical magic has taken place, we are left with an evaluation function we can use to predict the price of our car, based on its make, model, year, and mileage.
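Continuing the toy dataset above, here is a minimal sketch of this step with scikit-learn; linear regression, and the one-hot encoding of the categorical features, are illustrative choices among many possible ones:

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Turn the categorical features (make, model) into numbers, then fit
# a simple linear model: price ~ f(make, model, year, mileage).
model = make_pipeline(
    make_column_transformer(
        (OneHotEncoder(handle_unknown="ignore"), ["make", "model"]),
        remainder="passthrough",  # year and mileage pass through as-is
    ),
    LinearRegression(),
)
model.fit(features, label)

# The fitted pipeline is our evaluation function: given the features
# of the car we want to sell, it predicts a price tag.
my_car = pd.DataFrame(
    [{"make": "BMW", "model": "320d", "year": 2015, "mileage": 60000}]
)
print(model.predict(my_car))
```

Note the `handle_unknown="ignore"` option: it keeps the evaluation function from crashing when a seller enters a make or model that never appeared in the historical data.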
Those interested in discovering the mathematical magic that goes on inside the algorithms that build the models can follow the most excellent “mother of all MOOCs” by Andrew Ng on Coursera [4]. As you will discover in the MOOC, beyond the powerful algorithms and mathematical tools required to build the prediction function, a lot of work needs to go into engineering the historical data properly. As the adage “garbage in, garbage out” indicates, it is essential to prepare the data so that it is as exhaustive, representative, balanced and accurate as possible. Quite a lot of work ahead for the data scientists!
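As a flavour of what this preparation involves, here are a few typical cleaning steps applied to the toy dataset above; the thresholds and rules are invented for illustration:

```python
# Duplicated listings would bias the model towards the repeated cars.
clean = sales.drop_duplicates()

# Rows missing a feature or the label cannot be learnt from.
clean = clean.dropna(subset=["make", "model", "year", "mileage", "price"])

# Remove obviously wrong records (typos, test entries, scams).
clean = clean[(clean["price"] > 100) & (clean["mileage"] < 1_000_000)]

# Normalize spelling so 'BMW', 'bmw' and ' Bmw ' count as one category.
clean["make"] = clean["make"].str.strip().str.upper()
```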
‘Unsupervised learning’ works differently. It looks for things that are similar or different in datasets, trying to identify patterns that occur in the data. It is often used in anomaly or threat detection, and in categorization. For example, this type of learning can look at the time it takes users to log into a system, and flag as suspicious the odd users who take longer, or less time, than the others to go through the login process. In other use cases, a technique called ‘clustering’ is used to find categories (or clusters) in data samples.
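Here is a minimal sketch of both ideas, assuming invented login durations, a crude statistical rule for the anomaly part, and scikit-learn’s KMeans for the clustering part:

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative login durations, in seconds, for eight users.
durations = np.array([3.1, 2.8, 3.5, 2.9, 3.2, 14.7, 3.0, 0.2]).reshape(-1, 1)

# Anomaly detection: flag durations more than two standard deviations
# from the mean -- a crude but classic rule.
z_scores = (durations - durations.mean()) / durations.std()
print(durations[np.abs(z_scores) > 2])  # -> [14.7], the suspicious login

# Clustering: let KMeans discover two groups of logins on its own,
# without ever being told what the groups mean.
print(KMeans(n_clusters=2, n_init=10).fit_predict(durations))
```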
All those techniques, which are also the basis for ‘deep learning’ with neural networks, have a common objective: enable machines to learn, in other words to acquire knowledge, which is not intelligence per se. Learning is definitely an attribute of intelligent beings, but it is not sufficient to fully define them. Intelligence implies the ability to adapt, to invent new strategies when facing unknown situations or new circumstances. Today, algorithms don't go much beyond what they have learnt, although recent techniques like ‘transfer learning’ have made progress in handling unseen data domains related to the ones used for learning.
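To give an idea of what ‘transfer learning’ looks like in practice, here is a sketch using TensorFlow/Keras: a network pre-trained on one task is reused as a starting point for a related one. The choice of MobileNetV2, the image sizes and the two-class head are arbitrary assumptions made for the example:

```python
import tensorflow as tf

# Reuse a network pre-trained on ImageNet as a generic feature
# extractor for a new, related image-classification task.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet"
)
base.trainable = False  # keep the knowledge already learnt

# Only the small new 'head' gets trained on the new domain.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # e.g. two new classes
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```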
Calling AI knowledgeable rather than intelligent does not take away any of its incredible value. Thanks to the Cloud and machine learning, the old encyclopaedists’ dream of knowing everything there is to know is now within arm’s reach. Last January, algorithms surpassed humans at answering questions about a text for the first time [5]. Several research labs, including Microsoft and Google, have made major strides on the natural language understanding challenge, and can now process natural language better than humans on such tasks!
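Today this capability is a library call away. Here is a minimal sketch using the Hugging Face transformers library, which post-dates that milestone but illustrates it well; the model choice is left to the library’s default, and the question and context are invented:

```python
from transformers import pipeline

# Question answering over a text, in the spirit of the benchmark
# mentioned above.
qa = pipeline("question-answering")
result = qa(
    question="Who inspired the phrase 'Artificial Knowledge'?",
    context=(
        "The phrase 'Artificial Knowledge' is inspired by "
        "Peter Drucker's notion of the knowledge worker."
    ),
)
print(result["answer"])  # e.g. "Peter Drucker"
```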
Of course, the enormous amount of data that exists in the digital world is not just made up of text. Images, videos, sounds, drawings, diagrams, graphics, schemas and tables are other forms of data. There is also the question of time: data can be historical, real-time, or even future, now that we can make predictions, as we saw earlier. Making sense of all those data sources and extracting useful information from them is called knowledge extraction. It aims at making machines capable of autonomously understanding the semantics of data, so they can learn how to use it to answer natural language questions.
A question like: where can I buy the bike that appeared in the picture on the packet of biscuits I bought in my local store last month? This is a trivial question for a bike specialist who has seen the packet of biscuits I bought last month. But what is the chance of that? So, if the machine can understand that it needs to somehow find the packet sleeve, look at the picture on it, identify the bike and interrogate the maker to find a supplier, it will answer the question. Of course, it needs to do this alone, because we can’t guess what the question will be ahead of time and teach the machine that specific sequence of actions.
Machine learning is making it happen as you read. Can you think of the first question you will ask the machine? Beyond the first question, can you imagine the (endless) possibilities in your domain of expertise or your favourite hobby?
References
[1] https://www.youtube.com/watch?v=SKLTBrBT4js
[2] https://en.wikipedia.org/wiki/C%C3%A9dric_Villani
[3] https://en.wikipedia.org/wiki/Peter_Drucker
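[4] https://www.coursera.org/learn/machine-learning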