Attack of the Clones – Computers build smarter computers using Deep Reinforcement Learning
https://www.magicalquote.com/moviequotes/oh-my-goodness-shut-me-down/


In his famous book “The Wealth of Nations”, the Scottish economist and philosopher Adam Smith wrote: “Every man is rich or poor according to the degree in which he can afford to enjoy the necessaries, conveniences, and amusements of human life.”

He goes on to explain that people are deemed rich or poor by the quantity of labour they can command, that is, by how much wealth they have created and accumulated.

He makes a very solid point in this book, which was published in 1776, long before electric power or computers: “It was not by gold or by silver, but by labour, that all the wealth of the world was originally purchased” (that is, created).

This powerful idea forms the very basis of all the wealth we enjoy today. Humanity created its wealth by converting its labour into something of intrinsic value that could be exchanged.

Why is Adam Smith’s work so important, and what does deep reinforcement learning have to do with it? Good question.

Let’s first understand Smith’s argument.

He establishes that as society progresses and specialises, human beings (you and I) become highly interdependent for survival and for the fulfilment of our needs, wants and desires.

We create things that may have value, and then trade those goods or services using a medium of exchange, usually called “money”.

The central idea to remember is this: a person or group of people (corporations or nations) creates goods or services that someone else on this blue planet desires and is ready to pay money for.

Wealth, then, is a function of the quantity, quality and scarcity of the goods and services a nation or corporation can create, and of how well it can differentiate itself from others so that it can command a premium in the exchange.

Today most of the products and services created by any nation or big corporation are powered by a small wafer made largely of silicon: the microchip, or microprocessor.

These chips are at the heart of most of our tech gadgets today, from smartphones and cars to medical equipment and advanced weaponry. Almost everything we do today is powered by these microprocessors.

As a result, they have started gaining strategic importance in global geopolitics. Whichever nation or group of nations controls the flow of these microprocessors possibly controls the largest share of wealth, and of future wealth creation, on this planet.

Countries and corporations alike are therefore racing to build better, faster and cheaper-to-manufacture processing units. This race has led to the advent of GPUs and now TPUs, because the plain old CPU couldn’t cut it anymore.

But manufacturing these microprocessing units isn’t an easy task. There are significant barriers to entry. The design & development is prohibitively costly, the actual manufacturing process is super specialised and, unlike t-shirts, cannot be lifted and shifted (outsourced) to a low-cost geography at whim.

Companies had also slowed down on pushing the envelope with new and better designs, because the cost of a new chip design keeps rising with every design cycle while the gains it offers may not significantly move the compute needle.

Just when it started looking like humanity had hit a plateau in building better and faster microprocessors, Google recently disrupted chip design using deep reinforcement learning.

What is Deep Reinforcement Learning?

Deep reinforcement learning is a sub-field of machine learning that combines reinforcement learning with deep learning.

Reinforcement learning is a type of machine learning in which an algorithm (the agent) trains itself towards a specific goal within a defined arena (the environment), learning from its own actions and the rewards they produce.
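To make the idea of learning from actions and their results concrete, here is a minimal sketch of tabular Q-learning, one classic reinforcement learning algorithm. The "arena" is a made-up corridor of six cells in which the agent starts at the left end and is rewarded only for reaching the right end. The corridor, the function names and all parameter values are assumptions chosen for illustration; none of this comes from Google's chip design work.

```python
import random


def train_corridor_agent(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                         epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (goal = rightmost cell)."""
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = left, action 1 = right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore with probability epsilon (or on ties), otherwise act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # Move the estimate towards reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-goal cell."""
    return ["left" if left > right else "right" for left, right in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_corridor_agent()))
```

Nobody tells the agent that "right" is the correct move. It discovers this purely from the reward it receives at the end of the corridor, and after training the greedy policy should recommend "right" in every cell.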

When you combine this with deep learning, a deep neural network learns directly from large, unstructured inputs which actions are worth taking, without the need for manual feature engineering or labelling effort. In other words, you set the reinforcement learning agent free to learn on its own.
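As a rough illustration of what the "deep" part adds, the sketch below uses a tiny neural network to estimate the value of each action for a given state, and nudges that estimate towards a target after each experience. This is a generic, heavily simplified example: the class name `TinyQNetwork`, the layer sizes and the learning rate are all assumptions, and real systems use far larger networks built with dedicated libraries.

```python
import math
import random


class TinyQNetwork:
    """One-hidden-layer network mapping a state vector to one value per action."""

    def __init__(self, n_inputs, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                   for _ in range(n_inputs)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(n_actions)]
                   for _ in range(n_hidden)]
        self.b2 = [0.0] * n_actions

    def _hidden(self, state):
        return [
            math.tanh(sum(x * self.w1[i][j] for i, x in enumerate(state)) + self.b1[j])
            for j in range(len(self.b1))
        ]

    def q_values(self, state):
        hidden = self._hidden(state)
        return [
            sum(h * self.w2[j][a] for j, h in enumerate(hidden)) + self.b2[a]
            for a in range(len(self.b2))
        ]

    def update(self, state, action, target, lr=0.05):
        """One gradient step moving the value of (state, action) towards target."""
        hidden = self._hidden(state)
        error = target - self.q_values(state)[action]
        for j, h in enumerate(hidden):
            # Gradient through tanh, taken before the output weight changes.
            grad_h = self.w2[j][action] * (1.0 - h * h)
            self.w2[j][action] += lr * error * h
            self.b1[j] += lr * error * grad_h
            for i, x in enumerate(state):
                self.w1[i][j] += lr * error * grad_h * x
        self.b2[action] += lr * error
        return error


if __name__ == "__main__":
    net = TinyQNetwork(n_inputs=4, n_hidden=8, n_actions=2)
    state = [1.0, 0.0, 0.0, 0.0]
    for _ in range(200):
        net.update(state, action=1, target=1.0)
    print(net.q_values(state))
```

The network takes the place of a hand-built lookup table, so the same approach can in principle scale to inputs far too large to enumerate, such as the layout of a chip.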

Engineers at Google recently used deep reinforcement learning to help design an upcoming version of their TPU (Tensor Processing Unit).

They pretrained a graph neural net for 48 hours on a dataset of 10,000 chip designs to generate transferable representations of chips. They then used reinforcement learning to optimise new designs on top of those representations.

The input was the ‘state’ associated with a given chip design, and the reward was a score that improved as wire length and congestion were reduced.
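A reward of this kind can be sketched as a single number that gets worse as wires get longer and routing gets more crowded. The example below measures wire length with the half-perimeter of each net's bounding box, a common estimate in chip placement, and uses a deliberately crude congestion measure: the largest number of net bounding boxes overlapping any one grid cell. The function names, the toy placements and the `congestion_weight` value are assumptions for illustration, not the exact reward used by Google.

```python
from collections import Counter


def half_perimeter_wire_length(placement, nets):
    """Sum over nets of the half-perimeter of each net's bounding box."""
    total = 0
    for net in nets:
        xs = [placement[cell][0] for cell in net]
        ys = [placement[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def congestion_proxy(placement, nets):
    """Crude stand-in for congestion: most net bounding boxes over any grid cell."""
    usage = Counter()
    for net in nets:
        xs = [placement[cell][0] for cell in net]
        ys = [placement[cell][1] for cell in net]
        for x in range(min(xs), max(xs) + 1):
            for y in range(min(ys), max(ys) + 1):
                usage[(x, y)] += 1
    return max(usage.values(), default=0)


def placement_reward(placement, nets, congestion_weight=0.5):
    """Higher is better: shorter wires and less congestion raise the reward."""
    cost = (half_perimeter_wire_length(placement, nets)
            + congestion_weight * congestion_proxy(placement, nets))
    return -cost


if __name__ == "__main__":
    nets = [("a", "b"), ("c", "d"), ("a", "c")]
    spread = {"a": (0, 0), "b": (3, 0), "c": (0, 2), "d": (3, 2)}
    compact = {"a": (0, 0), "b": (1, 0), "c": (0, 1), "d": (1, 1)}
    print(placement_reward(spread, nets))   # -9.0
    print(placement_reward(compact, nets))  # -4.0
```

The compact layout scores higher because its wires are shorter, which is the kind of signal a reinforcement learning agent can use to learn which placements are worth repeating.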

After training and fine-tuning, this system was able to produce a TPU design (‘state’) in six hours. The design produced by the reinforcement learning agent was better than the current TPU design completed by Google’s human team: it either matched or outperformed the human-designed chip in areas like chip size, wire length and power consumption.

This AI-powered chip design could be a game changer for the tech industry.

It is a huge opportunity and a scary notion at the same time, as this almost sounds like the start of a perpetual, self-fulfilling prophecy where smart machines build ever smarter machines.

To tie this back to Adam Smith’s argument: if machines keep building other machines, which can in turn build better products and offer better services than any human possibly can, who would then own the wealth thus created?

That being said, it’s a very exciting opportunity, one that opens the door to a realm of possibilities technically, economically and geopolitically.

The original paper behind this idea can be found here. I am happy to discuss the paper in depth if you would like some virtual tech banter.

Note: Some concepts in this article have been purposefully kept simple for general readers. If you find a material mistake in an interpretation / explanation or statement please let me know and I’ll be happy to make amends.

 

Good article Balram. Makes it simple to understand this new and exciting development.
