A specific DevOps for AI
Focusing on the most common AI approach today, Deep Learning (which includes the very fashionable LLMs and GenAI): in the end, creating an AI machine based on Deep Learning comes down to finding a function, starting from a series of examples of the behaviour that we want to emulate.
This function is intended to provide correct outputs for inputs it has never seen.
In its most general form, the training examples are pairs of vectors: a finite list of real numbers as input versus a list of real numbers (not necessarily the same size as the input) as response:
f(x1, x2, …, xN) -> (y1, y2, …, yM)
We are going to play with an example in which N=2 and M=1. Let's say we want to build an AI machine capable of predicting the behaviour of a system for which:
Premise 1: we don’t know the underlying equation that governs it.
Premise 2: but we do have a finite set of input/output samples with which to train our system.
If we drop Premise 1, we could simply implement the underlying equation directly and avoid all these complications.
We need Premise 2 because of the finite nature of the training process: we cannot keep training our AI machine forever.
To make things concrete, let's say we have these samples of our underlying system:
Input = (2, 0.9), output = 0.9216
Input = (2, 0.8), output = 0.9216000000000001
Input = (2, 0.7), output = 0.5375999999999997
Input = (2, 0.6), output = 0.15360000000000013
Input = (3, 0.9), output = 0.28901376000000006
Input = (3, 0.8), output = 0.28901375999999973
Input = (3, 0.7), output = 0.99434496000000000
Input = (3, 0.6), output = 0.5200281600000003
One of the possible realities
Imagine for a moment that the system we want to emulate (which we don't know) follows the discrete logistic equation. This equation can be used, for example, to model certain population dynamics, and has this form:
x(n) = 4*x(n-1)*(1-x(n-1)), for n=1,2,3,...
x(0) = b
Where b and n are our input values.
It fits all the examples above!
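As a quick check, here is a minimal Python sketch that iterates the equation (the function name logistic is mine, just for illustration) and reproduces the sample outputs, up to floating-point rounding:

def logistic(n, b):
    # x(0) = b; x(k) = 4 * x(k-1) * (1 - x(k-1))
    x = b
    for _ in range(n):
        x = 4 * x * (1 - x)
    return x

print(logistic(2, 0.9))  # ≈ 0.9216
print(logistic(3, 0.7))  # ≈ 0.99434496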
Another possible reality
But imagine that the system follows this other equation:
x(0) = b
x(1) = 4*b - 4*pow(b,2)
x(2) = 4*(4*b - 4*pow(b,2)) * (1 - 4*b + 4*pow(b,2))
x(n) = 16*(4*b - 20*pow(b,2) + 32*pow(b,3) - 16*pow(b,4))
*(1 - 16*b + 80*pow(b,2) - 128*pow(b,3) + 64*pow(b,4)), for n>=3
The results for all the enumerated inputs are exactly the same. With these samples alone, we cannot distinguish one system from the other.
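Again as an illustrative sketch (the name other_reality is mine), coding the closed-form expressions directly shows that this second system returns the same values for every sample input:

def other_reality(n, b):
    # Agrees with the logistic samples for n = 0, 1, 2, 3...
    if n == 0:
        return b
    if n == 1:
        return 4*b - 4*pow(b, 2)
    if n == 2:
        return 4*(4*b - 4*pow(b, 2)) * (1 - 4*b + 4*pow(b, 2))
    # ...but stays frozen at the n=3 value for every n >= 3
    return (16*(4*b - 20*pow(b, 2) + 32*pow(b, 3) - 16*pow(b, 4))
            * (1 - 16*b + 80*pow(b, 2) - 128*pow(b, 3) + 64*pow(b, 4)))

for n, b in [(2, 0.9), (2, 0.8), (3, 0.7), (3, 0.6)]:
    print(n, b, other_reality(n, b))  # matches the sample outputs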
But we don't know which reality is real
Now put yourself in the situation where our AI machine is ready to be used, trained with the examples.
We release it into the real world.
And it works… until we use it for n=5. If the underlying system is the discrete logistic equation, but our AI machine has "thought" that the second, analytic function governs the system, it will fail.
If it's the other way around (the AI "thought" the system follows the discrete logistic equation, but in reality it follows the second, analytic case), it will also fail. In fact, we only find out that it fails when we use it in the real world, outside the training examples.
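A minimal sketch of that moment of failure, reusing the illustrative logistic function from before: both realities agree on all the training samples, but at n=5 they give different answers.

def logistic(n, b):
    x = b
    for _ in range(n):
        x = 4 * x * (1 - x)
    return x

b = 0.9
frozen = logistic(3, b)   # the second reality never moves past x(3)
print(logistic(5, b))     # ≈ 0.585: the logistic reality keeps evolving
print(frozen)             # 0.28901376: the second reality stays put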
I have intentionally left out the fact that the system we want to emulate is sensitive to initial conditions. That is, in the real system, for the same value of n, two very close values of b (for example 0.1 and 0.100000001) will give very different results for x(n). (For n=200, for example, starting with the first value b=0.1 gives 0.08736970074268527, while the second gives 0.7513580493753322.) This is very dangerous when we deal with computers, which use limited approximations to real numbers and which, by construction, round and truncate values in their calculations.
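The sensitivity is easy to reproduce with the same illustrative helper; the exact figures will depend on your machine's floating-point arithmetic:

def logistic(n, b):
    x = b
    for _ in range(n):
        x = 4 * x * (1 - x)
    return x

print(logistic(200, 0.1))          # ≈ 0.0874 in my run
print(logistic(200, 0.100000001))  # ≈ 0.7514 in my run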
Is AI still useful?
From my point of view, it is. But when managing the creation and use of any AI system, it must be assumed that sooner or later it will fail.
Beyond technical improvements, the point is to implement mechanisms that detect those moments of failure and minimise the damage they cause.
This would include, for example, the ability to immediately disconnect the AI in a self-driving car, to be excluded from the AI filter in a personnel selection process, to detect and measure racist bias in a facial recognition system, to control the risk assumed by a system that automatically executes orders in financial markets and, in general, to implement a model of explanation and rectification throughout the system's effective life.
An AI machine cannot be "put in production and abandoned"; it has to be managed, and its users must be aware of what can and should be expected from it. It would seem that an evolution of the term DevOps is required to include this "Ops" cycle, which goes far beyond the one traditionally associated with pure IT: it requires specific tools to measure the degree of failure of the AI system in execution, to collect data on the perception of success, and to allow for "withdrawal" when it no longer works. Sometimes part of those "Ops" should also rest with the users, because they are part of the development of the system, even more so in systems that learn from the use that is given to them.
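To make the idea tangible, here is a purely hypothetical sketch (none of these names come from any real tool) of one such "Ops" element: a wrapper that compares predictions against observed outcomes and withdraws the model when the measured failure rate crosses a threshold.

class MonitoredModel:
    # Hypothetical sketch: wrap any model, measure its failures, allow withdrawal.
    def __init__(self, model, tolerance=1e-3, max_failure_rate=0.05):
        self.model = model                    # any callable: inputs -> prediction
        self.tolerance = tolerance            # acceptable prediction error
        self.max_failure_rate = max_failure_rate
        self.calls = 0
        self.failures = 0
        self.withdrawn = False

    def predict(self, *inputs):
        if self.withdrawn:
            raise RuntimeError("model withdrawn: failure rate too high")
        self.calls += 1
        return self.model(*inputs)

    def report_outcome(self, prediction, observed):
        # Feedback loop: users and downstream systems report what really happened.
        if abs(prediction - observed) > self.tolerance:
            self.failures += 1
        if self.failures / max(self.calls, 1) > self.max_failure_rate:
            self.withdrawn = True             # the "disconnect" switch

Usage could be as simple as guarded = MonitoredModel(my_trained_model), calling guarded.predict(...) in production, and feeding real outcomes back through guarded.report_outcome(...).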
DevOps now includes the concepts of continuous integration, continuous deployment and continuous testing; maybe we should start thinking about something like "continuous failing"…
AI systems fail: no panic, no surprises. But we are here to act ethically and manage it, to ensure that the value is higher than the risk, to be transparent and to explain.
And that needs to be done actively.