"AI or not AI", that is a question.
Michael Vischmidt

"AI or not AI", that is a question.

I have always been curious whether AI is real intelligence 🤖. Do I use AI? Of course. In exactly the same way I use a calculator or a computer. Do I consider it intelligence? No. And if we avoid asking why—because that inevitably leads to the question of consciousness, where no clear answers exist—it is more useful to simply understand how the algorithm communicates with us. Whether this should be called “intelligence” or not is a conclusion everyone can make for themselves.

If we ignore philosophy and look strictly at mechanics, everything becomes simpler and, at the same time, colder 🧊. No miracle. No fear. No “mind”. No will. There is an algorithm, data, and statistics. What follows is exactly that. No stories, no poetry, no biographies. Just the logic on which modern AI actually stands.

Modern AI, as people use it today, is not a system of understanding. It is a system of probabilistic continuation. It does not know what meaning, ideas, or intention are. It knows only one thing: given this context, this element most often comes next. That is all. This point must be pinned down from the start, otherwise everything that follows becomes distorted.

At its core lies the task of predicting the next element in a sequence. Not an answer. Not a thought. Not a solution. The next element. This element is called a token. A token is not necessarily a word. It can be part of a word, punctuation, a number, or a special symbol. Text is first split into tokens, and the model operates not on letters or sentences, but on this stream of tokens.
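
To make this concrete, here is a minimal sketch using the open-source tiktoken library (one tokenizer among many; the exact splits vary from model to model, and the library choice here is just an example):

    # Demonstration of tokenization with the tiktoken library
    # (pip install tiktoken; exact token boundaries vary by model).
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    ids = enc.encode("Tokenization splits text into sub-word pieces.")
    print(ids)                                # integer token IDs
    print([enc.decode([i]) for i in ids])     # the text piece behind each ID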

The model receives a sequence of tokens up to a fixed maximum length, for example 2048 or 8192. This is its context window 👁️. Everything outside this window does not exist for it. No memory. No “I remember what you said earlier”. Only the current window.
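
A toy illustration of how unforgiving this is: everything before the last N tokens is simply sliced away (the window size here is illustrative):

    # Toy sketch: the model only ever sees the last `window` tokens.
    # Everything earlier is silently dropped; there is no other memory.
    def clip_to_window(token_ids, window=2048):
        return token_ids[-window:]

    history = list(range(10_000))      # pretend conversation: 10,000 token IDs
    visible = clip_to_window(history)
    print(len(visible), visible[0])    # 2048 7952 -> the older context is gone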

The model’s task is to calculate the probability of every possible next token 📊. Not to choose immediately, but to compute a probability distribution. For example, after the sequence “nuclear reaction leads to”, the model may assign “release” a probability of 0.42, “chain” 0.31, “heating” 0.07, and so on across a vocabulary that may contain tens of thousands of tokens.
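
Mechanically, “computing a distribution” means pushing the model’s raw scores (called logits) through a softmax. The numbers below are invented for illustration:

    import numpy as np

    # Hypothetical raw scores (logits) for four candidate next tokens.
    logits = np.array([2.1, 1.8, 0.3, -1.0])
    vocab = ["release", "chain", "heating", "banana"]

    # Softmax turns scores into probabilities that sum to 1.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    for token, p in zip(vocab, probs):
        print(f"{token:>8}: {p:.2f}")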

Where do these probabilities come from? They are not hardcoded. They are learned from data. This is what training means and what differentiates one model from another. Like the difference between a medical school library and an engineering school library 📚. The idea of training is simple. The scale is brutal.

A text is taken, a fragment is cut out, the last token is hidden, and the model must guess it. If the guess is poor, the internal weights are slightly adjusted. This process is repeated trillions of times 🔁.
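
In code, a single such step looks roughly like the toy PyTorch sketch below. The model here (an embedding plus a linear layer over a hundred-token vocabulary) is a stand-in; real systems are transformers with billions of weights, but the step itself is the same: predict, measure the error, nudge the weights.

    import torch
    import torch.nn as nn

    # Toy stand-in for a real model: embedding + linear layer over a tiny vocabulary.
    vocab_size, dim = 100, 16
    model = nn.Sequential(nn.Embedding(vocab_size, dim), nn.Linear(dim, vocab_size))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    context = torch.tensor([42])   # the visible token
    target = torch.tensor([7])     # the hidden "next" token to guess

    logits = model(context)                              # scores for every candidate
    loss = nn.functional.cross_entropy(logits, target)   # how bad was the guess?
    loss.backward()                                      # how should each weight move?
    opt.step()                                           # nudge the weights slightly
    print(loss.item())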

Inside the model there are no rules, no logic, no concepts. There are billions of numbers called weights. These weights determine how strongly each previous token influences the probability of each next token. Training is nothing more than adjusting these numbers to minimize average prediction error.

The key point is this ⚠️: the model does not know what it is doing. It does not know that it is learning. It does not know that text is text. It simply minimizes a mathematical error function. Everything else is human interpretation.

Why does this work at all? Because language is statistical. People believe they speak freely, but in practice rely on stable constructions, clichés, and repeated patterns. Even complex texts obey probabilistic regularities. That is why models trained only to predict the next token begin to look “intelligent”.
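
This regularity is easy to check for yourself: count adjacent word pairs in any text, and a few continuations dominate each context. A minimal sketch on a made-up corpus:

    from collections import Counter

    corpus = ("the cat sat on the mat the cat ate the fish "
              "the dog sat on the rug").split()

    # Count adjacent word pairs (bigrams); on real corpora the same
    # few continuations dominate each context.
    pairs = Counter(zip(corpus, corpus[1:]))
    print(pairs.most_common(3))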

Then comes the architecture that makes this scalable. It is called the transformer ⚙️. It uses an attention mechanism, meaning that every token in the window can influence every other token, each with its own weight.

Attention answers one question: what matters most right now. For example, in a sentence containing the word “cell”, the model checks whether nearby words are “mitochondria”, “nucleus”, “DNA”, or “bird”, “cage”, “bars”. The probability distribution shifts accordingly 🧬. Not because the model understands biology, but because such combinations appeared more frequently in the training data.

It is important to understand that attention is not consciousness. It is matrix multiplication, measuring correlations between vectors. All apparent meaning is a side effect of statistics.
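
The whole mechanism fits in a few lines. This is a sketch of scaled dot-product attention, the core operation of the transformer, with toy-sized matrices:

    import numpy as np

    def attention(Q, K, V):
        # Similarity scores -> softmax -> weighted mix. Nothing but linear algebra.
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # each row sums to 1
        return weights @ V

    rng = np.random.default_rng(0)
    Q = K = V = rng.normal(size=(5, 8))    # 5 tokens, 8-dimensional vectors
    print(attention(Q, K, V).shape)        # (5, 8): one updated vector per token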

Once the probability distribution for the next token is computed, a selection step follows 🎲. One can always pick the most likely token. The result will be repetitive and dull. Or one can introduce randomness and sample proportionally to probability. The text becomes more varied, but errors become more likely. Parameters like temperature or top-p simply control this balance.
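
A sketch of both knobs (simplified; real implementations differ in details, such as how the token that crosses the top-p threshold is treated):

    import numpy as np

    def sample(logits, temperature=1.0, top_p=1.0, rng=np.random.default_rng()):
        # Temperature rescales the scores: <1 sharpens, >1 flattens the distribution.
        probs = np.exp(logits / temperature - np.max(logits / temperature))
        probs /= probs.sum()
        # Top-p (nucleus): keep only the most likely tokens covering p of the mass.
        order = np.argsort(probs)[::-1]
        keep = order[np.cumsum(probs[order]) <= top_p]
        if keep.size == 0:
            keep = order[:1]                  # always keep at least the top token
        p = probs[keep] / probs[keep].sum()   # renormalize over the survivors
        return int(rng.choice(keep, p=p))

    logits = np.array([2.0, 1.0, 0.5, -1.0])
    print(sample(logits, temperature=0.7, top_p=0.9))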

After selection, the token is appended to the sequence, the window shifts, and the process repeats. No plan. No intention. No understanding of where the text is “going”. Only iterative continuation 🔄.
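
The entire loop, schematically (the next_token_logits function here is a dummy standing in for a trained model):

    import numpy as np

    def next_token_logits(window):          # placeholder for a real trained model
        return np.array([1.0, 0.5, 0.1])    # dummy scores over a 3-token vocabulary

    def generate(prompt_ids, steps=5, window_size=8):
        ids = list(prompt_ids)
        for _ in range(steps):
            window = ids[-window_size:]          # only the current window is visible
            logits = next_token_logits(window)   # distribution over next tokens
            ids.append(int(np.argmax(logits)))   # pick one (greedy, for simplicity)
        return ids                               # no plan, just repeated continuation

    print(generate([2, 1]))   # [2, 1, 0, 0, 0, 0, 0]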

Why does the model sometimes solve problems, write code, or explain complex topics? Because the training data contained millions of examples of people doing exactly that. The model does not derive solutions. It replays statistical patterns. Sometimes they match reality. Sometimes they do not. The model cannot tell the difference.

Why can the model hallucinate? Because it has no concept of truth. Only plausibility. If confident statements usually followed similar contexts, the model will confidently state things here as well—even if they are wrong. Ask in which year Yuri Gagarin flew to the Moon and you may get 1961 🚀. That was the year of the first human spaceflight, not a lunar mission. The model does not see the error.

Why this is not intelligence. Intelligence implies goals, understanding, causal models of the world, and the ability to distinguish error from truth beyond statistics. None of this exists here. There is only approximation of distributions.

Why it is still useful. Because many tasks reduce to pattern manipulation. Search, translation, summarization, programming assistance, text analysis. AI does not think, but it guesses very well 👍. This saves time and sometimes provides guidance. But this is not AI’s thought. It is the accumulated trace of what thousands or millions of people have already done.

AI can stand in for conversations with parents, psychologists, priests, or rabbis, but the advice is not for you personally. It is generic. And it is not guaranteed to be correct.

Its limitations matter. The model has no access to reality. No sensors. No experience. No real-world feedback unless explicitly added. It does not know that its answer changed anything. It does not learn during the conversation. Each response is an isolated act of generation.

That is why words like “understands”, “decides”, or “is aware” are metaphors. Convenient. False. At the algorithmic level there are only numbers, probabilities, and linear algebra at industrial scale.

Returning to the original question: is this intelligence? Technically, no. It is a probabilistic continuation machine. When your phone suggests three possible next words while you type a message 📱, this is the same principle in miniature. Functionally, AI often behaves like intelligence because language itself is crystallized human intelligence. The model is not smart. It reflects accumulated statistical human behavior.
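
That keyboard, reduced to a toy, is a handful of lines (made-up corpus, frequency counting only):

    from collections import Counter, defaultdict

    corpus = "see you soon see you later see you soon talk to you later".split()

    # Count which word followed which, then offer the most frequent continuations.
    following = defaultdict(Counter)
    for word, nxt in zip(corpus, corpus[1:]):
        following[word][nxt] += 1

    def suggest(word, k=3):
        return [w for w, _ in following[word].most_common(k)]

    print(suggest("you"))   # ['soon', 'later'], ordered by frequency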

One last important point. The more AI-generated text flows back into training data, the stronger the feedback loop 🔄. Statistics begin feeding on themselves. Diversity decreases. Clichés intensify. Distributions distort. This is a fundamental problem and it is not solved by simply increasing parameters.

There is no machine uprising here. But there is a real risk of information degradation. Not because AI is malicious, but because it is indifferent 😐. When AI outputs “Stop, Michael, don’t get worked up, let’s fix everything”, this is not its thought. Millions of Michaels heard this from humans before. The phrase is simply statistically likely. Nothing personal.

If we cut to the bone, AI is a very expensive, very fast, and very confident probability calculator for language 🦴. Everything else is projection.

This technology is a powerful tool that does not possess real consciousness but can imitate patterns in human language with remarkable speed and accuracy, relying more on probability than true understanding.
