EP. 1 - Guilty by Algorithm
Introduction
AI is, at its core, the simulation of human intelligence. Its power lies not in originality, but in imitation, drawing from massive datasets built on human behaviors, expressions, and interactions to create models that appear intelligent.
Now here’s the irony:
We are increasingly being accused of “sounding like AI,” as if we are the ones who must now prove our humanity. But if AI is trained on human-created data to imitate human expression, then what happens when a human sounds... too human? We become guilty by default, not because we’ve plagiarized or fabricated, but because we’ve become indistinguishable from the system built to replicate us.
In this first episode, we’ll examine how AI detectors, the algorithmic systems designed to identify AI-generated content, can produce false positives, and how, in doing so, they risk discrediting authentic human contributions, especially in academic, legal, and scientific spaces where credibility is paramount.
This isn't just a technical failure; it's a systemic blind spot with real consequences, because when algorithms can't tell the difference between the machine and the mind, the burden of proof unfairly shifts to the human.
How AI Detectors Work
AI detection tools function as probability estimators. They aim to gauge the likelihood that a given piece of content (e.g., text, digital image, lines of code, or multimedia creations) is AI-generated rather than human-created. To achieve this, the detectors analyze the content for patterns, underlying structures, and embedded metadata (Grammarly, 2025).
For better understanding, let’s break down the steps involved in analyzing a piece of content to determine whether or not it is AI-generated. Since this discourse centers on research authenticity, the breakdown will focus on text-based content.
AI detectors analyze text to estimate the probability of its content being AI-generated using computational linguistics and machine learning. The machine learning model acts as the brain in the process, trained on human and AI text to recognize their distinct patterns. Computational linguistics, on the other hand, involves a detailed analysis of the text style, like grammar, sentence structure, and overall tone (Andreyev, 2025).
The analysis process typically involves the following key stages (Andreyev, 2025):
1. Tokenization: The text is broken down into smaller units, called tokens, which can be individual words or sub-word fragments. This allows for granular analysis of word frequency and sequence.
2. Feature Analysis: Here, the linguistic characteristics of the tokenized text are examined. Key features of this stage include:
- Predictability: Language models tend to select the statistically most likely next word, so highly predictable word choices suggest AI generation. If the text is too predictable, it is more likely to be classified as AI-generated.
- Style Variability: Human writing tends to be variable, while AI is more consistent and uniform.
- Patterns: AI typically repeats certain words and phrases based on its training data. Therefore, the repeated use of certain words and phrases may increase the probability of a text being classified as AI-generated.
- Vocabulary and Complexity: AI tends to maintain uniform vocabulary and complexity throughout a text, while human writing shifts in both from one passage to the next.
- Semantic Embeddings: More advanced detectors convert the text into numerical vectors that capture its meaning, then compare those vectors against samples of known AI-generated text.
3. Machine Learning Analysis: The extracted features are fed into pre-trained machine learning models for classification. The machine learning models for AI detection are trained using data classified as AI-generated and human-written, which enables them to distinguish between both based on the patterns and features already identified.
4. Probability Scoring: The output is typically a probability score indicating the likelihood that the text was generated by AI.
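The tokenization and feature-analysis stages can be sketched in a few lines of Python. This is a toy illustration, not any real detector's implementation: the two features below (sentence-length variance as a stand-in for style variability, and a simple repetition ratio as a stand-in for pattern analysis) are deliberately simplified versions of the signals described above.

```python
import re
from statistics import pvariance

def extract_features(text: str) -> dict:
    """Toy feature extraction: tokenization plus simplified stand-ins
    for the 'style variability' and 'patterns' signals."""
    # Stage 1, tokenization: split the text into lowercase word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Style variability: variance of sentence lengths.
    # Human writing tends to vary more from sentence to sentence.
    lengths = [len(s.split()) for s in sentences]
    variability = pvariance(lengths) if len(lengths) > 1 else 0.0

    # Repetition: share of tokens that repeat an earlier token.
    repetition = 1 - len(set(tokens)) / len(tokens) if tokens else 0.0

    return {"tokens": len(tokens),
            "variability": variability,
            "repetition": repetition}
```

A real detector would extract far richer features (perplexity under a language model, semantic embeddings, and so on), but the shape of the pipeline is the same: raw text in, a vector of numbers out.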
In summary, AI detectors operate by breaking text into smaller components, analyzing those components against a range of linguistic characteristics, and then using machine learning models to estimate the probability that the text is AI-generated.
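Continuing the sketch, the classification and probability-scoring stages can be approximated with a logistic function over the extracted features. The weights below are made up purely for illustration; a real detector learns its weights from large labeled corpora of human and AI text rather than hand-picking them.

```python
import math

def ai_probability(variability: float, repetition: float) -> float:
    """Toy probability scorer: combine two features with illustrative,
    hand-picked weights via a logistic function."""
    # Low sentence-length variability and high repetition both push the
    # score toward "likely AI-generated" in this toy model.
    z = 1.5 - 0.4 * variability + 3.0 * repetition
    return 1 / (1 + math.exp(-z))  # probability in (0, 1)

# Uniform, repetitive text scores higher than varied, low-repetition text.
print(ai_probability(variability=0.5, repetition=0.6))
print(ai_probability(variability=10.0, repetition=0.1))
```

The important point is that the final verdict is a probability, not a proof: a tool reporting "92% likely AI" is expressing statistical suspicion, nothing more.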
Algorithmic Blind Spot: False Positives
The irony of being “Guilty by Algorithm” is that the qualities we associate with good writing, such as clarity, flow, and a good command of the English language, now trigger alerts for AI detection tools. Since AI is trained on human-created text, it learns to emulate these traits. As a result, when human writing reflects these same traits, it can appear too good to be true to the algorithm, leading it to flag the work as AI-generated.
No AI detection tool can claim 100% accuracy; all have blind spots that can result in false positives. For clarity, a false positive occurs when legitimate human-written content is incorrectly classified as AI-generated. These errors can have serious consequences, such as a student being accused of academic misconduct or a researcher having legitimate findings questioned.
Let’s consider some characteristics that might trigger a false positive alert:
1. Clarity & Conciseness: AI models are trained to create grammatically correct and well-structured texts without unnecessary jargon. When a human writer also prioritizes these qualities, their work may align with the patterns that AI detectors associate with AI-generated content (Kearns, 2023).
2. Use of Common Language: Within specific academic or professional fields, the use of established terms and common phrases is essential for clear communication. AI models have also been trained on large amounts of that field-specific text and adopt these expressions in their output. Therefore, authentic work that appropriately uses this shared vocabulary might be flagged simply because it contains phrases and terms that are also prevalent in AI-generated content within the same field.
3. Influence of Specific Writing Styles: Formal academic essays, technical reports, and analytical pieces often follow conventional structures. Because AI reproduces these same conventions, the structural similarities within these genres make it harder for detectors to distinguish human from machine authorship, leading to a higher chance of false positives.
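The stakes of these blind spots compound at scale. A back-of-the-envelope calculation makes this concrete; note that the 1% false-positive rate and the 10,000-essay figure below are assumed numbers chosen for illustration, not measurements of any particular tool.

```python
def expected_false_flags(num_human_texts: int, false_positive_rate: float) -> float:
    """Expected number of genuinely human-written texts wrongly flagged."""
    return num_human_texts * false_positive_rate

# A detector that is "99% accurate" on human text (1% false-positive rate),
# applied to 10,000 genuine student essays, still wrongly flags about 100
# innocent authors.
print(expected_false_flags(10_000, 0.01))  # → 100.0
```

This is why headline accuracy figures can be misleading: even a small per-document error rate translates into a steady stream of wrongful accusations once a tool is deployed across an entire institution.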
Real-Life Consequences of "Guilty by Algorithm”
False positives carry significant real-life consequences, especially in fields where trust and credibility are the currency of survival and success. In academia, students and researchers may face unfair accusations of AI misuse due to algorithmic flags, leading to increased scrutiny and the difficult burden of proving the authenticity of their work (Boatwright Memorial Library, 2025). This can damage the reputation, academic standing, and confidence of those involved. For legal and scientific professionals, it can undermine trust in expert testimony or research findings (Hirsch, 2024).
The fear of false positives can also stifle the authentic creative expression of writers and researchers, pushing them to alter their natural writing style in an effort to avoid being flagged. There have even been reported instances of writers deliberately introducing mistakes and errors for the same reason.
Way Forward
There is an over-reliance on AI detection algorithms: a tendency to trust their output without regard for their limitations and inherent biases. This is compounded by a lack of transparency and the absence of a clear appeal process, leaving individuals with little recourse when falsely accused.
Addressing the "Guilty by Algorithm" problem requires a fundamental shift towards nuance and context in content verification. Algorithms alone are insufficient; a more holistic approach must consider individual writing styles, the specific nature of the work, and the disciplinary context. Human oversight is also crucial, with educators, legal professionals, and researchers playing a vital role in reviewing and interpreting algorithmic outputs, rather than blindly accepting them as definitive truth. A call for transparency and accountability is essential, demanding clearer explanations of how these systems function and establishing robust systems for individuals to challenge potentially erroneous findings.
Conclusion
As AI models become more skillful in mimicking human expression, including creativity, emotional tone, and even imperfections, the markers that detectors rely on become increasingly blurred. The more convincingly AI can "sound human," the more likely it is that genuine human writing will be caught in the crossfire of these sophisticated yet fallible detection systems. The line between machine and mind becomes increasingly difficult for algorithms to discern, placing authentic human contributions at greater risk of being deemed "Guilty by Algorithm."
References
Andreyev, A. (2025, March 20). How Does AI Detection Work? A Complete Guide to Identifying AI-Generated Content. Retrieved from RankDots: https://www.link-assistant.com/rankdots/blog/how-do-ai-detectors-work.html
Boatwright Memorial Library. (2025, April 14). Generative Artificial Intelligence - Considerations and Limitations. Retrieved from Boatwright Memorial Library: https://libguides.richmond.edu/genai/considerationsandlimitations
Grammarly. (2025, April 7). How Do AI Detectors Work? Key Methods, Accuracy, and Limitations. Retrieved from Grammarly: https://www.grammarly.com/blog/ai/how-do-ai-detectors-work/
Hirsch, A. (2024, December 12). AI detectors: An ethical minefield. Retrieved from Northern Illinois University, Center for Innovative Teaching and Learning: https://citl.news.niu.edu/2024/12/12/ai-detectors-an-ethical-minefield/
Kearns, M. (2023, May 03). Responsible AI in the generative era. Retrieved from Amazon Science: https://www.amazon.science/blog/responsible-ai-in-the-generative-era