LLMs, AI, Commander Data and Me
My personal history with AI technology looks like this.
As a kid with an 8-bit computer and a black and white television, I loved Star Trek and I dreamed of a computer with intelligence like those that sometimes appeared opposite Captain Kirk and Mr. Spock. I saw Tron in the theater and imagined little programs inside the computer living their lives on the Grid.
I got older. I read Asimov’s robot books, The Next Generation introduced me to Data, and I continued to see intelligent computers as a cool piece of a potential future just like replicators, communicators, warp drive, and transporters, even as the computers I used were progressing from primitive 8-bit machines to 16-bit and 32-bit computers with graphical user interfaces, CD-ROM drives, photo-realistic graphics, and the internet.
When I started my career in software engineering shortly out of high school, I continued to toy around with neural networks and various forms of artificial life and intelligence, but more as a sort of curiosity, never as a practical component of my computing life. After all, what passed for “artificial intelligence” back then was less than impressive and had little in the way of practical application.
Time passed. More time passed. Around 2020, along with the rest of the tech world, I started to read about GPT-3 and how it represented a fundamental advance in natural language generation and processing. I perked up. When Dall-E dropped in 2021 I played around with it, generated a few crummy images. I remember finding it promising but, again, not exactly useful for much. I was already familiar with machine learning and neural networks and image recognition so seeing all of those technologies applied in this new way seemed like a rational next step in the evolution of tech, but hardly Star Trek.
But then things kinda went off the rails and I was genuinely surprised. This branch of technology, previously considered to be somewhat esoteric and seemingly impractical (from a business standpoint), suddenly had its moment. It was no longer about Deep Blue playing chess or the competition to build a self-driving car; no, there was something magnetic about LLMs that triggered a frenzy. I was a bit confused.
Surely people understood that the word generator, as impressive as it was, was simply a cleverly constructed remix machine, right? I mean, with enough compute and a large enough sample size and the right methodology for mapping language constructs and semantic concepts, the emulation of language was a mechanical process that did not necessarily imply some sort of “intelligence” under the covers. Yeah, it’s a convincing illusion, but the ghost in the shell is literally just the words, thoughts, ideas, and work of a lot of humans being analyzed and regurgitated by a set of computer algorithms, not a brain. Why were people getting so worked up about it? Why were people suggesting that sentience was imminent?
I remember my first conversation with ChatGPT and trying to talk about music with it. It got into a state where it kept generating glowing praise for Led Zeppelin like some brain-damaged Reddit troll who had never heard of any other band. It was funny, it was strange, but it wasn’t a life-changing moment. I just found myself wondering what that output implied about the source of the training data and the developers at OpenAI.
At first the way people were responding to this tech kinda weirded me out, to be honest. I understood that this was one of those “uncanny valley” situations but where all I could see was the “artificial” there seemed to be a majority who were seeing “intelligence”. It felt as if everybody had started believing that the people on their televisions were little humans who lived inside the screen. I also couldn’t immediately think of any particularly great uses for this media synthesizing tech that weren’t better served by other solutions. LLMs are generally not a reliable source of information, they are not intelligent in any meaningful sense of the word, and yet they were suddenly being shoehorned into literally every product in existence and being worshipped and “married” by people. Kinda creepy if you ask me.
This didn’t stop me from following along, trying each new flavor. I generated images, I generated a couple of terrible songs, I used an “AI” coding assistant for the better part of a year, and I was continually unimpressed. When the LLM was “assisting” it generally cost me more time and effort than doing the job myself. I started disabling every AI feature that appeared in every piece of software and hardware I encountered as a matter of preserving my own sanity.
No, don’t summarize my emails. I want to read them and know what people actually said.
No, don’t try to guess what I might be thinking. I know what I’m thinking, I’ll type the whole thing for myself, thanks.
And don’t you dare correct my prose.
These qualms aren’t issues with LLM technology but instead with the rather uninspired, lazy, uninteresting, and unimaginative ways that various product teams at various tech companies have tried to use this new technology. They aren’t fundamentally reinventing anything, just slapping crappy, annoying, and mostly useless features onto anything made of code so they can claim to be AI forward.
Let me put it this way: 99% of the “AI” features I have encountered remind me of the first wave of iPhone apps like the level, the virtual beer, the “flashlight” that was just a white screen, you know… dumb.
That’s not to say that there haven’t been exceptions. There have. AI assisted video upscaling and audio mixing and mastering tools have become invaluable aids in my recording studio and video production projects. Coding agents for software engineering have made massive progress in terms of their ability to help with writing software over the last 9-12 months and I’m beginning to find that application useful as well.
But my point is this. Ignore the whole quest for AGI for a moment. The thing that is clear is that we have a long way to go before this technology finds its proper niche. I cannot and do not use LLMs for writing assistance even at the level of a business email because the resulting words never sound like me. Even if I write something and have an LLM attempt to “improve” or proofread it, I always hate the result. I can handle being spellchecked by a computer but that’s about it.

As a creative person I find that generative media is generally offensive and useless to me and I will go pretty far to avoid “AI slop”. The idea of using an LLM as a substitute for a friend, a creative partner, or even just consulting one for ideas or inspiration holds no appeal to me. These are currently the most popular use cases but I doubt they will be the main ones in the future. At least, I hope not.

GenAI, when used as a substitute for genuine human interaction and creativity, makes everything less interesting. I sincerely look forward to the backlash against AI slop and the inauthenticity of generated media, from the simple corporate email to the Sora video. It’s all slop and that cannot possibly be the best niche for this tech. The tech is too interesting for these lazy and ugly applications.
And yet, it is a part of my job to be tech forward, to evaluate and understand new technology and where it fits into solving problems; to help build solutions with the latest and greatest features and functionality. When I look at the topic without the sunny optimism of the boosters or the dour pessimism of the haters, I just see a new kind of data retrieval and processing system, based around natural language, which has triggered a strong instinct in people towards anthropomorphizing it because their primate brains equate language usage with the presence of mind, intent, and sentience. As a technology, it’s potentially extremely useful, no matter how it may be abused or misunderstood.
I have therefore arrived at my current level of engagement with this tech: the pragmatic and the hopeful. I have been devouring the literature on how LLMs are trained, learning to run them on my own hardware so that I can use them in a secure, responsible, local manner (avoiding the cloud providers, subscriptions, data centers, and privacy concerns) and I am even in the process of building an LLM from scratch just to make sure I truly understand this new tech at a hands-on level.
It’s cool tech. It’s potentially very useful. It’s not alive. It’s not aware. It’s not worth the hype or the fallout that will happen when the bubble bursts, but it’s an advance in the state of an art that has been my bread and butter for my entire adult life. It has a place in the world, even if it’s not going to cure cancer or save the planet. When the hype cycle is over it will still be here and playing a major role in all of our lives (unlike fads like NFTs, which are about as rational as pet rocks…). I am looking forward to the time when this stops being a hype bubble and is just another tool in our collective toolboxes to make computers more useful. I’m devoting time and attention and mental processor cycles to the various ways that this technology can be used ethically, responsibly, and in ways that provide actual value.
But I’m holding off on planning to meet Commander Data any time soon.