LLMs are reciting from memory; they are not problem solving
Much has been said of late regarding the limitations of LLMs. I too have some thoughts after using ChatGPT and Le Chat while writing software for the Yoja project.
In my first example, I asked ChatGPT the following question: give me python code to quantize a roberta model stored in my local directory. ChatGPT provided two code snippets, one using the Hugging Face Optimum library and another using PyTorch dynamic quantization. So far, so good. However, when I tried to use the code, I encountered an error. Something like 'AttributeError: 'torch.dtype' object has no attribute 'data_ptr''.
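For reference, the PyTorch dynamic quantization approach ChatGPT suggested looks roughly like the sketch below. The commented-out `transformers` lines show how loading a real RoBERTa checkpoint from a local directory would typically look; a small stand-in module keeps the example self-contained and runnable. This is a minimal sketch of the general technique, not the exact snippet ChatGPT produced.

```python
import torch
import torch.nn as nn

# Hypothetical: loading the actual model from a local directory would be
# something like:
#   from transformers import RobertaModel
#   model = RobertaModel.from_pretrained("./my-roberta-dir")
# A small stand-in module keeps this sketch runnable without a checkpoint.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

# Dynamic quantization converts the weights of the listed module types to
# int8 up front and quantizes activations on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 16)
out = quantized(x)
print(out.shape)  # torch.Size([1, 4])
```

Whether this works on a given model depends on the model's modules and the installed PyTorch version, which is exactly the kind of environment-specific detail the LLM never checks.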
Here's my first observation: ChatGPT and other code generation LLMs are not really 'trying' any of the software snippets that they generate. 'Trying' the software is much more involved, and the possibility of combinatorial explosion, a scenario in which the number of options to 'try' explodes, is very real. Later in this article I reference François Chollet, who discusses this precise phenomenon.
Continuing with the ChatGPT example, I dutifully copied the error message to ChatGPT and continued the chat. ChatGPT gave me two possible diagnoses: a PyTorch version incompatibility, or no support for quantization in the model. Again, it was merely reciting from memory, not actually trying any of these options and solving the problem. I diagnosed the error as the latter, i.e. the model that I had was not quantization compatible.
In essence, ChatGPT had turned into an extremely sophisticated GitHub code search tool. While it would be foolish for any contemporary software engineer to work without code generation tools such as ChatGPT and Le Chat, it is worth noting that the LLMs are reciting from memory, not problem solving. Actually solving the problem remains the responsibility of the software engineer. This reminds me of the Kumon math training programs for youngsters, popular in parts of the world, which emphasize a great deal of practice and memorization. That is useful for teaching math skills but does not necessarily improve math problem-solving capabilities.
Finally, here's a brilliant talk by François Chollet titled 'It's Not About Scale, It's About Abstraction'. It discusses how LLMs are missing a certain element of human intelligence that is crucial for solving problems. It also points to the ARC-AGI problem set, a series of puzzles that humans can easily solve but that LLMs have great difficulty with. Chollet argues that this is the path to AGI. Here's a link to the ARC Prize website, which has much more detail: https://arcprize.org/
François Chollet and Mike Knoop make a superb argument for why they think progress on AGI has stalled and why LLMs may not be sufficient to reach AGI.
My own understanding of the state of the art in solving the ARC-AGI problem set is this: use LLMs to generate a large set of possible solutions for a puzzle, and then try out each of these possibilities until the puzzle is solved. This is where combinatorial explosion becomes a problem and makes it virtually impossible for LLMs alone to solve the ARC-AGI problem set.
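The generate-and-test idea can be sketched in a few lines. The toy 'DSL' of grid operations and the puzzle below are invented for illustration, not taken from ARC: the solver enumerates every program up to a given length and keeps the first one that maps each training input to its output. With just 4 operations there are 4^n programs of length n, which hints at why real program search blows up combinatorially.

```python
from itertools import product

# A tiny invented DSL of grid operations (grids are lists of lists of 0/1).
OPS = {
    "flip_h": lambda g: [row[::-1] for row in g],        # mirror left-right
    "flip_v": lambda g: g[::-1],                         # mirror top-bottom
    "transpose": lambda g: [list(r) for r in zip(*g)],   # swap rows/columns
    "invert": lambda g: [[1 - c for c in row] for row in g],  # flip 0s and 1s
}

def run(program, grid):
    """Apply a sequence of operation names to a grid, left to right."""
    for name in program:
        grid = OPS[name](grid)
    return grid

def search(examples, max_len=3):
    """Brute force: try every program up to max_len ops, shortest first."""
    tried = 0
    for length in range(1, max_len + 1):
        for program in product(OPS, repeat=length):
            tried += 1
            if all(run(program, inp) == out for inp, out in examples):
                return program, tried
    return None, tried

# A made-up 'puzzle': the hidden rule is flip horizontally, then invert.
examples = [
    ([[0, 1], [1, 1]], [[0, 1], [0, 0]]),
    ([[1, 0], [0, 0]], [[1, 0], [1, 1]]),
]
program, tried = search(examples)
print(program, tried)  # ('flip_h', 'invert') 8
```

Even in this toy setting the candidate count grows exponentially with program length; an actual ARC solver searches a vastly larger space of programs, which is where the combinatorial explosion makes exhaustive trying infeasible.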
Final thoughts: this is an exciting time in the technology world. I think we are on the cusp of inventing new techniques in AI that will make human beings much more productive. However, I don't think that AI, or even AGI, will ever match human creativity. After all, a Campbell's soup can was just a Campbell's soup can until Andy Warhol presented it to us; then it was brilliant art. I can't imagine AI being creative like that.
I enjoyed reading your perspective on LLMs, Jagane Sundar. Like you, I’ve been skeptical about AGI, often dismissing it as merely "an extremely sophisticated GitHub code search tool," as you aptly put it. Lately, though, I’ve been making an effort to keep an open mind—focusing on what these models *can* do rather than dwelling on the puzzles they can’t solve. For instance, I found the Nature paper by Google DeepMind researchers particularly fascinating. It describes how they used LLMs to solve a 20-year-old open problem in extremal combinatorics. The work, titled *Mathematical discoveries from program search with large language models*, struck me as a highly creative application of this technology.