LLMs are the closest we’ve come to AGI and the most important invention since the Internet. In terms of economic productivity, the most useful LLM products are ChatGPT and GitHub Copilot, respectively a freeform AI chatbot and a code autocomplete. But they can’t carry out complex tasks: very soon you run up against the limits of the context window. LLMs are like a tireless 120 IQ polymath with anterograde amnesia who forgets everything after ~10 minutes of activity.
Agents are an attempt to get around this: they are software systems that string together LLM calls to accomplish complex tasks. In the same way a patient with anterograde amnesia might use software tools to compensate for their deficit in working memory, agents use “classical” software to orchestrate LLMs to accomplish complex tasks across multiple completions.
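To make "stringing together LLM calls" concrete, here is a minimal sketch of such a loop. Everything in it is an assumption for illustration: the `LLM` type is any function from prompt to completion, `run_agent` and the `DONE` stop convention are hypothetical, and `fake_llm` is a stand-in so the example runs without a real model. It is not how any particular agent framework works.

```python
from typing import Callable, List

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


def run_agent(llm: LLM, task: str, max_steps: int = 5) -> List[str]:
    """Call the LLM repeatedly, carrying notes forward in ordinary program state."""
    notes: List[str] = []  # the external memory that the model itself lacks
    for _ in range(max_steps):
        prompt = f"Task: {task}\nNotes so far:\n" + "\n".join(notes) + "\nWhat next?"
        completion = llm(prompt).strip()
        if completion == "DONE":  # assumed stop convention, not a real API feature
            break
        notes.append(completion)
    return notes


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: reports two pieces of work, then stops."""
    already_done = prompt.count("did thing")
    return "DONE" if already_done >= 2 else f"did thing {already_done + 1}"


if __name__ == "__main__":
    print(run_agent(fake_llm, "write a haiku"))
```

The classical code owns the state (`notes`) and the control flow; the model only ever sees one prompt at a time.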
A lot of agent architectures resemble cognitive architectures in the style of Soar and ACT-R: they have an agenda of tasks, a long-term memory (often using a vector database) with RAG, and other components. It’s actually very interesting how programmers have basically reinvented cognitive architectures from scratch, often knowing nothing of the field’s existence.
And yet they don’t really work. There isn’t a software development agent that you can download, point at a repo, give a high-level description of a change, and come back an hour later to a ready pull request. There’s AutoGPT, babyagi, and gpt-engineer, and I’ve yet to hear of any of these doing anything impressive.
And I wonder why. If AI is going to increase human productivity, it has to be able to solve complex tasks. Copilot is great, but it’s just a faster horse: I can write the same code, but faster. An agent that can write code on my behalf lets me move up the value chain and do more worthwhile things. If LLM agents can work in principle, then understanding how to make them work is crucial.
One factor is that LLM agents are born of necessity. If you have the skills and resources ($$$) to pretrain an LLM, you’re not building agents, you’re building something like Yann LeCun’s differentiable neural architecture, where the model architecture is the agent, or, at the very least, you’re pretraining an LLM where the shape of the training data tells it it’s going to be a submodule of a larger mind.
If you’re just a simple country programmer with an OpenAI API key, you can’t innovate at the model layer, you have to innovate at the API layer. So you build a cognitive architecture with the LLM as the central executive. The innovation is the architecture: the flow of information and the processes that build up the prompt, while the LLM itself remains a COTS black box. And so the people most qualified to build effective agents are working further up the value chain.
People use ChatGPT and see that a single LLM completion can be incredibly general: in a single message, ChatGPT can write poetry, it can write code, it can explain a complex topic, extract LaTeX from a photograph, etc.
And so people think: I want to build an agent that is equally general, therefore, I should build a general architecture. And they go on to build an architecture that looks like every diagram from a cognitive science paper: there is a small number of very large, very general components. There is a long term memory, a short term memory, a central executive, sensors, effectors (function calls). It’s very general, very abstract, and very underconstrained.
These architectures never work. Compare Copilot: there’s a huge software layer that converts file context and repository context into a prompt, to generate a completion. It’s not general, it doesn’t have a general-purpose vector database for RAG, or a tree of recursively broken-down, priority-weighted subtasks. It’s specific and overconstrained. It can’t write free verse, but it improves my programming productivity by like 40%. Because it does the one thing.
Maybe LLMs work best as “magic function calls”: performing some narrow, concrete, specific, overconstrained, supervisable task. Not writing a program but writing a single function.
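One way to picture a "magic function call" is a wrapper that asks for exactly one function and rejects everything else. The sketch below is an assumption-laden illustration: `write_function`, the prompt wording, and the `LLM` callable type are all hypothetical, and the only real library used is Python's standard `ast` module.

```python
import ast
from typing import Callable

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


def write_function(llm: LLM, name: str, spec: str) -> str:
    """Ask for a single function called `name` and refuse any other output."""
    prompt = f"Write a single Python function named {name}. {spec}\nReturn only code."
    source = llm(prompt)
    try:
        tree = ast.parse(source)  # chatty or malformed output fails here
    except SyntaxError as exc:
        raise ValueError("completion is not valid Python") from exc
    body = tree.body
    if len(body) != 1 or not isinstance(body[0], ast.FunctionDef) or body[0].name != name:
        raise ValueError(f"expected exactly one function definition named {name}")
    return source


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the example runs offline."""
    return "def add(a, b):\n    return a + b\n"


if __name__ == "__main__":
    print(write_function(fake_llm, "add", "It adds two numbers."))
```

The task is overconstrained on purpose: the caller knows what shape the answer must have, so the surrounding software can check it mechanically instead of trusting the model.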
A compelling analogy for why LLMs seem to be unable to accomplish complex tasks: if you have a lamp running at 90°C, then you can have a thousand such lamps pointing at an object, and it will never reach 100°C.
It is well known that LLM-written text has lower entropy than human-written text. So maybe there’s something analogous to a thermodynamic limit, where the complexity of the LLM completion and the complexity provided by the architecture taken together are insufficient to reach criticality and get self-sustaining output.
When you watch the transcripts of LLM agents failing to do a task, it’s often one of three things:
- The agent starts going in circles, visiting the same goals.
- The agent gets “lazy”, takes a high-level goal and splits it into a subgoal that’s just “actually do the work”, and when it gets to that subgoal it does it again, in an infinite recursion of procrastination.
- The goals or actions are vague and meandering, and rather than looping, the agent wanders around talking to itself, accomplishing nothing.
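The first two failure modes can at least be detected mechanically, because the same goal keeps reappearing in the transcript. Here is a rough sketch, assuming the transcript is available as a list of goal strings; `first_repeated_goal` and its `max_visits` threshold are hypothetical names chosen for illustration. Matching on normalized text is crude, and a goal that is paraphrased on each visit would slip through.

```python
from collections import Counter
from typing import List, Optional


def normalize(goal: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(goal.lower().split())


def first_repeated_goal(goals: List[str], max_visits: int = 2) -> Optional[str]:
    """Return the first goal visited more than `max_visits` times, else None."""
    visits: Counter = Counter()
    for goal in goals:
        key = normalize(goal)
        visits[key] += 1
        if visits[key] > max_visits:
            return key
    return None


if __name__ == "__main__":
    transcript = ["Write tests", "Fix the bug", "write  tests", "Write Tests"]
    print(first_repeated_goal(transcript))
```

The third failure mode, aimless wandering, has no repeated state to catch, which is part of what makes it harder to guard against.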
And these failures remind me of Stephen Wolfram’s complexity classes of cellular automata:
- Class I: the CA converges on a homogeneous state (e.g. all cells die out).
- Class II: the CA converges on a repeating/periodic state (e.g. blinkers in Life).
- Class III: the CA diverges, chaotic behaviour.
- Class IV: complex structures that persist over time.
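The four classes are easy to see for yourself. The sketch below uses one-dimensional elementary cellular automata rather than Life, which is my substitution for brevity. The rule numbers follow Wolfram's standard numbering, and rules 0, 4, 30 and 110 are commonly cited examples of Classes I to IV respectively; the function names, the wrap-around edges, and the single-live-cell starting row are assumptions of this example.

```python
from typing import List


def step(cells: List[int], rule: int) -> List[int]:
    """Apply one update of an elementary CA (rule number 0-255), wrapping at the edges."""
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right  # neighbourhood as a 3-bit number
        out.append((rule >> index) & 1)  # look up that bit of the rule number
    return out


def run(rule: int, width: int = 31, steps: int = 15) -> List[List[int]]:
    """Start from a single live cell in the middle and record every generation."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


def render(rows: List[List[int]]) -> str:
    """Draw live cells as '#' and dead cells as '.'."""
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)


if __name__ == "__main__":
    for rule in (0, 4, 30, 110):
        print(f"rule {rule}")
        print(render(run(rule)))
        print()
```

Running it prints each rule's history from the same starting row, so the difference between dying out, freezing, noise, and persistent structure is visible side by side.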
Within a single completion (i.e.: on a simple task), LLMs have human-equivalent performance. Strung together into an agent, most agents, on most tasks, are Class I to Class III: they do nothing, or do the same things over and over again, or do random things. Something is missing to reach criticality.
I don’t have an answer. There are a number of possibilities that fit the evidence:
- LLM agents are like perpetual motion machines: forbidden by some yet-undiscovered fundamental theorem in comparative psychology.
- LLMs themselves aren’t good enough yet, and the next generation of models will work.
- General architectures are intrinsically doomed because they’re too general, and developers should focus on narrow areas of application, with more complex software architectures using LLMs in narrow places.
- Simple architectures (with a single-digit number of discrete modules) are too simple, too abstract, too vague and underdetermined. Architectures have to be a lot more involved and a lot more complex to show complex behaviours. Less elegant diagrams, more like biology.
- The LLM-classical software boundary is where things break down. Agents need to be pretrained in an agent context, with the agent as a whole being a single end-to-end differentiable system.