The delegation problem
I like to make small applications quickly with AI,[1] in part because I can make these applications do one thing well. Like many others, I'm also trying to make these applications integrate well with agents: with my recipe site, for example, I can ask Claude to take arbitrary information, figure out how to structure it as a recipe, and send it off to my recipe repository.
At least implicitly, this approach suggests a vision of the future of computing that is mediated by agents plugging into many focused, domain-specific systems, and knowing when to interact with each.
I'm enjoying my first fumbling steps toward that future, but that last part--knowing when to interact with each--is harder than I realized. It is hard enough that I fear it will sink the whole vision, at least as a vision of mainstream computing.
The problem of delegating to one of many tools (I'll call it the "delegation problem") is very hard. Search engines have invested heavily in it, because they want to route you to the right subsystem for your query; they don't always get it right. The wake-phrase problem ("should my voice assistant handle this?") is a structurally simple version of it, and we see lots of false positives and false negatives.
With the tools I have now, I see lots of delegation mistakes. Here are a few:
- When I tell Claude to "check the logs," it sometimes opens up Chrome when it would do better to use the AWS CLI.
- I'm both a developer and an API user of various sites, and AI tools sometimes use developer-specific resources when I mean them to use the API.
- Asking AI to find a TODO sometimes sends it to the GitHub CLI when I mean it to look for code comments, and vice versa.
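A toy model makes the ambiguity in these examples concrete. The sketch below is not how Claude or any real agent chooses tools; it is a deliberately naive router that scores each tool by keyword overlap with the request. The tool names and keyword sets are hypothetical, chosen only to mirror the cases above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    keywords: frozenset


# Hypothetical tool registry; names and keywords are illustrative only.
TOOLS = [
    Tool("browser", frozenset({"logs", "page", "site", "check"})),
    Tool("aws_cli", frozenset({"logs", "deploy", "bucket", "check"})),
    Tool("github_cli", frozenset({"todo", "issue", "pr"})),
    Tool("code_search", frozenset({"todo", "comment", "function"})),
]


def candidates(request: str) -> list:
    """Return the names of every tool tied for the best keyword overlap."""
    words = set(request.lower().split())
    scores = {tool.name: len(words & tool.keywords) for tool in TOOLS}
    best = max(scores.values())
    if best == 0:
        return []  # no tool matched any word in the request
    return [name for name, score in scores.items() if score == best]


print(candidates("check the logs"))  # ['browser', 'aws_cli']
print(candidates("find a todo"))     # ['github_cli', 'code_search']
```

Both requests end in a tie. Nothing in the words themselves says which tool I meant, so whatever breaks the tie has to come from context the request does not carry.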
Instances of these problems are often preventable, and surely both I and the AI will improve at communicating with each other. But these problems are occurring in a relatively easy instance of the delegation problem. It could be much harder:
- I could have more than a dozen or so delegation targets available. (Some people have, and imagine futures with, many more.)
- Those targets could each want to do many things. (I don't have email or Google Drive MCP servers enabled; servers like those would be relevant to many requests.)
- My system is fundamentally cooperative: my tools don't fight too hard for Claude's attention, in part because I built many of them. I'd imagine that future tools will be more aggressive.
Consider the problem of setting the correct default program for a given file type. This is a delegation problem, and a vastly simpler one than determining search intent or AI tool invocation. Yet relatively few computer users can manage even this version of the delegation problem without frustration while still using many independent pieces of software. This is in part because competitive dynamics (e.g., browsers asking to be made the default) complicate it.
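To show how small this version of the problem is, here is a minimal sketch of a default-handler table. It is an assumption-laden toy, not how any operating system actually stores file associations; `claim_default`, `open_with`, and the program names are all hypothetical.

```python
from typing import Optional

# Hypothetical default-handler table: one key (the extension), one winner.
defaults = {}


def claim_default(extension: str, program: str) -> None:
    """Register `program` as the handler, replacing any previous claim."""
    defaults[extension.lower()] = program


def open_with(filename: str) -> Optional[str]:
    """Look up the handler for a file by its extension, if it has one."""
    _, dot, extension = filename.rpartition(".")
    if not dot:
        return None  # no extension, so nothing to route on
    return defaults.get(extension.lower())


# Two hypothetical browsers each ask to be the default; the last one wins.
claim_default("html", "browser_a")
claim_default("html", "browser_b")

print(open_with("index.html"))  # browser_b
print(open_with("README"))      # None
```

Routing here depends on a single unambiguous key, and it still goes wrong once programs compete to overwrite each other's claims. An agent choosing among tools has no such key to begin with.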
Perhaps there is a good solution to the delegation problem. I fear, however, that the likeliest solution is simply the abandonment of the agents-calling-many-tools paradigm. Earlier I referenced the Unix philosophy of doing one thing well; it is no accident that Unix has few expert users, and most people prefer to manage by learning just a handful of commands.