On tokenmaxxing

18 May, 2026

"Tokenmaxxing" is, roughly, attempting to use as many tokens as possible. Much of the conversation about it focuses on questions like these:

Should we really be doing everything with AI?
How much are people doing artificial work because they're being evaluated on their token usage?

These are good questions, but we should also be asking:

To what extent is token count a good metric for getting things done with AI?

The connection between token usage and getting work done is much looser than many people seem to assume. There are several reasons for this, including:

Token usage is very sensitive to how often you /clear. Each turn typically reprocesses the whole conversation so far, so overall token usage is approximately quadratic in session length.¹
If you initialize MCP servers or load long AGENTS.md files with every session, you'll spend many more tokens on context that, on average, gets little (or no, or negative) work done.
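
A rough sketch of the arithmetic behind both points, in Python. The model is an assumption, not a measurement: every turn adds a fixed number of tokens to the history, the whole history is processed again on each turn, and any preloaded context (tool definitions, instruction files) rides along on every turn. The function names and the numbers (1,000 tokens per turn, 5,000 tokens of overhead) are hypothetical, and real tools with caching or compaction will behave differently.

```python
def session_tokens(turns: int, tokens_per_turn: int = 1_000, overhead: int = 0) -> int:
    """Total tokens processed in one session under a resend-everything model."""
    total = 0
    context = overhead  # preloaded context, e.g. tool definitions or instruction files
    for _ in range(turns):
        context += tokens_per_turn  # this turn's prompt and reply join the history
        total += context            # the whole history is processed on this turn
    return total


def total_tokens(turns: int, clears: int, **kwargs) -> int:
    """Split `turns` as evenly as possible into `clears + 1` sessions and sum usage."""
    sessions = clears + 1
    base, extra = divmod(turns, sessions)
    lengths = [base + 1] * extra + [base] * (sessions - extra)
    return sum(session_tokens(n, **kwargs) for n in lengths)


print(session_tokens(40))                    # 820000: one 40-turn session
print(total_tokens(40, clears=3))            # 220000: same 40 turns, cleared 3 times
print(session_tokens(40, overhead=5_000))    # 1020000: overhead repeated every turn
```

Under these assumptions, the same 40 turns cost about 820k tokens in one long session and about 220k when split into four, and 5,000 tokens of preloaded context add another 200k to the long session without doing any work of their own.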

People discussing tokenmaxxing generally understand that the connection between tokens and output is loose, but they seem not to understand quite how loose it is. Some of my most intensive and productive AI-assisted coding sessions come when I'm making many smaller plans, addressing many small issues, and /clearing a lot. These often consume fewer tokens than leisurely (and useful, but less productive) planning sessions over lunch.

¹ Approximately! These asymptotic-behavior claims have a lot of caveats, and the caveats become more numerous and subtler as the tools advance.

#generative AI #productivity #sociology of software #software