A straightforward AI optimization
A while ago I made a simple hook for monitoring LLM tool calls.1 The data it's gathered suggest that LLMs default to opening whole files instead of finding ways to read only parts of them.2
I fear this will often pollute their context, because in an average file, most material will be only marginally relevant to most LLM readers.
Why don't we try making files way smaller--e.g., with only one function or a small group of functions per file?
- It would help address the context problem above.
- A main reason for longer files is that humans don't want to jump around files, but humans are going to be reading the code less and less (and when we do, IDE navigation is getting better and better).
- Current one-thing-per-file conventions (e.g., for React components) are broadly accepted and work pretty well.
- In cases where separating functions into different files causes dependency problems, that usually indicates that the functions have underlying design problems.3
There are also reasons not to do this: AI is getting better at managing its context and more powerful generally, and the surrounding material could be legitimately useful. Moreover, there might be some better way to indicate code boundaries to AI than by using separate files. On balance, however, I conjecture that it would help, especially because it's so safe.
A recurring theme here and elsewhere is that AI will, and should, change the structure of everything from our code to our patterns of work. Most current AI usage is doing the same things faster (and hopefully better), but we should look to see where it can profitably do different things. I expect many of these changes to be much more drastic than "let's make more files," but this could be a small representative restructuring, and one that potentially enables more significant changes.
This began as tool for monitoring file-opening events, but I soon asked it to simply log all the tool calls.↩
(E.g., with
grep -A -B, which I imagine has exploded in popularity in the last six months.)↩A simple, common example is when there is a cyclic dependency among functions in a file. Even when the language allows it, this usually indicates an organizational problem.↩