Time estimates in software
Time estimation in software has always been hard, and it's getting harder. It's also difficult to study systematically, so even estimating how bad we are at estimating is hard.
It's a mistake, I think, even to hope for solid data about time estimates:
- Sometimes teams don't even fully define what they're estimating. More than once, in a scrum-style planning meeting, I've asked whether I was supposed to estimate the mean, median, or something else; this was always considered a very strange question. Some teams even have policies against discussing the units of estimates: you're just supposed to develop a sense of what N points means, without knowing whether that's days, hours, or anything else.
- Even when you know what you're estimating, the task is often specified loosely enough that work can be added to or subtracted from it.
- Even when the task's completion criteria are very clear, there are often more and less robust ways of satisfying those criteria.
- Sometimes the task and criteria are, implicitly or explicitly, negotiable after the estimate has been made.
- And often, you don't really know whether the criteria have been satisfied until long after the fact.
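The mean-or-median question above isn't pedantic. If task durations are right-skewed (a lognormal distribution is one common assumption), the median, the mean, and a cautious high percentile can be very different numbers, so "how long will this take?" has several defensible answers. A minimal sketch, with made-up parameters chosen purely for illustration:

```python
import random
import statistics

# Hypothetical illustration: simulate right-skewed task durations.
# The lognormal parameters below are invented, not measured.
random.seed(0)
durations = [random.lognormvariate(0.7, 0.8) for _ in range(10_000)]

median = statistics.median(durations)          # the "typical" task
mean = statistics.mean(durations)              # inflated by the long tail
p90 = sorted(durations)[int(0.9 * len(durations))]  # a cautious commitment

print(f"median: {median:.1f} days")
print(f"mean:   {mean:.1f} days")
print(f"p90:    {p90:.1f} days")
```

Under a skewed distribution like this, the mean exceeds the median and the 90th percentile exceeds both, so an estimator who reports the median and a manager who hears a commitment are talking about different quantities.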
These point to a more fundamental issue: the purpose of time estimates--whether or not they're scrum-style estimates, and whether they're estimates of small tasks or long projects--is often not primarily to produce an accurate number. Other purposes include:
- Giving teams a chance to talk about techniques and challenges relevant to projects;
- Using large estimates to mark a project as important;
- Figuring out which cross-team dependencies will be necessary, and creating ways to say "please help us with X because it's supposed to be done by day Y"; and
- Negotiating over how hard engineers will need to work in the near future.
Meanwhile, generative AI is making all of this harder:
- Nobody has long experience completing tasks with LLMs, and even if they did, it would still be hard to predict service outages and the details of model improvements.
- The LLMs themselves are amusingly bad at estimating time (for now, at least).
- It's often better to solve a different shape of problem, or at least to do tasks in a different order or at a different level of generality, when you have AI to help you.
- When you're using AI, it can be tough even to know when you're done.
I don't know what the future of time estimation will be, especially because "time estimation" describes so many different processes with so many different purposes. I do, however, suspect that current estimates--at least insofar as they're estimates and not mandates or whatever else--are very unreliable.