Migrations with AI, again
I recently conjectured that AI will make migrating software much faster and much safer. In some cases it will be so different that it won't feel like a "fast migration" so much as a new and different software process.
Having finished the migration that prompted those reflections, I'm even more confident in these claims. Again, the migration involved moving from a non-relational to a relational database:
- Approximately one million items across a handful of tables, and
- Significant related architectural simplifications, on a
- Low- but not zero-traffic system.
Many migrations are larger and more complex than this, but many are not, and I don't think things would have been meaningfully different at 2x to 4x this scale.
First, my views about migrations have changed. I suspect that:
- We overemphasize how intrinsically unpredictable most migrations are. New systems do bring unpredictability, and that uncertainty does compound super-linearly (in the sense that N sources of uncertainty create much more than N times as much total system uncertainty). But many tools are well-known, many combinations of tools are well-studied, many mutually reinforcing kinds of tests and checks can be written, and parts of migrations can often be done independently.
- Almost the same claim, from a different angle: migrations are limited by available human effort more than we often admit. It is one thing to say that you're doing everything you can to get it right, and quite another to actually make sure all your migration scripts are idempotent, test your scripts properly, govern the whole system via CDK or some other infrastructure-as-code tool, execute the migration in a test environment, take reasonable steps to make that test environment more like production, and so on.
- Migrations are highly sensitive to their executors' ability to maintain focus and keep doing annoying work to high standards for a long time.
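As one illustration of the "idempotent migration script" standard mentioned above, here is a minimal sketch. It assumes a SQLite target and a hypothetical `items` table with `id` and `payload` columns; none of these names come from the actual migration. The point is only that running the step twice leaves the target in the same state as running it once.

```python
import sqlite3


def migrate_items(source_items, conn):
    """Copy items into the target table; safe to run more than once.

    `source_items` is an iterable of dicts with "id" and "payload" keys
    (a hypothetical shape for records read from the old store).
    """
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            "id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )
        # INSERT OR REPLACE keys on the primary key, so a rerun overwrites
        # existing rows instead of duplicating them or failing.
        conn.executemany(
            "INSERT OR REPLACE INTO items (id, payload) VALUES (?, ?)",
            [(item["id"], item["payload"]) for item in source_items],
        )
    return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

Rerunning the same step against the same input should return the same row count, which is exactly the property a test for this script would assert.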
Happily, AI is very good at doing lots of well-understood, annoying work quickly. The next time you're doing a migration of roughly this shape, consider:
- Can you shore up the testing of the source system?
- Can you govern more of the source system's infrastructure as code?
- Do you have tons of logging in the target system? (AI is very good at parsing lots of logs, and the cost-benefit of "excessive" logging is very different (i) in a migration and (ii) when you have AI to help you.)
- Every time you query a database, are you (i) constructing the query by means of a helper function, (ii) testing that helper function, (iii) logging the query before you make it, and (iv) logging relevant information about the results of the query?
- Have you written tests to verify that the broad shape (e.g., the cardinality) of the data is close enough to equal in the source and target systems?
- Have you written tests to randomly sample data from the source system and compare it to analogous data from the target system?
- There will probably be a critical cutover period: are you sure that everything you need to do then cannot be done before? (Can you do a first migration of the data, with a plan to fill in the difference during the cutover or later? Have you configured all your new infrastructure and ensured that you can observe and access it as necessary?)
- Have you written the full migration plan as a script that is intelligible to AI?
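The cardinality and random-sampling checks from the list above could look roughly like the following. This is a sketch under assumptions: the source is modeled as a plain dict from id to payload (standing in for the non-relational store), the target is SQLite, and the `items` table and both function names are hypothetical.

```python
import random
import sqlite3


def counts_match(source, conn, table, tolerance=0):
    """Check that source and target hold about the same number of records."""
    # `table` is interpolated directly, so it must come from trusted code.
    target_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return abs(len(source) - target_count) <= tolerance


def sample_mismatches(source, conn, sample_size, seed=None):
    """Return ids from a random sample whose target row is missing or differs."""
    rng = random.Random(seed)  # a fixed seed makes a failing run reproducible
    ids = rng.sample(sorted(source), min(sample_size, len(source)))
    mismatches = []
    for item_id in ids:
        row = conn.execute(
            "SELECT payload FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        if row is None or row[0] != source[item_id]:
            mismatches.append(item_id)
    return mismatches
```

The two checks reinforce each other: the count catches wholesale loss or duplication, while the sample catches records that arrived but were transformed incorrectly.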
That's only a partial checklist, but this sort of preparation can greatly de-risk a migration. AI is more than happy to help with all of it.
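To make the query-helper item concrete, here is one way the four parts (build, test, log before, log after) might fit together. It is a sketch only: the `items` table, its `owner_id` column, the logger name, and both function names are invented for illustration, and SQLite stands in for whatever database is actually in use.

```python
import logging
import sqlite3

logger = logging.getLogger("migration")


def build_items_by_owner_query(owner_id, limit=100):
    """Return (sql, params) for one query shape, so it can be unit tested."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    sql = "SELECT id, payload FROM items WHERE owner_id = ? ORDER BY id LIMIT ?"
    return sql, (owner_id, limit)


def run_logged(conn, sql, params):
    """Log the query before running it and a summary of the results after."""
    logger.info("query sql=%r params=%r", sql, params)
    rows = conn.execute(sql, params).fetchall()
    logger.info("query returned %d rows", len(rows))
    return rows
```

Because the helper returns the SQL and parameters instead of executing them, it can be tested without a database, and every call site gets the same logging for free.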
Meanwhile, there are aspects of a migration that are still quite hard for AI, or at least for AI without customization:
- High-level "am I missing anything?" questions;
- Questions of risk and of what data actually needs to get migrated;
- Making sure that there really aren't any dependencies remaining on the old infrastructure.
Here, AI is a useful partner but less likely to be trustworthy. Or, at least, less trustworthy for now. As with all things in software right now, it's changing quickly.
Migrations, besides being an important subject in their own right, are a useful case study in applying AI. However much case-by-case reasoning and subtle judgment are required, they also tend to require a lot of work for which AI is uncontroversially well suited. We all need to learn to recognize situations like this.