Python task manager from scratch, part 27: Using a real database

I have promised, repeatedly, that it's no problem that we're using a flat text file as our database. Now I'll make good on that promise by swapping out that text file for SQLite.

Why now?

  1. We've got a (very) primitive prototype working.
  2. Ad hoc serialization and deserialization are starting to cause problems.
  3. It's starting to bother me.

That last is slightly more serious than it might appear. First, programming effectively depends heavily on your reserves of energy and gumption. Things that sap those are ipso facto bad. Second, as you get better at programming, the correspondence tightens between what bothers you and what is actually bad.

Why SQLite?

  1. It's amazing technology.
  2. This application is unlikely to hit its weaknesses.
  3. Python supports it beautifully in its standard library.
  4. It's the next-easiest thing to implement.
  5. I (am not an expert but) know how to use it.

Moving fast matters a lot. There are big changes here, but they took a total of perhaps an hour to implement (and really I should have done it faster).

SQL has been explained by other people better than I'd explain it from scratch. You should know:

  1. What SQL is (here's Wikipedia; also the Python docs have illustrative examples)
  2. The basic idea of a declarative language (SQL is declarative)

Some notable features of this migration are:

  1. I mostly swapped out repositories in the test file (cleaning things up a bit as I went). Architecture Patterns in Python emphasizes this important lesson: if you construct your interfaces and tests appropriately, you'll often be able to test new implementations with the same tests.
  2. I modified the .gitignore file so that the database (and anything with a .db file extension) are not committed to version control. This respects the basic distinction between code and data; prepares us for a future where several clients read from a given authoritative database; and prevents anyone with access to the code from also seeing all of our tasks. (Properly securing this application is a large project, but not keeping all of our users' tasks in plain text in a public GitHub repository is a good first step.)
  3. I added a README.

And this work makes certain exhortations relevant:

  1. There is no reason to have the flat-file repository around any more. Often it's useful to keep simple implementations around for testing, but SQLite is every bit as nimble and functional for testing as a flat text file is. So burn it down and delete it. Vestigial code is much worse than intuition suggests. It's confusing: to you, to your tools, and to future developers. It causes bugs. Code is a liability (in the flamingo book's phrasing); other things equal, get rid of it. Also remember that you are using version control. Deleting it is not like throwing something in a dumpster. The old version is there if you want it. I like to have a README section with reminders about deleted functionality. (But this is mostly to make it psychologically easier to throw away--although I do sometimes go back to look at these vestiges, I'm not sure the README notes have ever made this process more efficient.)
  2. The set_tasks() method on TaskRepository was a mistake. It is worth the effort now to delete it. (See the last installment for notes on why extra methods on your interfaces are dangerous.)

These are both more important than they might seem. It's easy to recite slogans about how programming is, centrally, a matter of managing complexity. Actively fighting against quadratic complexity is a lot harder: it requires a combination of clear thinking and professional discipline.

The Web interface runs just as it did before--perhaps a bit snappier. The tests pass. Take a look around: we now have a functioning app backed by a production-quality database.

Here's the current commit in the veery/ repository.


Next post: Python task manager from scratch, part 28: Taking stock


Home page