How to ramp up

When you start work as a software engineer on a new team, you might be told to ramp up in any of these ways:

  • Read documentation
  • Pick up a starter ticket
  • Have a teammate explain the system to you
  • "Just look around and ask questions"
  • Shadow an experienced teammate
  • Take tutorials about all the major software development tools at the company

I've tried all of these, and both succeeded and failed at actually ramping up. I'm now convinced that the best way to ramp up is to prioritize learning your team's system, and to do that by learning your team's data and how it flows through the system.

  1. Figure out what your team's most important data is

Your team probably processes and stores data, and this data is probably the core of what your team does. (Sometimes--e.g., for some R&D teams--this isn't true, but if you're in any kind of "normal" software engineering job, it probably is.) Figure out what it is.

  2. Learn how to access that data

This usually means figuring out which database holds that data, and how. Actually go look at the data. (Good luck with permissions, which are almost never both correctly managed and readily intelligible for a new team member.) Make sure you're looking at the real data, not an obsolete or redundant store.

Understand how every important aspect of the data is stored in the database. On the happy path, there is a single database table, and every feature of the data is stored in an intelligible and consistent way in a well-named field on that table. This happy path is rarely achieved--your job is to figure out how the data actually looks in storage.
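As a minimal sketch of what "actually go look at the data" can mean, the snippet below counts rows and NULLs per column for one table. It assumes SQLite only so that the example is self-contained; the `orders` table, its columns, and the `summarize_table` helper are all hypothetical stand-ins for whatever your team really stores.

```python
import sqlite3


def summarize_table(conn: sqlite3.Connection, table: str) -> dict:
    """Return the row count and per-column NULL counts for one table.

    The table name is interpolated directly, so only pass names you trust.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    total = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    null_counts = {}
    for col in columns:
        null_counts[col] = conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" IS NULL'
        ).fetchone()[0]
    return {"rows": total, "null_counts": null_counts}


# Hypothetical table standing in for your team's real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, legacy_state TEXT)"
)
conn.executemany(
    "INSERT INTO orders (status, legacy_state) VALUES (?, ?)",
    [("shipped", None), ("pending", "P"), (None, "S")],
)

summary = summarize_table(conn, "orders")
print(summary)
```

Two columns that each look half-populated, like `status` and `legacy_state` here, are the kind of off-the-happy-path storage worth asking a teammate about.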

  3. Learn the data flow

Find the answers to these questions:

  • Where is that data most commonly generated?
  • Are there any other ways the data is generated? (Don't forget backfills, administrative overrides, and various kinds of triggers in workflows.)
  • Where in the codebase is the data actually written to storage?
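One rough way to start answering the last two questions is to search the codebase for statements that write to the table. The sketch below assumes the writes appear as raw SQL strings in Python files, which will not hold if your team uses an ORM or another language; `find_write_sites`, the file names, and the `orders` table are hypothetical.

```python
import re
import tempfile
from pathlib import Path


def find_write_sites(root: Path, table: str) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for lines that appear to write to `table`."""
    pattern = re.compile(
        rf"\b(INSERT\s+INTO|UPDATE)\s+{re.escape(table)}\b", re.IGNORECASE
    )
    hits = []
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if pattern.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits


# Build a tiny throwaway "codebase" so the example runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "api.py").write_text(
        'def save(db, status):\n'
        '    db.execute("INSERT INTO orders (status) VALUES (?)", (status,))\n'
    )
    (root / "backfill.py").write_text(
        '# one-off job\n'
        'SQL = "update orders set status = ?"\n'
    )
    (root / "report.py").write_text('SQL = "SELECT * FROM orders"\n')
    write_sites = find_write_sites(root, "orders")

for name, lineno, line in write_sites:
    print(f"{name}:{lineno}: {line}")
```

A search like this tends to surface the less common write paths (here, the backfill) that nobody thinks to mention when explaining the system.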

If there is a most common input source for this data, track the code from its origin to the persistence layer. If you can actually generate data in a test environment and verify that you are correct (by, e.g., inserting log lines or making trivial changes to a checked-out version of the code), that's even better.
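Inserting log lines can be as simple as a temporary decorator on the functions you believe lie on the path. This is only a sketch under stated assumptions: the three functions and the in-memory `STORE` are hypothetical placeholders for your team's real entry point, validation, and persistence code.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("rampup.trace")

call_order = []  # records the path taken so it can be checked afterwards
STORE = []       # hypothetical stand-in for the real persistence layer


def traced(func):
    """Log each call so you can confirm the path you think the data takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_order.append(func.__name__)
        log.info("TRACE enter %s args=%r", func.__name__, args)
        return func(*args, **kwargs)
    return wrapper


@traced
def handle_request(payload):
    return validate(payload)


@traced
def validate(payload):
    if "status" not in payload:
        raise ValueError("missing status")
    return persist(payload)


@traced
def persist(payload):
    STORE.append(payload)
    return len(STORE)


handle_request({"status": "pending"})
print(" -> ".join(call_order))
```

If the printed path differs from the one you expected, you have found a gap in your understanding, and it cost only a throwaway local change to find it.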

  4. Build out your knowledge

At this point, you should know the very happiest path your team's data can take: the most important data, from its commonest input source, to its main data store, to its most important outputs. Think of it as the barest possible skeleton. Now, build it out:

  • How is the data monitored, displayed, and alarmed on?
  • How does the system manage scale?
  • What other subsystems produce and consume this data?
  • What other data is there?
  • You probably discovered some strange features of the data's structure or storage. Why are those the way they are? (Usually it is not because someone was incompetent. Mistakes might have been made, but there were probably contextual reasons you should know about.)
  • How is all of this documented? If you used documentation to help get this far, can you improve it?
  • What else is there to know about all the subsystems you've seen so far?

Why this is better than the standard advice

This approach overlaps with, but is very different from, most standard advice.

Why not start by reading documentation for a while?

  • The documentation is very likely to be outdated, confusing, or inaccurate.
  • Even where the documentation is intelligible and accurate, it's very hard for someone who doesn't understand a lot of context to learn from it without learning in a more active way.

Why not start by picking up a starter ticket?

  • Starter tickets are often chosen because they can be completed without knowing too much about the codebase or the data. For the same reasons, they often don't teach you much about either.
  • Starter tickets are often drawn from peripheral parts of the system, which are less important to know.

To be clear: doing starter tickets is great, especially as a way to learn unfamiliar tooling for writing, submitting, and incorporating code. I recommend it. But starter tickets are overrated as a way to learn about the system you're working on, and it's deceptively easy to complete them without getting much closer to being a fully productive team member.

Why not start by having a teammate explain the system to you?

This is a great thing to do! But:

  • It's very hard to retain a lot of this information without already having a bit of context. Several smaller whiteboarding sessions, held as you start to have questions (as you track the happy path of the data), are in my experience a lot more efficient.
  • Teammates make mistakes! This is, after all, another form of documentation, and learning from documentation has the problems I sketched above.

My second-favorite ramp-up method

My second-favorite ramp-up method, which can be done in parallel with the above, is to read every single pull request (or whatever your analogue of a pull request is) on your team. Figure out basically what is going on in the PR and why it is necessary. Knowing a little bit about where the current work is (and where the current problems are!) is likely to help you far out of proportion to the time this takes.

12/23/2024