Juha-Matti Santala
Community Builder. Dreamer. Adventurer.

What to do first with a legacy project?

Adam Hill asked us on Mastodon:

Let's say, hypothetically, you were given sole ownership of a core piece of infrastructure at your $dayjob. It's so integral that if it breaks, most of the public-facing sites will also fail. And some backend processes as well. Also, the original authors are long gone. There is a readme, but no other documentation. Minimal code comments. And there are 0 tests.

and attached a poll with these options:

  1. Write some documentation
  2. Write some tests
  3. Start praying
  4. Cry

The question kept lingering on my mind throughout the week as I went back to work. This time, I wasn’t working on a legacy project, but I’ve had my share of them in the past. I kinda like them, but only if there’s actually time to make proper improvements and not just rush new features into a codebase that might break in unexpected ways with any change.

Docs or tests first?

29% of people answered documentation and 74% tests. Praying and crying are a given regardless, so I’ll focus on the main two today.

As the very first thing, I would start with documenting how the deployment works, in detail.

With such an integral piece of infrastructure, that would buy me a lot of peace of mind. Once I’m comfortable knowing I can do deployments, database migrations, rollbacks and whatever else is involved in that process, I’m more confident about starting to make improvements.

Legacy projects are often the ones with a lot of manual deployment work and undocumented steps, so capturing whatever I can learn about the process from the people around me is crucial.

Writing tests can be difficult if you have no clue how the project works and what it does. When choosing between tests and documentation of the codebase as the next step, I’d pick documentation. In the past, I’ve printed out the code of a legacy project on paper, pinned it around a meeting room and walked around with markers to gain a better overview of how the different parts fit together.

Tests are fantastic and would be my next addition, but they are also written based on assumptions about how we think the project should work. In a legacy project, writing tests often exposes behavior that looks like a bug, but you can’t be quite sure whether it is an actual bug or whether your assumption about what the code should do is misinformed.

Jumping into writing tests without first understanding and documenting the system can lead to confusing moments. In reality, the process would probably be a lot of documentation and tests written in parallel: learnings from one will feed into the creation of the other.

One thing I’m sure of: the importance of making small incremental changes and committing them back to the source cannot be overstated. Whether it’s about refactoring a not-so-legacy application or starting to work on an old and forgotten project, trying to fix everything in one go is destined to fail. And writing tests, any tests, before changing any piece of existing code is another winning strategy.
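One way to write "any tests" before touching the code, without first deciding what the correct behavior is, is to record what the code does today and check that it keeps doing it. The sketch below is only an illustration of that idea: `legacy_shipping_cost`, its rules and the recorded values are all made up for this example and stand in for whatever undocumented function you inherited.

```python
def legacy_shipping_cost(weight_kg, express):
    """Hypothetical stand-in for an undocumented legacy function."""
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost = cost - 3  # Unexplained rule: a bug or an intentional discount?
    return cost


# Outputs recorded by running the current code, not derived from a spec.
# They describe what the system does today, right or wrong.
RECORDED_BEHAVIOR = [
    ((1, False), 7),
    ((1, True), 10.5),
    ((25, False), 52),
    ((25, True), 79.5),
]


def find_regressions(func, recorded):
    """Return (args, expected, actual) for every case whose output changed."""
    regressions = []
    for args, expected in recorded:
        actual = func(*args)
        if actual != expected:
            regressions.append((args, expected, actual))
    return regressions


if __name__ == "__main__":
    changed = find_regressions(legacy_shipping_cost, RECORDED_BEHAVIOR)
    print("behavior unchanged" if not changed else changed)
```

A test like this makes no claim that the weight rule is right. It only tells you when a small refactoring has changed the output, which is the point at which you go and ask someone whether the old behavior was intended.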