Refactoring Kung Fu – Part II

Gil Zilberfeld explains the principles of refactoring legacy code
Standard
In this series we're taking a look at how to refactor un-refactorable legacy code.
Part IPart IPart III
Part IVPart V

Part VI


Last time I gave you a sneak peak at our PastaMaker. You can check the whole project here.
Let’s take a closer look.

The main logic is in our PastaMaker, so focusing on testing it makes sense. Most of the logic is there, and so are the changes we need to make. We’ll need to do that very carefully, as the system is not test-friendly.

As you can see, our PastaMaker has one main method, “cook”. It takes two parameters, analyzes them, and based on their value does all kinds of things. The “all kind of things” are in private methods, that can call outside the class. I’ve left most of them empty for our example. In the real world, the content of some of the private methods was inlined in the main method. Also in the real world, there are a dozens of conditions and dozens of lines of handling code in that class, so the scale is larger. The principles that we’ll use still apply though.

Do you see the if-else-if-else-if logic? That’s the main problem. It’s not only hard to understand; the order of the clauses is important. It’s hard to change it without risking changing the functionality. Also, what we do in them, can impact subsequent conditions.

How did we get here?

No one plans for this kind of code, but you can see how it gets built, right? You don’t want to touch the “already working” cases. You find the least risky, most probable place to add your new case, close your eyes, commit the code and run away screaming. Rinse and repeat this process for a few years, and we get to this structure.

Our PastaMaker is a great candidate for refactoring. There’s also something that we can see from the code that can help us: We can identify the different cases (but are they all the cases?). In the original code, the conditions are similar – the inputs to the methods are analyzed in the same way (Strings instead of enums). However, the combination number of cases can be astounding.

Our approach to refactoring this code needs some discipline, and some guiding principles.

  • Put (limited) trust in our tools

Our tools (in this case my IDE) will help with automatic refactoring. Depending on the tools we use, we can usually trust renaming, extracting, and sometimes even moving methods around. We can also rely on identification of “unreachable code” after a return.But we don’t trust our tools totally. For example, to invert if-else-if clauses automatically is dangerous. The semantics of our application is not known to our tools. Our trust has limits, and tools can take us so far.

  • Minimize risks

While I’d like the IDE to refactor everything for me, when I move code around, or introduce interfaces, and make other changes my tools can’t do reliably, I need to trust myself.
If this is a critical component, like the PastaMaker, I don’t.When we refactor critical code, we do it with a pair. Or in a group. The process may be long, and we’re going to make mistakes. A partner can help identify the errors, pick up on missing cases, and in general keep us moving forward. We minimize risks where and where we can and that includes another set of eyes.

  • Refactoring means NOT adding code as we go

This is very important. We’re already knee-deep in code we don’t understand. Even worse, we may think we do, and that may not be true. Until we finish making change, and have more confidence the code works, when it has tests around it, we don’t add any new functionality. This is, by the way, one of the genius pillars of TDD: separate the functionality from the refactoring. Only this time, we’re going the other way around.

  • New code is only added after, and with accompanying tests

Once the code is ready in terms of form, and is stable with tests, only then we can add new functionality. And use test-first.

  • Leave the code better than how we found it

We don’t refactor for nothing. We want better code we can add new functionality to. That means, better, simpler, factored code with tests. Oh, and since we’re doing this in a pair, at least two people agree that the code is better than the original.

Principles are not enough

So what’s our plan of attack? The following techniques help us.

  • Define the boundaries of our system-under-test

We need to decide where our system starts and ends. We’re going to move code inside it, inject inputs to one end, and sample on the other end. But we start with the boundaries.

  • Build adapters around those boundaries

Once we’ve decided where the other end is, we’ll need to create adapters to mark our territory. Those adapters will call the original code, and can help us decouple dependencies. In further testing we can use the adapter for mocking..

I’m pretty sure when you here “adapters” you think “interfaces”. We’ll use them, but that is a language construct. If we’re working in C, for example, we’ll need something similar, since there’s no “interface” construct in C.

  • Put as many safeguards in place as we’re refactoring
    We don’t have tests yet, but we will eventually. We need to think about all the cases we want to cover, and we want this comprehensive list early, before we touch the code.
    Later, we’re going to put information in the code that not only help our tests, but also to debug them when they fail. Feedback is our friend.

Next time, we start moving code around.

Leave a Reply

Your email address will not be published. Required fields are marked *