Unit Testing Anti-Pattern: Data Transformation Tests

This series goes through anti-patterns when writing tests. Yes, there are and will be many. 
  • TDD without refactoring
  • Logic in tests
  • Misleading tests
  • Not asserting
  • Code matching
  • Data transformation
  • Asserting on not null
  • Prefixing test names with "test"

Should we write tests for everything?

Absolutely not. Tests that do not help us, and only cause us pain while we write and maintain them, are waste. Here’s a common example I see all the time. We have code that translates data from one object or structure to another. Like this code that takes a Person and transforms it into a Student object:
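A minimal sketch of what such a transformation might look like, in Python. The age, street, country and isAccepted fields are the ones discussed in this post (isAccepted is spelled is_accepted here, Python style). The exam_score field and the acceptance rule are assumptions, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    street: str
    country: str
    exam_score: int  # hypothetical field, only here to feed is_accepted


@dataclass
class Student:
    age: int
    street: str
    country: str
    is_accepted: bool


def to_student(person: Person) -> Student:
    """Copy fields across; only is_accepted involves any logic."""
    return Student(
        age=person.age,
        street=person.street,
        country=person.country,
        # Assumed business rule, invented for this sketch.
        is_accepted=person.exam_score >= 60,
    )
```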

Very simple code. It also usually happens to be patch-up code. It happens a lot when you have had a working version of the code for years. It’s working fabulously without tests.

But now in v3, we need to add some functionality, and as part of the new design we need a subset of the information or some kind of translation of it. It sometimes includes logic, but a lot of the information just moves from one field to another.

Now, we can definitely write tests that check that the transformation was correct. Fill out a Person with our own data, then assert it got to the right place in the Student object. In fact, data transformation tests are the easiest kind to write, since the data doesn’t have any dependencies.
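Such a test might look like the pytest-style sketch below. The Person and Student shapes, the to_student function and the sample values are all assumptions for illustration, trimmed down to the plain copied fields so the example runs on its own.

```python
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    street: str
    country: str


@dataclass
class Student:
    age: int
    street: str
    country: str


def to_student(person: Person) -> Student:
    # Pure field-by-field copy, no logic.
    return Student(age=person.age, street=person.street, country=person.country)


def test_person_fields_are_copied_to_student():
    # Fill out a Person with our own data...
    person = Person(age=20, street="1 Main St", country="Israel")

    student = to_student(person)

    # ...then assert each value got to the right place.
    assert student.age == 20
    assert student.street == "1 Main St"
    assert student.country == "Israel"
```

Note how the test is almost a mirror image of the code it checks. That is what makes it so cheap to write.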

Easy test to write? Increase my coverage for almost free? WIN!!!

Hold your horses, cowboy.

You’ve written a test, and it passes. What will cause it to fail in the future?

What kind of person are you?

Are you a person that compiles everything every few minutes, just to make sure everything is still in order? Sure you are.

You’ve learned in the past that red squiggly lines, or compiler errors, are an early warning that will save you time later.

That’s true for tests as well: they are another layer of early warning. Only they don’t work at the compiler level. Instead, they check the logic in the code. That’s why we tend to write tests for code with conditionals, forks in the code, etc.

Now, going back to our data transformation: If there’s no logic in there, when will the test tell us something is wrong? If the answer is never, that test is pretty useless.

There are a couple of examples of where we can mess up in ways that tests can catch. I’ll use the code example above to categorize them:

  • Mix up the age and street fields.

If you mix up fields of different types, the compiler will tell you, or other tests will start failing. Switching fields of completely different types is going to raise a red flag somewhere else, so tests for these fields are probably wasteful: the risk of the problem getting through unidentified is low. We’re left with the maintenance costs of the tests. The tests will probably pass for the rest of their lives, so they just cost time instead of helping.

  • Mix up country and street fields.

This is a bit trickier, as the chance of catching the error without tests becomes smaller. Compilers won’t catch it, since the fields are the same type. The semantics are what matter here, and without unit tests we can still catch such mistakes in code reviews. Higher level tests (integration, end-to-end) will probably catch them too. If the impact of the problem is low (for example, we keep the address details, but communicate through email) we can skip the unit test in that case too, for the same reasons as before. (And ponder the question: what do we need the address for? But that’s for another post.)

  • Calculate the isAccepted field incorrectly.

That’s the riskiest one, since we’ve got business rules riding on it, and a mistake may not be caught by other tests. We are transforming data that is important business information, and we want to make sure that it is transferred correctly. I might write a unit test for that, but then ask the question again: If this is so important, why don’t we have high level tests for it? Aren’t there cheaper ways to make sure we transfer the data correctly? (Like, you know, code review?)
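To make the country and street mix-up concrete, here is a small sketch. Both fields are strings, so nothing at the type level objects to the swap. Only a check on the actual values notices. The class shapes, the to_student_buggy function and the mixed_up_fields helper are hypothetical, invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Person:
    street: str
    country: str


@dataclass
class Student:
    street: str
    country: str


def to_student_buggy(person: Person) -> Student:
    # Bug: street and country are swapped. Both are str,
    # so the type annotations are still satisfied.
    return Student(street=person.country, country=person.street)


def mixed_up_fields(person: Person, student: Student) -> list:
    """Return the names of fields whose values did not arrive unchanged."""
    return [
        name
        for name in ("street", "country")
        if getattr(person, name) != getattr(student, name)
    ]


person = Person(street="1 Main St", country="Israel")
student = to_student_buggy(person)
print(mixed_up_fields(person, student))
```

Whether that value check lives in a unit test, a higher level test or a reviewer’s head is exactly the cost question raised above.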

So you probably don’t need unit tests for data transformation code. Unless there’s important logic in there.

What do we do then?

We have a couple of options (that you probably have thought about already):

  • Write a unit test for the transformation logic only (and leave all the direct data passing uncovered).
  • Separate the logic into another class or module, and test it there. Then the data transformation remains pure and clean, and doesn’t need any tests.
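The second option could be sketched roughly like this. The calculate_is_accepted function, the PASSING_SCORE threshold and the exam_score field are assumptions made up for illustration. The point is only the shape: the rule lives on its own and gets the tests, while the transformation just passes data along.

```python
from dataclasses import dataclass

PASSING_SCORE = 60  # assumed threshold, invented for this sketch


def calculate_is_accepted(exam_score: int) -> bool:
    """The only real logic, isolated so it can be tested by itself."""
    return exam_score >= PASSING_SCORE


@dataclass
class Person:
    age: int
    street: str
    country: str
    exam_score: int  # hypothetical field


@dataclass
class Student:
    age: int
    street: str
    country: str
    is_accepted: bool


def to_student(person: Person) -> Student:
    # Pure data passing; the one decision is delegated.
    return Student(
        age=person.age,
        street=person.street,
        country=person.country,
        is_accepted=calculate_is_accepted(person.exam_score),
    )


def test_acceptance_rule_around_the_threshold():
    assert calculate_is_accepted(PASSING_SCORE)
    assert calculate_is_accepted(PASSING_SCORE + 1)
    assert not calculate_is_accepted(PASSING_SCORE - 1)
```

The test now fails only when the rule changes, not when a field gets added to or removed from Student.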

We are sometimes incentivized to get more coverage, and we look at code that transforms data as a quick win. In fact these tests won’t help us in the future, and will just cause code lock-in, prolong the test run time and cost us maintenance work. As long as you have other ways of catching these mistakes, you can probably skip data transformation tests.

