Integration Testing with Spring – Mocking

Gil Zilberfeld explains how to configure Spring with mocks for integration tests.
This is a short series on how to use Spring in integration testing and unit testing.

Testing a REST API

Let’s continue where we’ve left off – multiple configurations for integration tests.

We use different configurations when we need to inject different sets of objects: for example, a real instance and a mocked one of the same type, for different integration test purposes.

Let’s say we have a REST API that calls some component logic, which then calls the database through a DAO (data access object).

In the first set of test scenarios, I want to mock the database and test that the logic component works correctly (similar to an isolated unit test, but through the API). In other scenarios I want to make sure the entire flow works, all the way to the database. So I’ll need two separate configurations: one that injects a mocked DAO component, and one that injects the real one.

Note that managing configurations takes some work. We usually don’t have a configuration per test class, so that means we create a test configuration that serves multiple integration tests. The configuration classes need to be maintained and kept light, so they will fit every consumer.

Configuration pitfalls

Let’s look at the configuration from last time for the mock.
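The snippet itself isn’t shown here; going by the description below (MockInjectionConfiguration, a mockStudent bean, behavior on getName()), it was a @Configuration class along these lines. This is a sketch; the stubbed return value is assumed:

```java
@Configuration
public class MockInjectionConfiguration {

    @Bean
    public Student mockStudent() {
        Student mockStudent = Mockito.mock(Student.class);
        // The behavior is baked into the configuration itself
        Mockito.when(mockStudent.getName()).thenReturn("Luke");
        return mockStudent;
    }
}
```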

Spring injects any object once on startup by default. That means that integration tests share these mocked instances. That’s an issue we need to understand.

The mockStudent bean that MockInjectionConfiguration injects already has its behavior set. When the first integration test runs, Spring injects it as written.

But when a second integration test tries to set behavior by using Mockito.when on getName(), it will add to the existing behavior, not override it. And when we’re using Mockito.verify(), oh the laughs we’ll have…

The solution is a convention for writing the configuration. A better way is to define it like this:
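A sketch of that configuration, based on the description that follows (a bare mock, no preset behavior):

```java
@Configuration
public class MockInjectionConfiguration {

    @Bean
    public Student mockStudent() {
        // A plain mock with no behavior; each test sets its own
        return Mockito.mock(Student.class);
    }
}
```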

The injected object is a plain basic mock. Let the integration tests define their own behavior. That means that any integration test can assume that it’s starting from scratch.

But assumptions are for fools. We need to make sure the assumption is correct, and that means that we need to reset the mock manually. For example, we can use Mockito.reset() in a setup method:
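A minimal sketch of such a setup method (the field name is assumed):

```java
@Autowired
private Student mockStudent;

@Before
public void resetMocks() {
    // Clear stubbing and recorded invocations left over from previous tests
    Mockito.reset(mockStudent);
}
```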

Without this, mocks can seem to behave erratically. As in, they behave exactly as we tell them, but not how we expect.

But even that may be too much work for some of us.

If we’re lazy and want to avoid that, Spring Boot can do it for us, if we declare the injection in the test with @MockBean instead of @Autowired. With @MockBean we don’t need a @Configuration class to inject the mocks; Spring automatically injects a fresh mock of the object with every test. For further setup of the mock, you can use either the @Before method or the tests themselves.
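A sketch of what that looks like (JUnit 4 with Spring Boot assumed; the class and test names are made up):

```java
@RunWith(SpringRunner.class)
@SpringBootTest
public class StudentAPITest {

    // Spring Boot injects a fresh mock for every test; no @Configuration needed
    @MockBean
    private Student mockStudent;

    @Test
    public void returnsTheStudentName() {
        Mockito.when(mockStudent.getName()).thenReturn("Luke");
        // ...call the API under test and assert on the result...
    }
}
```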

One more thing to remember: using beans in the production code makes them easy to use in integration tests. That ease of use comes with a price in speed.

Even if your tests don’t use mocks or injection, the code-under-test may. Spring is slow both in ramp-up and at run-time. If you write unit tests for the code, make sure it is free of beans, and inject the dependencies manually. “Regular” unit tests (the ones that don’t use Spring for injection) run much more quickly. It also makes sense to locate them apart from the integration tests so they can run separately.

In the final part, we’ll see how we set up a test for a REST API that calls a mock internally.

Integration Testing with Spring – Configurations

Gil Zilberfeld talks about configuration for mocking in integration tests and unit testing.
This is a short series on how to use Spring in integration testing and unit testing.

Testing a REST API

First, a couple of words about Spring in general as a dependency injection framework. One of the best things in Spring is its simplicity of injection. Regardless of where you are, you pop an @Autowired annotation on a class variable (which could be the test class), and it’s ready for injection. Since it’s injecting the same instance regardless of class, setup for mocks is easy. And regardless of how many layers away the object will be injected, the test has access to it, and doesn’t need to pass it between layers, which requires less code. That is always a good thing.

On the registration side, with @Configuration classes, you can configure exactly and easily what to inject. Spring Boot lets you do a bit more, easily bootstrapping a project, along with extra help for unit tests and integration tests.

All these make it easy to write (and abuse) integration tests. Let’s take a look at a couple of scenarios for integration testing.

For our first example, we have a REST API that internally calls a dependency. When integration testing APIs like this, we usually go all the way to the back end, but even then we might want to mock something there, mostly to control the behavior of our integration test.

First, I want to inject a real object I create. In a @Configuration class I put under the test folder, I create this configuration:
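A sketch of such a configuration (the class name, and the way Student is constructed, are assumed):

```java
@Configuration
public class RealInjectionConfiguration {

    @Bean
    public Student student() {
        // A real instance we create and control
        return new Student();
    }
}
```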

Then in the integration test I’ll use the @Configuration class:
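Something along these lines (the test class name is assumed):

```java
@RunWith(SpringRunner.class)
@ContextConfiguration(classes = RealInjectionConfiguration.class)
public class StudentInjectionTest {

    // The real instance defined in the configuration is injected here
    @Autowired
    private Student student;
}
```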

That is more of an explanation of how Spring works and, obviously, not that useful on its own. In actual integration testing scenarios I use this to inject POJOs into internal layers, or to inject objects that will call dependencies I mock.

If I want to inject a mock instead (let’s say Student wasn’t a POJO), I’ll use a different @Configuration class:
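A sketch of such a class (the stubbed value is assumed):

```java
@Configuration
public class MockInjectionConfiguration {

    @Bean
    public Student mockStudent() {
        Student mockStudent = Mockito.mock(Student.class);
        // Initial behavior set in the configuration; tests can add more
        Mockito.when(mockStudent.getName()).thenReturn("Luke");
        return mockStudent;
    }
}
```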

And then I can use it in a test:
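For example (the test class name is assumed):

```java
@RunWith(SpringRunner.class)
@ContextConfiguration(classes = MockInjectionConfiguration.class)
public class MockedStudentTest {

    // The same mock instance defined in the configuration
    @Autowired
    private Student mockStudent;
}
```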

Now that it’s in the test, I can add mocking behavior.

That’s the basic @Configuration stuff. We’ll continue the discussion on configurations and mocking next time.

How TDD Can Conquer The World (And Why It’s Unlikely To Happen)

TDD is unlikely to win

He said: “I asked all my friends, and none of them likes TDD”.

This one I haven’t heard before, although I suspect I should have at some point. Like any practice, TDD has a social side.

I told him to find new friends, which he conveniently ignored. He then continued: “We, developers, want to move forward, build software. TDD slows us down”.

That’s true, TDD slows us down, because it forces us to think, which is important in software.
Why is that something that needs explaining? Over and over again?

Wash Your Hands

Uncle Bob uses the “doctors washing their hands” metaphor for explaining TDD. Today it’s commonplace for doctors to wash their hands before handling patients, but was that always the case? It’s not like all doctors, all over the world, switched to washing their hands one day. It was a continuous process that probably reached a critical point when doctors realized it’s good for their patients.

Of course, there was opposition to the idea. You’d probably hear “I need to take care of my patients, this hand-washing thing just slows me down”.

Yet hand-washing has crossed the chasm.
What would it take for TDD to get to a “washing hands” status? Thou must count to three.

Three shall be the number of the counting

  • Education

When people die, that’s a big impact. When people are saved, and live, and catch fewer diseases, that too is a big impact. When we see the correlation (or let’s say, when we’re convinced there is one), we see motivation to change. However, we don’t always see the impact of getting quality software out the door. So we need to be educated.

It’s leadership’s job to communicate this correlation. Not only that, management needs to understand the business impact of technical debt, so developers make sure the code provides the needed business benefits. Those benefits also translate into regular working expectations.

  • Regulation

Next is regulation. I’m sure hand-washing opponents wouldn’t have just jumped on board without proper external incentives. As in, “you do it, or you won’t work here again”.

What’s the chance of that happening with software practices? Pretty low. As I’ve written before, I believe the software business is going to be regulated at some point, and badly at that. But I don’t see how regulating software development techniques will actually make them work. Regulated TDD, for example? First the regulator needs to understand it, then it needs a way to see that developers are conforming to it. Coverage, you say? Good. And then the gaming is on. I don’t see that happening soon, though we’re getting close.

Regardless, external forces saying “do this or you don’t work again” can have that effect.

  • Social pressure

That means swaying the crowds toward good development practices. It is taking place, although at a glacial pace that is really hard to see. Without proper education and/or regulation, developers are incentivized to build quickly, regardless of quality. Without proper training that quality is part of the work, this ain’t gonna happen. Developers will continue to say “TDD slows us down”. The more people saying that, the fewer people will pick it up and run with it.

So the happy path to “TDD taking over the world of software” requires internal, external and social incentives.

Aligning stars is easier. Or maybe it will just take a very long time. And we need to work at it, doing the hard work, day in, day out.

Actual hard work.

Nah, nobody likes that.

Unit Testing Anti-Pattern: Leaky Mocks and Data

Unit testing anti-pattern: Leaking
This series goes through anti-patterns when writing tests. Yes, there are and will be many. 
  • TDD without refactoring
  • Logic in tests
  • Misleading tests
  • Not asserting
  • Code matching
  • Data transformation
  • Asserting on not null
  • Prefixing test names with "test"

Unit tests should be isolated from each other. That means it shouldn’t matter whether they run in a specific order, alone or in a group; we expect a consistent result. The only reason for a test to fail should be a change in functionality.

However, things get tricky if the code, or the tests have dependencies that we can’t get rid of. The tests are no longer unit tests by definition, but if they are valuable, we want to keep them. We still need to take care of the leaks between tests. The anti-pattern is not doing so.

Leakage is not just a beginner problem. In large organizations, we start testing bigger flows rather than small classes, and “other people” start writing tests for areas of code that were initially written by “us”. Our assumptions about what clean-up means may no longer work, or worse, may not be known.

When we encounter these symptoms, they usually point to organizational dysfunctions. Conway’s law works in mysterious ways.

Let’s take mocking for example. In isolated unit tests, we create the mock, configure it, and it dies at the end of the test. Then we create another one for the next test. The tests take care of the clean-up.

The “taking care of” isn’t really something the tests do. It’s just instance management: if we create the mock inside a test, it dies at the end of that test. If we create it in the @Before method (or the equivalent in another framework), the instance used in the last test gets overwritten in the current one. In a language like C++, without automatic memory management, we may still see the same effect, although the memory doesn’t just go away.
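As a sketch, in plain JUnit 4 (names assumed):

```java
public class StudentLogicTest {

    private Student mockStudent;

    @Before
    public void setUp() {
        // A fresh mock per test; the previous instance is simply discarded
        mockStudent = Mockito.mock(Student.class);
    }

    // ...tests use mockStudent, knowing it starts from scratch...
}
```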

It seems that we get isolation for free, since we’re not doing anything special with the mocks to get the isolation we crave.

Don’t get used to those freebies

Let’s say we use Spring for dependency injection. Now we don’t create the mocks manually; they just appear out of thin air. We assume that isolation is also taken care of, out of thin air. But if the mock is injected once, it carries its expectations and behaviors from previous tests, unless it gets reset (e.g., with Mockito’s reset() at clean-up time).

Mocks are not the only things that need leakage therapy.

As we’re testing flows with external data – database, cache, files, the registry – we need to make sure that data doesn’t leak to the next tests. We need to clean up after the test has completed.

Using Spring’s @Transactional in tests works, but maybe not well enough. What about setup data? That needs to be cleaned up before moving to the next test.

And still that may not be good enough.

What if the test crashes mid-way? Or has some unexpected behavior? We need to be sure we clean up correctly in every case. By the way, being sure is not that simple; we need to know how all the code behaves, from now until forever.

The other option is for each test to make sure it cleans everything it needs before starting. That’s about as simple as the other method.

Some of these things can be automated: base classes that contain these behaviors, clean-up scripts, etc. It comes down to “how the team works”. It assumes that the way we write code and tests is known and understood by everyone. This assumption is hardly true in most teams, and gets broken over large codebases.

When codebases break, so do the tests.

The successful path to overcoming these problems is knowledge sharing and practice policies. Design reviews, pair programming, code reviews: the things that help create a working development process. If everyone finds their own way to clean up the leaks, it pretty much guarantees that someone, somewhere, will assume that “it should work like this”. Or worse, copy the method without understanding what stands behind it. And then they start noticing weird test behavior.

Leaks should be stopped, but defining how is just the first step. The next is making these methods commonly used.


Unit Testing Implementation: The Plan

This series deals with the implementation of a unit testing process in a team or across multiple teams in an organization. Posts in the series include:
  • Goals
  • Outcomes
  • Leading Indicators I
  • Leading Indicators II
  • Leading Indicators III
  • Leadership I
  • Leadership II
  • The plan

So far, we’ve talked about the process itself, our goals and expectations, what to look for while we’re moving forward, and now it’s time we get to the good stuff.

What does an implementation plan actually look like? A good plan includes these elements:


Remember that when we start, we already have a core team, usually just one, who learned the ropes all by themselves. While they can be great ambassadors or mentors, they are usually not trainers. They know what they’ve encountered, and that is usually much less than skilled practitioners and trainers who’ve seen lots of code and tests.

The other teams, the people who start from scratch, need context, focus and the quickest ramp up in order to get started. In my introductory courses, I introduce tools as well as effective practices of testing – planning, writing, maintaining, working with legacy code, etc. In addition, I expose them to design for testability and TDD. The courses are hands on, so people can practice the different topics.

Environment preparation

Apart from having the tools available on the developer machines, we need a CI server that’s configured to run the tests and report the test run results. We’d also like to have project templates (maven archetypes, makefiles, etc.) available so people won’t need to start from scratch.

All dependencies (libraries, tools, templates, examples) should be available in a central repository. On day one we want people to start committing tests that are run and reported. We don’t want to have them bump into environmental problems and extinguish their motivation.


Coaching sessions

These are sessions (1-2 hours each, tops) in which an experienced coach (either external or internal) sits with one or two people and helps them plan test cases, write tests, and review tests for things they are working on. This way we transfer the knowledge of testing, and start to create conventions of “this is how we test”. We focus on code that’s being worked on, making it testable and proving it.

If you start out with an external coach, it will scale only up to a point. The idea is to start with a small group that can later mentor new people in a viral way. The ambassadors from the pilot stage can and should support that process.

Communities of practice

We want to continuously improve the way we test, discuss and share our experiences. As we’ve already discussed, there should be forums for discussing and practicing testing. That means we need scheduled time, when people are encouraged to attend, talk about what they did, and learn from others. Test reviews, refactoring together, learning patterns – these meetings breed stronger developers.

These CoP meetings are opportunities to discuss the metrics and goals, and to adapt if help is needed. They are engines for learning and improvement. They also send the message from management that testing is important. As time goes by and fewer coaching sessions are needed, the CoP takes over as the main teaching and mentoring tool.

There you have it. In the next posts, I’ll go through a case study of deploying a unit testing plan.




Leadership in Unit Testing Implementation, Part II

This series deals with the implementation of a unit testing process in a team or across multiple teams in an organization. Posts in the series include:
  • Goals
  • Outcomes
  • Leading Indicators I
  • Leading Indicators II
  • Leading Indicators III
  • Leadership I
  • Leadership II
  • The plan

We talked about management attention and support, and there’s more leaders can do to help us make the process work.

Remember those leading indicators? They don’t collect themselves. If we think about those indicators as a feature, there are customers waiting for them. Management and leaders are those customers.

Turns out the people who care about the metrics are the ones who have the power to facilitate their collection, demand the reports, analyze the patterns and ask for correction plans. Who knew?

The funny thing is that the collection of the metrics is a leading indicator by itself. If the process of metric collection, analysis and feedback is in place, chances are the process will go well, since somebody’s supporting it. If it isn’t (for any combination of reasons), people see that the process is “not that important to managers”, which quickly translates to “it’s not that important to us, and me”. Even if people care about unit testing, they care more about making sure they don’t get caught on other things that management does care about. Safety first.

Embracing change

Even when the process does go well, an anti-pattern may appear: sticking to the initial plan, rather than changing course over time, which is what we expect leaders to do.

An example of that is around our favorite metric: management asserts that coverage should increase over time. Since our leaders are wise, they don’t state a minimum coverage threshold, just an indication that coverage is growing. And we’re not covering old code, just new code. Looks innocent enough.

What happens next will surprise you (not). To keep the coverage increasing, people are encouraged to add tests to their code. But since our developers are also wise, they don’t write tests for code where unit tests don’t make sense (like data transformation, or auto-generated code).

So the coverage doesn’t rise. Alas, if that’s most of their code, they don’t get “coverage points”, or in some measurement systems, they lose some. Remember safety first? Let the gaming begin. Developers may add tests that are ineffective (or even harmful), just to satisfy the metric.


The only way to make sure this doesn’t happen is through retrospection, analysis and feedback. I’ve already said that as stakeholders, leadership can make sure these take place. But should they do all that by themselves?

Here comes the next way leadership helps the process: creating forums for the learners to learn from peers and grow internal experts. We call those communities of practice, where practitioners discuss, review, analyze and get feedback, learning from the successes and mistakes of others.

The chance of these forums being created, and continuing to take place over time, is much higher with management support. In these forums, the discussion takes two forms: the tactical level (how to write better tests, refactoring methods, etc.) and the strategic level (looking at the process itself and the metrics, suggesting course corrections, and following up on implementation).

Now, the more authority we give the communities, the better. We like self-organizing and self-managing teams. They will still need leadership to create them, keep them going, and help when they need external resources and effort.

Combine all these leadership support methods, and we’re on our way to a successful implementation.

Leadership in Unit Testing Implementation, Part I

This series deals with the implementation of a unit testing process in a team or across multiple teams in an organization. Posts in the series include:
  • Goals
  • Outcomes
  • Leading Indicators I
  • Leading Indicators II
  • Leading Indicators III
  • Leadership I
  • Leadership II
  • The plan

Any process we start to roll out requires management support. If we want it to succeed, anyway.

Inside teams, if the team leader opposes the new process, she will work against it, either actively or secretly. If she’s for it, she’ll mandate it. When the team is independent enough and can make its own decisions, the team leader will approve of the team making those decisions, and facilitate the team’s success.

When we’re talking about multiple teams, and cross organization processes, it’s not even a question. Not only do we need to make sure the new processes take hold, sometimes we need to make more resources available if they are not there to begin with.

Think about an organization moving from one team writing tests to multiple teams. We need to support all of them on the IT level (enough resources and environments to run the tests), at the branch management level (who works on which branch and when the changes move to the trunk), at the automation level (optimizing build performance), and coordination level (what happens when the build goes red).

To make a (very) long story short, it takes management time, attention, and a lot of pushing (nicely) to allow the process to take effect.

Oh, and there’s one more thing leaders need: Patience.

Regardless of how simple the process is (and unit testing is definitely not simple), patience is a prerequisite. Any process implementation takes time, and we usually see the fruit of our labor way down the line. Add to that a steep learning curve, and the recipe for impatience is complete.

The learning process in unit testing seems short, if you focus on learning the tools. But until people write regularly and effectively, and see the benefits, it takes weeks, usually a lot more.

If there are constraints and conflicts, it will take even longer. Consider a team working inside a legacy code swamp, with a closed architecture they aren’t allowed to change. Their ability to change the code is constrained, and therefore their ability to write tests is constrained. That means fewer tests written, and often less effective tests at that (you test what you can, regardless of importance).

Expecting coverage to rise under these conditions, and even more so the number of effective tests to increase, is bound to crash into reality. With failed expectations come disappointment, maybe an angered response, losing faith in the capabilities of the developers, and often calling it quits, way too early.

We’ll continue to discuss what else leaders can do in the next post.


TDD: Mind Your Language


One of the exercises I love to do in my TDD classes is to build a lightsaber in TDD. (Yes, of course that’s how they’re made).

In the exercise, I go through listing all kinds of features and use cases, and the first test we usually write is for turning the lightsaber on. Most times it looks like this:
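The test usually looks something like this sketch (method and type names are assumed, following the description below: a getStatus method returning an enum):

```java
@Test
public void getStatus_WhenTurnedOn_ReturnsON() {
    Lightsaber lightsaber = new Lightsaber();
    lightsaber.turnOn();
    assertEquals(Status.ON, lightsaber.getStatus());
}
```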

It’s got a weird-ish name, and it’s probably not the first test I’d write (I’d test the off status after creation first). But that’s not my peeve.

If you’re coming from a development background, like me, a getStatus method on a lightsaber seems perfectly ok. After all, how else would you check if the lightsaber is really on?

There are two issues with this. The smaller one is the premature generalization: Using an enum as a result. Sure, if you’re using a getStatus method, you want to return a meaningful value. Yet, if you TDD like a pro, the tests should tell you when to generalize, like after you have a couple of return values.

But I’ll let that one slide, there’s a bigger issue here.

Talk this way

Did you ever hear a Jedi master, or a Sith lord ask: “What status is that lightsaber? I need to get it”.

No. They don’t talk like that.
Regular humans don’t talk like that.

The only people who talk like that are programmers.

Chances are, if you’re a programmer and you’re still reading, you probably still don’t see the problem. Let me spell it out: We’re coding using terms that are different than the ones used in the business domain.

This results in maintaining two models, the business one and the code one (and sometimes a test model), all either using different terms or not carrying the exact same meaning. So we need translations. Translations bring mistakes with them (read: bugs).

As time goes by, and we add more code, the two models diverge. Making changes (say, a new requirement) needs more effort, and is more risky. We need to continuously update the business model, and re-translate into the code model. That is hard and error prone.

If we don’t put effort into it, the difference between the models grows. We breed complexity by using two languages, and we pay for it in a lot more effort.

What’s a better way? Well maybe something like this:
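For example (again, names assumed):

```java
@Test
public void lightsaberIsOnAfterIgniting() {
    Lightsaber lightsaber = new Lightsaber();
    lightsaber.ignite();
    assertTrue(lightsaber.isOn());
}
```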

Now that’s Jedi talk. You can already feel the Force flowing through it.

Leading Indicators in Unit Testing Implementation, Part III

This series deals with the implementation of a unit testing process in a team or across multiple teams in an organization. Posts in the series include:
  • Goals
  • Outcomes
  • Leading Indicators I
  • Leading Indicators II
  • Leading Indicators III
  • Leadership I
  • Leadership II
  • The plan

In the last post we talked about the failing-builds trend as an indicator of implementation success.

The final metric we’ll look at, that can indicate how our process will go, is also related to broken builds. It is the “time until broken builds are fixed” or TUBBF. Ok, I made up this acronym. But you can use it anyway.

If we want to make sure the process is implemented effectively, knowing that builds are broken is not enough. When builds break, they need to be fixed.

No surprise there.

Remember that the long-term goal is to have working code, and for that we need people to be attentive, responsive, and quick to fix broken builds. Tracking the TUBBF can help us achieve that goal. We can infer how well people understand the importance of working code by looking at how they treat broken builds.

Sharing is caring

One of eXtreme Programming’s principles is shared code ownership, where no single person is the caretaker of a piece of code. When our process succeeds, we want to see the corollary: everyone is responsible for every piece of code.

With small teams it’s easier to achieve. Alas, with scale it becomes harder. Usually teams specialize in bits of code, and conjure up the old demon of ownership. With ownership comes blame, and the traditional passing of the buck.

After all, our CI log says it right there: they broke the build, by committing code that doesn’t have any resemblance or relation to our code. We can’t and won’t fix it. They broke it. They should fix it.

Then comes the next logical conclusion: if we didn’t break the build, we can continue to safely commit our code. After all, we know our code works, we wrote it.

And so, every team blames the other team, committing unchecked changes and the build remains red.

(By the way, maybe they committed the last bit that broke the build, but that doesn’t mean their changes were at fault. If a build takes a long time, changes are usually collected until it starts, and the CI flags only the last commit, although that last one may be innocent.)

Everybody’s Mr. Fix-It

One of the drastic measures we can take is to lock the SCM system when the build breaks. That’ll teach them collective ownership.

But that doesn’t always work. People just continue to work on local copies, believing that somebody else is working relentlessly, even as we speak, on fixing the build.

Another option is to put the training wheels on. Train a team to keep the build green without interference from other teams, by developing on team-owned branches. We track the team’s behavior on their branch, encouraging them to fix the build. They are responsible for keeping the build on their own branch green. Only when branch builds are stable and green is it OK to merge them to trunk.

The worst option, and I’ve seen it many times, is having someone else be the bad cop.

Imagine an IT/DevOps/CI master who starts each day checking all the bad news from the night, tracking down the culprit, and making them, but mostly begging them, to make amends. Apart from not making the team responsible for their code, it doesn’t stop others from committing on top of the malfunctioning process.

As long as we can track the TUBBF in some manner, we can redirect the programmers’ behavior toward a stable build, and teach the responsibility of keeping it green. As we do this, we focus on the importance of shared responsibility, and collect a bonus: working, sometimes even shippable, code.

Leading Indicators in Unit Testing Implementation, Part II

This series deals with the implementation of a unit testing process in a team or across multiple teams in an organization. Posts in the series include:
  • Goals
  • Outcomes
  • Leading Indicators I
  • Leading Indicators II
  • Leading Indicators III
  • Leadership I
  • Leadership II
  • The plan

Part I was HUGE! Now, let’s look at broken builds. We want to see a decrease in their number over time.

This may sound a bit strange. Our CI is supposed to tell us if the build breaks, that’s its job. Isn’t having more of them a good thing?

Unsurprisingly, the answer is “it depends”.

As we want the earliest feedback, the answer is “yes, of course”: the CI system serves us well. However, if we don’t see a decrease in broken builds over time, that may mean the CI process is not working effectively. We should investigate.


Let’s trace the steps leading to failing builds and see if we can improve our process.

Are all the tests passing locally? If not, we’re integrating code that fails tests into the trunk, and when those tests run in the CI build they will probably fail there too. That’s a big no-no. We may even find out the tests are not run locally at all, and we’d want to improve on these behaviors.

If tests do run and pass locally before they are committed, there might be another problem. That may point to isolation issues. Tests that depend on resources available in the local environment find them there and pass; at the CI stage, those resources are missing and the tests fail. More broken builds indicate the team has not yet learned how to write isolated tests.

There might even be a bigger issue lurking.

Trust and accelerate feedback

We want to trust the tests in the CI environment, but since they “work on our machine” and not on CI, they just got a trust downgrade. This can have a weird counter-effect on the way we run them.

Since the results we trust run on CI, and local runs create confusing results, we may stop running tests locally at all, and run them only on the CI server, where we know they run correctly. When we do that we make our feedback cycle longer, and more importantly, we risk tests that fail for the right reason holding the rest of the team hostage until they are fixed.

To get the right feedback early, we need to get back to running tests locally.

We want to increase the number of isolated tests, so they can be run locally and can be trusted when they fail on the CI server. Isolated unit or integration tests failing before committing are the first line of defense.

Then, we want to be able to run the non-isolated tests either locally or in as clean an environment as we can manage. The point is not to commit code until we trust it. This may require changing available environments, modifying the tests to ensure cleanliness, pre-commit integration, or any combination of those.

Can you believe all these improvement opportunities come from a single indicator? The deeper we dig and the more questions we ask, the more opportunities we find for improving the process as a whole.

We’re not done yet.