All posts by smspillaz

About smspillaz

Software Engineer and Startup Founder

Let your tests tell you when your code design sucks.

Google’s testing on the toilet blog is a really great read for people who are looking at getting into test-driven development and unit testing as a development methodology. Their most recent post called “Don’t overuse mocks” makes an important point – overmocking in your tests makes your tests less readable.

For the uninitiated, “mocking” in the unit-testing world is basically where you replace an interface with a “mock”, whose behaviour can be specified at runtime. If the code that you’re trying to get under test is dependent on some behaviour from an interface provided to it, then the mocking framework effectively allows you to specify that behaviour so that you can make your code do certain things.

For example:

class Animal
{
    public:
        virtual bool feet () const = 0;
};

class Seal :
    public Animal
{
    public:
        Seal ()
        {
             makeLotsOfNoise ();
        }

    private:
        virtual bool feet () const { return false; }
};

bool canJump (Animal const &animal)
{
    return animal.feet ();
}

We want to test that canJump will return false if the animal has no feet and will return true if animal has feet. We don’t want to depend on Seal, because its constructor makes lots of noise. Also, Seal might get feet in the future and that would make our test invalid. So we use a mock:

class MockAnimal :
    public Animal
{
    public:

        MOCK_CONST_METHOD0 (feet, bool ());
}

TEST (Main, JumpIfFeet)
{
    MockAnimal animal;
    ON_CALL (animal, feet ()).WillByDefault (Return (true));
    EXPECT_TRUE (canJump (animal));
}

TEST (Main, NoJumpIfNotFeet)
{
    MockAnimal animal;
    ON_CALL (animal, feet ()).WillByDefault (Return (false));
    EXPECT_FALSE (canJump (animal));
}

The example in “Don’t overuse mocks” is a bit more involved. It involves the use of mocking to control the interaction between a CreditCardServer, TransactionManager, Transaction, Payment and a PaymentProcessor. Here’s the example of how its’ done:

public void testCreditCardIsCharged() {
  paymentProcessor = new PaymentProcessor(mockServer);
  when(mockServer.isServerAvailable()).thenReturn(true);
  when(mockServer.beginTransaction().thenReturn(mockManager);
  when(mockManager.getTransaction().thenReturn(transaction);
  when(mockServer.pay(transaction, creditCard,
                      500).thenReturn(mockPayment);
  when(mockPayment.isOverMaxBalance()).thenReturn(false);
  paymentProcessor.processPayment(creditCard, Money.dollars(500));
  verify(mockServer).pay(transaction, creditCard, 500);
}

[some of the variable names have been changed for the sake of line-wrapping]

Its quite clear that the usage of mocks in this case is used to describe a multi-step process:

  1. The server must be available (say it is)
  2. We need a TransactionManager from the server (return our mock one)
  3. We need a Transaction from the TransactionManager (return our mock one)
  4. We need a payment from the TransactionManager (return our mock one)
  5. We need to know if the payment is over balance (say it isn’t)

If all of those conditions are satisfied, then the payment should be processed on the server. The fact that we had to wriggle so much to get the paymentProcessor to call pay () results in a somewhat fragile test. If we added a new condition unrelated to what we are testing here then we also need to change the behaviour of the mocks otherwise that test will fail. This is the peril of “over-mocking” – you’ve removed a dependency on some behaviour, but you need to re-implement the rest yourself.

Testing on the Toilet’s solution to this problem is “don’t use mocking for stuff like this”.

public void testCreditCardIsCharged() {
  paymentProcessor = new PaymentProcessor(creditCardServer);
  paymentProcessor.processPayment(creditCard, Money.dollars(500));
  assertEquals(500, creditCardServer.getMostRecentCharge(creditCard));
}

And to some extent, they’d be right. But I think its treating the symptom (the test smell) as opposed to the problem (the code smell).

There are three real problems here:

  1. paymentProcessor.processPayment does too much. It needs to ask if the server is running, ask for a transaction manager, ask for a transaction, ask for a payment and ask if the payment is over balance.
  2. paymentProcessor.processPayment is also asking too much. It should be told what do rather than just being provided some interfaces and figuring out what to do.
  3. testCreditCardIsCharged is too vague – what conditions exactly are we testing that the credit card is charged? Are we just testing that it is charged? Are we testing that it is charged when the payment server is available? Are we testing that it is charged when the payment is over balance? In reality, they’re testing that a series of steps results in success rather that testing that the system behaves as expected under particular conditions.

How can we fix this?

Well, first of all, we should figure out what processPayment is doing and whether or not it can be split up. We don’t have the code, but judging by the mock’s behaviour – it does the following:

  1. Checks if the server is running
  2. Gets a payment object for the amount requested
  3. Checks if the payment is over-balance
  4. Calls pay () on the server

Steps 1 & 2 and 3 & 4 can really be split up into two separate functions. We’re going to make one method get our payment () object from the server and another method responsible paying if appropriate. So we have:

  1. Preconditions for being able to pay at all (server running, get objects)
  2. Preconditions for being able to pay the amount requested (not over balance)
  3. Postcondition (payment succeeded)

So lets do that:

public class PaymentProcessor
{
    public Transaction prepareTransaction (CreditCardServer server)
    {
        if (!server.isServerAvailable ())
            return null;

        return server.beginTransaction ().getTransaction ();
    }

    public Payment preparePayment (Transaction transaction,
                                   Money money)
    {
        return new Payment (transaction, money);
    }

    public void processPayment (CreditCardServer server,
                                Transaction transaction,
                                Payment payment)
    {
        if (!payment.isOverMaxBalance ())
            server.pay (transaction, payment);
    }

Now, we have three distinct methods which each do different things. Now we want to test them.

The first thing we want to test is that a payment which is not over-balance is paid:

public void testPaymentPaidIfNotOverBalance ()
{
    Payment payment = new Mock <Payment> ();
    when (payment, isOverMaxBalance ()).thenReturn (false);

    paymentProcessor.processPayment (mockServer,
                                     stubTransaction,
                                     payment);

    verify (mockServer, pay (stubTransaction, payment);
}

One assert done, two to go. Lets check that a non-active server does not give us a transaction to play with:

public void testNoTransactionIfServerNotOnline ()
{
    Mock <CreditCardServer> server = new Mock <CreditCardServer> ();
    assertThat (paymentProcessor.prepareTransaction (server),
                IsNull ());
}

And now make sure that we get a payment object from our transaction

public void testPaymentObtainedFromTransaction ()
{
    StubTransaction transaction;
    assertThat (paymentProcessor.preparePayment (transaction,
                                                 Money.dollars (500),
                NotNull ());
}

Our client code is going to be a little bit more involved now:

Transaction transaction = processor.prepareTransaction (server);
Payment payment = processor.preparePayment (transaction,
                                            Money.dollars (500));
processor.processPayment (server, transaction, payment);

On the other hand we now know what the process is going to be within the client, and testing PaymentProcess is not so involved anymore, because each of its parts only perform a very specific task.

Also – in the client we don’t ask for anything. We just tell the processor to prepare a transaction, prepare a payment and pay. And in the tests, we don’t have to say “yes the transaction manager is running, yes, you can have a transaction” in order to make the part of the code responsible for paying if not over balance to work correctly. It just works correctly as long as you pass it some data it expects.

The moral of the story is: If you end up with fragile tests, go and have a look at your code. The tests will tell you if your interface sucks.

Regression Testing for crashes on Google Test

One common use for automated testing is to put tests in place that reproduce bugs to ensure that the same kind of bug can’t happen again once the failing test is fixed.

There is a small caveat when doing this with bugs that result in segfaults however – which is that a test that fails-by-crashing will cause the entire test binary to halt as well, meaning that you won’t see the results of any tests afterwards. Its also difficult to see what it was in the test itself that crashed at a glance without using a debugger of some sort.

Thankfully, google-test provides a method to test for things-that-shouldn’t-crash – by forking and checking the exit code of the forked process.

So my new best practice for writing these sorts of tests is now:

using ::testing::ExitedWithCode;

TEST(FooDeathTest, FooPtrFromBarIsSafe)
{
    std::unique_ptr <Foo> foo (from_bar ());

    EXPECT_EXIT({
        foo->baz = 1;
        std::cerr << "Success!" << std::endl;
        exit (0);
    }, ExitedWithCode (0), "Success");
}

Instead of:

TEST(Foo, FooPtrFromBarIsSafe)
{
    std::unique_ptr <Foo> foo (from_bar ());
    foo->baz = 1; // won't crash if foo is valid
}

There are three benefits to doing it this way:

Whoever is reading the test source can instantly see that the thing you’re testing is that some operation succeeds and exits cleanly

This is especially the case when EXPECT_EXIT ({ }, ExitedWithCode (0), “”) becomes somewhat of an idiom.

You can test the inverse – that is – you can test that something will crash with the same style:

using ::testing::ExitedWithCode;

TEST(FooDeathTest, FooPtrFromBarIsNotSafe)
{
    std::unique_ptr <Foo> foo (from_bar ());

    EXPECT_DEATH({
        foo->baz = 1;
        exit (0);
    }, "");
}

You get a much more informative error message at a first glance:

[RUN    ] FooDeathTest.FooPtrFromBarIsSafe
Failed expectation (lineno):
    Expected that the block:
    {
        foo->baz = 1;
        std::cerr << "Success!" << std::endl;
        exit (0);
    }
    Would exit with the message "Success" and an exit code of 0;
Actual:
    Exited with no message and an exit code of 9
[    FAIL] FooDeathTest.FooPtrFromBarIsSafe
(other tests run here)

As opposed to:

[RUN    ] Foo.FooPtrFromBarIsSafe
Segmentation Fault
(no other tests run here)

Impostor

I’ve been getting a bit of impostor syndrome lately, something which I feel is holding me back a fair bit. Not sure what it is, but it seems like every time I try to move on a moment of inspiration, I start to second-guess. Is what I’m doing a useful endeavor? Maybe someone else has already done this? Am I good enough to write this yet?

Of course, its terribly annoying – I’ve been left with a bunch of half-finished projects for a broader initiative, which seems to create a nasty feedback loop confirming the sense of unworthiness.

Perhaps this is the wrong place to ask this – but does anybody have a good idea on how to make these thoughts go away? I’d much rather be presenting cool things in this space than saying that I’ve talked myself out of yet another idea.

Moving forward

In the past few months I’ve been involved in a bit of a public disagreement about my relationship with the Ubuntu project, namely in terms of some work I was doing on Compiz and Unity back in December to February.

While I believed that my statements represented the truth at the time, in hindsight, I believe they were more or less a product of my frustration. I wanted to publicly apologize for that, and share a clarified timeline and sequence of events now that I’ve had a chat with a few of the people involved so set the record straight, and I also want to share about the plan that we’ve come up with to move forward.

I blogged in December that I had come up with a set of changes for both Compiz and Unity to take advantage of the new GLX_EXT_buffer_age extension. Importantly, these changes were substantial, required change across three key areas of the stack and required testing on several different hardware and driver combinations.

I had posted a set of changes up for review around that time, but the review process didn’t happen for a number of largely unrelated reasons. First of all, it was around the time of the December holidays, so people were just starting to take breaks. Second of all, there were a few staff movements within Canonical which meant that the number of reviewers available for that work was lower than it normally would have been.

In late February the decision was taken by the team that manages the Ubuntu distribution to branch Compiz for raring and only accept minor changes. I was consulted on that decision and agreed to it. I’ve been told that the rest of the team at Canonical working on Unity was also informed of this. I was unhappy at the time that no review of my changes occurred given the two month period, but I also accepted that we were close to feature freeze and it wouldn’t make sense to have any large changes in place then. At this point, I decided to stop working on those changes, instead with a view to pick them up again for 13.10.

Around the same time, a discussion about a rolling release took place on the public mailing list and at vUDS.

I posted in various comments that I was disappointed with the lack of review over the past two months, attributing it to the confusion caused by the rolling release and the apparently evident preference for a closed development model. These comments were mistaken. The real cause appears simply to be that there was a staff shortage, and not enough was done by me to draw the staff’s attention to these proposals.

The rolling release proposal said that this model should be adopted “now”, and many people on the list agreed. Adopting the “rolling release” for 13.04 would mean two things:

  1. 13.04 would become the “rolling release” and would not be “released” at its usual release point.
  2. With no release of 13.04, there would also be no feature freeze.

Whilst we were operating under that assumption and whilst that discussion was ongoing, I was told that my changes were likely to be merged in at some point in the near future. As such, the review process started again in spite of the previous agreement to keep large changes out of raring. This was because both the reviewers and myself believed that these changes were now appropriate for the release. My main error here was failing to contact the distribution team to check if there had been a misunderstanding.

In the meantime, we faced a number of high and critical regressions stemming from a small fix for java and a workaround for a hang in virtualbox. This didn’t do a whole lot for confidence for anybody. It just so happened that the regressions we kept on having were in parts of the code that were difficult to get any kind of automated testing around.

The review process ended again a week or two afterwards, citing that the release team was unhappy with having those changes for raring, because they would present a risk by breaking feature freeze for the 13.04 release, which was now happening. I was disheartened that this was the case and stated that on the merge review.

To be honest, I think my behavior here wasn’t very constructive. The better approach would have been to engage properly with the distribution team to devise proper test plans and QA strategies, and to clarify with everyone what the true status of all pending work was. That being said, things finally came to a head today. The distribution team and I have had a good long talk about everything, and we’ve come up with a plan to move forward and past this mess. The stupid conflict and a war of words by myself was entirely unnecessary in order to come to this resolution.

I want to make it clear that these difficulties were the fault of nobody. Nobody was negligent, nobody was inconsiderate (except perhaps myself) and nobody was acting in bad faith. These difficulties were the product of circumstance, which didn’t apparently manifest themselves in that way when I was looking at them.

It is anticipated that the relevant changes are going to be slated for review and testing for 13.10 at the beginning of the cycle. The test plan is going to include the usual test suite + ensuring that we don’t have any autopilot failures, as well as some extensive user testing to try and uncover any subtle bugs that may have been missed during the time that this work was QA’d.

Right now, I encourage everyone to test the PPA that Trevino has put up which contain the most up to date version of these changes sync’d with trunk for both nux, unity and compiz. Test it on both nvidia and non-nvidia hardware, since the changes affect both. File (tagged) bugs and I’ll get on to them when I can.

Lets make 13.10 rock!

Integrating vera++ with CMake

I’ve been looking into making automated code style checks part of a broader QA plan for a project I’ve been working on while its still relatively young. Astyle seems to be good at handling lots of different languages but there isn’t any mode for it to error out when it sees something in the code that it thinks is wrong. I like the idea of automated formatting, but sometimes for complicated expressions and macros, automated formatters can get things wrong. I wanted something that required manual intervention, but could still be run on an automated basis. And specifically designed for C++.

I noticed that Mir was using vera++ in order to do something like this, but there wasn’t any kind of automated build rule for running these checks on a continuous basis. I wanted to ensure that we could run checks on the files that were actually built so that style failures were errors actionable when the target was built. I also wanted to ensure that we didn’t do style checks on things like autogenerated or imported code that would be difficult to exclude using a find rule.

Finally, I realized that having these checks integrated with the buildsystem could be useful for lots of other projects too, so I wanted to make it something separate from what I was working on already.

As such, I’ve just posted the first bits of veracpp-cmake on my GitHub profile. veracpp-cmake is basically just two CMake macros to integrate vera++ into your build. First of all, it provides a FindVeraPP macro so that you can use it with find_package. It also provides some useful commands.

verapp_import_default_rules_into_subdirectory_on_target
verapp_import_default_transformations_into_subdirectory_on_target
verapp_import_default_profiles_into_subdirectory_on_target

Will import the default rules, transformations and profiles installed with vera++ into a project subdirectory (usually in the build-dir) just before some target is executed. This means that you don’t have to fork the default vera++ rules and can instead import them dynamically from the user’s installation. All these three commands are just wrappers around

verapp_copy_files_in_dir_to_subdir_on_target

Which provides a rule to copy files from on subdirectory to another just before a target is executed. Using this rule, you can copy your own rules, profiles or transformations into the same subdirectory as the default ones at build-time and then use both depending on which profile is set. Vera++ requires that all the rules, etc be in the same subdirectory of the same directory tree (eg, scripts / rules/ | transformations/, profiles /).

verapp_profile_check_source_files_conformance

This function does a lot of the grunt-work in terms of checking source code file compliance. You just provide it with a target, the path to vera++ scripts, a profile and whether or not style check failures should be fatal or nonfatal errors. Then, just before your target is linked, it will have its source files scanned for style errors. That way, this integrates nicely with IDEs which parse the output of make.

Hopefully this will be useful to other projects. You can import it as a git submodule and then adjust your CMAKE_MODULE_PATH and use it right away. The only caveat is that you need to both add it as a subdirectory with add_subdirectory () and also include it with include () because we need to build a small tool to set the exit status correctly when vera++ exits. I’ve filed a bug about this and hopefully I can get rid of this requirement soon.