XBMC on Wayland Compositors, take two

In late February this year, I published a proof of concept demonstrating the XBMC Media Center running on the Weston system compositor. It was basically a hack which used SDL’s existing Wayland support with a few additions required to make XBMC work. XBMC plans to drop SDL and use window systems directly, which makes a lot of sense, but it meant that this proof of concept would have to be largely rewritten.

As such, I’ve decided to do so as a Google Summer of Code project (surprise!), as a kind of “take two” – except that this version uses the Wayland client interface directly, (shouldn’t) crash, (should) have more comprehensive test coverage, and could perhaps be expanded to support other display systems like Mir and direct display on the framebuffer via kms/gbm.


I’ve just published a new pull request for an initial version of this work. Hopefully users will be able to use this (and perhaps much more at the end of the GSoC cycle).

Let your tests tell you when your code design sucks.

Google’s Testing on the Toilet blog is a really great read for people who are looking at getting into test-driven development and unit testing as a development methodology. Their most recent post, called “Don’t overuse mocks”, makes an important point – overmocking in your tests makes them less readable.

For the uninitiated, “mocking” in the unit-testing world is where you replace an interface with a “mock” whose behaviour can be specified at runtime. If the code that you’re trying to get under test depends on some behaviour from an interface provided to it, then the mocking framework effectively lets you specify that behaviour so that you can make your code do certain things.

For example:

class Animal
{
    public:
        virtual bool feet () const = 0;
};

class Seal :
    public Animal
{
    public:
        Seal ()
        {
             makeLotsOfNoise ();
        }

    private:
        virtual bool feet () const { return false; }
};

bool canJump (Animal const &animal)
{
    return animal.feet ();
}

We want to test that canJump will return false if the animal has no feet and will return true if the animal has feet. We don’t want to depend on Seal, because its constructor makes lots of noise. Also, Seal might get feet in the future, and that would make our test invalid. So we use a mock:

class MockAnimal :
    public Animal
{
    public:

        MOCK_CONST_METHOD0 (feet, bool ());
};

using ::testing::Return;

TEST (Main, JumpIfFeet)
{
    MockAnimal animal;
    ON_CALL (animal, feet ()).WillByDefault (Return (true));
    EXPECT_TRUE (canJump (animal));
}

TEST (Main, NoJumpIfNotFeet)
{
    MockAnimal animal;
    ON_CALL (animal, feet ()).WillByDefault (Return (false));
    EXPECT_FALSE (canJump (animal));
}

The example in “Don’t overuse mocks” is a bit more involved. It uses mocking to control the interaction between a CreditCardServer, a TransactionManager, a Transaction, a Payment and a PaymentProcessor. Here’s how it’s done:

public void testCreditCardIsCharged() {
  paymentProcessor = new PaymentProcessor(mockServer);
  when(mockServer.isServerAvailable()).thenReturn(true);
  when(mockServer.beginTransaction()).thenReturn(mockManager);
  when(mockManager.getTransaction()).thenReturn(transaction);
  when(mockServer.pay(transaction, creditCard,
                      500)).thenReturn(mockPayment);
  when(mockPayment.isOverMaxBalance()).thenReturn(false);
  paymentProcessor.processPayment(creditCard, Money.dollars(500));
  verify(mockServer).pay(transaction, creditCard, 500);
}

[some of the variable names have been changed for the sake of line-wrapping]

It’s quite clear that the mocks in this case are being used to describe a multi-step process:

  1. The server must be available (say it is)
  2. We need a TransactionManager from the server (return our mock one)
  3. We need a Transaction from the TransactionManager (return our mock one)
  4. We need a payment from the server (return our mock one)
  5. We need to know if the payment is over balance (say it isn’t)

If all of those conditions are satisfied, then the payment should be processed on the server. The fact that we had to wriggle so much to get the paymentProcessor to call pay () results in a somewhat fragile test. If we added a new condition unrelated to what we are testing here, we would also need to change the behaviour of the mocks, otherwise the test would fail. This is the peril of “over-mocking” – you’ve removed a dependency on some behaviour, but you have to re-implement the rest of it yourself.
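
To make that fragility concrete, suppose processPayment later gained a fraud check before paying – something entirely unrelated to whether the card gets charged. The mock-based test above would still have to grow extra stubbing just to keep passing. A hypothetical sketch (the fraud-checker interface is invented purely for illustration):

  // Hypothetical extra stubbing that testCreditCardIsCharged would need if
  // processPayment started consulting a fraud checker before calling pay().
  when(mockServer.getFraudChecker()).thenReturn(mockFraudChecker);
  when(mockFraudChecker.isFraudulent(creditCard)).thenReturn(false);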

Testing on the Toilet’s solution to this problem is “don’t use mocking for stuff like this”.

public void testCreditCardIsCharged() {
  paymentProcessor = new PaymentProcessor(creditCardServer);
  paymentProcessor.processPayment(creditCard, Money.dollars(500));
  assertEquals(500, creditCardServer.getMostRecentCharge(creditCard));
}

And to some extent, they’d be right. But I think it’s treating the symptom (the test smell) as opposed to the problem (the code smell).

There are three real problems here:

  1. paymentProcessor.processPayment does too much. It needs to ask whether the server is running, ask for a transaction manager, ask for a transaction, ask for a payment and ask whether the payment is over balance.
  2. paymentProcessor.processPayment is also asking too much. It should be told what to do rather than being handed some interfaces and left to figure out what to do on its own.
  3. testCreditCardIsCharged is too vague – under what conditions exactly are we testing that the credit card is charged? Are we just testing that it is charged? Are we testing that it is charged when the payment server is available? Are we testing that it is charged when the payment is over balance? In reality, they’re testing that a series of steps results in success rather than testing that the system behaves as expected under particular conditions.

How can we fix this?

Well, first of all, we should figure out what processPayment is doing and whether or not it can be split up. We don’t have the code, but judging by the mocks’ behaviour, it does the following:

  1. Checks if the server is running
  2. Gets a payment object for the amount requested
  3. Checks if the payment is over-balance
  4. Calls pay () on the server

Steps 1 & 2 and steps 3 & 4 can really be split up into separate functions. We’re going to make one method prepare our transaction from the server, another prepare the payment, and a third responsible for paying if appropriate. So we have:

  1. Preconditions for being able to pay at all (server running, get objects)
  2. Preconditions for being able to pay the amount requested (not over balance)
  3. Postcondition (payment succeeded)

So let’s do that:

public class PaymentProcessor
{
    public Transaction prepareTransaction (CreditCardServer server)
    {
        if (!server.isServerAvailable ())
            return null;

        return server.beginTransaction ().getTransaction ();
    }

    public Payment preparePayment (Transaction transaction,
                                   Money money)
    {
        return new Payment (transaction, money);
    }

    public void processPayment (CreditCardServer server,
                                Transaction transaction,
                                Payment payment)
    {
        if (!payment.isOverMaxBalance ())
            server.pay (transaction, payment);
    }
}

Now we have three distinct methods, each of which does one distinct thing, and we want to test them.

The first thing we want to test is that a payment which is not over-balance is paid:

public void testPaymentPaidIfNotOverBalance ()
{
    Payment payment = new Mock <Payment> ();
    when (payment.isOverMaxBalance ()).thenReturn (false);

    paymentProcessor.processPayment (mockServer,
                                     stubTransaction,
                                     payment);

    verify (mockServer).pay (stubTransaction, payment);
}

One assert done, two to go. Let’s check that a server which isn’t available does not give us a transaction to play with:

public void testNoTransactionIfServerNotOnline ()
{
    Mock <CreditCardServer> server = new Mock <CreditCardServer> ();
    assertThat (paymentProcessor.prepareTransaction (server),
                IsNull ());
}

And now let’s make sure that we get a payment object from our transaction:

public void testPaymentObtainedFromTransaction ()
{
    StubTransaction transaction = new StubTransaction ();
    assertThat (paymentProcessor.preparePayment (transaction,
                                                 Money.dollars (500)),
                NotNull ());
}

Our client code is going to be a little bit more involved now:

Transaction transaction = processor.prepareTransaction (server);
Payment payment = processor.preparePayment (transaction,
                                            Money.dollars (500));
processor.processPayment (server, transaction, payment);

On the other hand, we now know what the process is going to be within the client, and testing PaymentProcessor is not so involved anymore, because each of its parts performs only a very specific task.

Also – in the client we don’t ask for anything. We just tell the processor to prepare a transaction, prepare a payment and pay. And in the tests, we don’t have to say “yes, the transaction manager is running; yes, you can have a transaction” just to make the part of the code responsible for paying when not over balance work correctly. It just works correctly as long as you pass it the data it expects.

The moral of the story is: If you end up with fragile tests, go and have a look at your code. The tests will tell you if your interface sucks.

Regression Testing for crashes on Google Test

One common use for automated testing is to put tests in place that reproduce bugs to ensure that the same kind of bug can’t happen again once the failing test is fixed.

There is a small caveat when doing this with bugs that result in segfaults, however: a test that fails by crashing will take the entire test binary down with it, meaning that you won’t see the results of any tests that would have run afterwards. It’s also difficult to see at a glance what it was in the test that crashed without using a debugger of some sort.

Thankfully, google-test provides a method to test for things-that-shouldn’t-crash – by forking and checking the exit code of the forked process.

So my new best practice for writing these sorts of tests is now:

using ::testing::ExitedWithCode;

TEST(FooDeathTest, FooPtrFromBarIsSafe)
{
    std::unique_ptr <Foo> foo (from_bar ());

    EXPECT_EXIT({
        foo->baz = 1;
        std::cerr << "Success!" << std::endl;
        exit (0);
    }, ExitedWithCode (0), "Success");
}

Instead of:

TEST(Foo, FooPtrFromBarIsSafe)
{
    std::unique_ptr <Foo> foo (from_bar ());
    foo->baz = 1; // won't crash if foo is valid
}

There are three benefits to doing it this way:

First, whoever is reading the test source can instantly see that the thing you’re testing is that some operation succeeds and exits cleanly.

This is especially the case when EXPECT_EXIT ({ }, ExitedWithCode (0), “”) becomes somewhat of an idiom.
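
If the pattern gets repeated a lot, it can be wrapped up in a small helper macro. The macro below is just a sketch of my own – EXPECT_NO_CRASH is not part of google-test – but it keeps the intent obvious at the call site:

using ::testing::ExitedWithCode;

// Hypothetical convenience macro wrapping the EXPECT_EXIT idiom above.
// The statement runs in a forked child; if it finishes without crashing,
// the child exits cleanly with code 0 and the expectation passes.
#define EXPECT_NO_CRASH(statement) \
    EXPECT_EXIT ({ statement; exit (0); }, ExitedWithCode (0), "")

TEST (FooDeathTest, FooPtrFromBarIsSafe)
{
    std::unique_ptr <Foo> foo (from_bar ());
    EXPECT_NO_CRASH (foo->baz = 1);
}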

Second, you can test the inverse – that is, you can test that something will crash, in the same style:

using ::testing::ExitedWithCode;

TEST(FooDeathTest, FooPtrFromBarIsNotSafe)
{
    std::unique_ptr <Foo> foo (from_bar ());

    EXPECT_DEATH({
        foo->baz = 1;
        exit (0);
    }, "");
}

Third, you get a much more informative error message at first glance:

[RUN    ] FooDeathTest.FooPtrFromBarIsSafe
Failed expectation (lineno):
    Expected that the block:
    {
        foo->baz = 1;
        std::cerr << "Success!" << std::endl;
        exit (0);
    }
    Would exit with the message "Success" and an exit code of 0;
Actual:
    Exited with no message and an exit code of 9
[    FAIL] FooDeathTest.FooPtrFromBarIsSafe
(other tests run here)

As opposed to:

[RUN    ] Foo.FooPtrFromBarIsSafe
Segmentation Fault
(no other tests run here)

Impostor

I’ve been getting a bit of impostor syndrome lately, something which I feel is holding me back a fair bit. I’m not sure what it is, but it seems like every time I try to act on a moment of inspiration, I start to second-guess myself. Is what I’m doing a useful endeavor? Maybe someone else has already done this? Am I good enough to write this yet?

Of course, it’s terribly annoying – I’ve been left with a bunch of half-finished projects for a broader initiative, which seems to create a nasty feedback loop confirming the sense of unworthiness.

Perhaps this is the wrong place to ask this – but does anybody have a good idea on how to make these thoughts go away? I’d much rather be presenting cool things in this space than saying that I’ve talked myself out of yet another idea.

Moving forward

In the past few months I’ve been involved in a bit of a public disagreement about my relationship with the Ubuntu project, namely in terms of some work I was doing on Compiz and Unity between last December and February.

While I believed that my statements represented the truth at the time, in hindsight I believe they were more or less a product of my frustration. I want to publicly apologize for that, and to share a clarified timeline and sequence of events, now that I’ve had a chat with a few of the people involved, to set the record straight. I also want to share the plan that we’ve come up with to move forward.

I blogged in December that I had come up with a set of changes for both Compiz and Unity to take advantage of the new GLX_EXT_buffer_age extension. Importantly, these changes were substantial, required change across three key areas of the stack and required testing on several different hardware and driver combinations.

I had posted a set of changes up for review around that time, but the review process didn’t happen for a number of largely unrelated reasons. First of all, it was around the time of the December holidays, so people were just starting to take breaks. Second of all, there were a few staff movements within Canonical which meant that the number of reviewers available for that work was lower than it normally would have been.

In late February the decision was taken by the team that manages the Ubuntu distribution to branch Compiz for raring and only accept minor changes. I was consulted on that decision and agreed to it. I’ve been told that the rest of the team at Canonical working on Unity was also informed of this. I was unhappy at the time that no review of my changes had occurred in that two-month period, but I also accepted that we were close to feature freeze and it wouldn’t make sense to land any large changes then. At this point, I decided to stop working on those changes, with a view to picking them up again for 13.10.

Around the same time, a discussion about a rolling release took place on the public mailing list and at vUDS.

I posted in various comments that I was disappointed with the lack of review over the past two months, attributing it to the confusion caused by the rolling release and the apparently evident preference for a closed development model. These comments were mistaken. The real cause appears simply to be that there was a staff shortage, and not enough was done by me to draw the staff’s attention to these proposals.

The rolling release proposal said that this model should be adopted “now”, and many people on the list agreed. Adopting the “rolling release” for 13.04 would mean two things:

  1. 13.04 would become the “rolling release” and would not be “released” at its usual release point.
  2. With no release of 13.04, there would also be no feature freeze.

Whilst we were operating under that assumption and whilst that discussion was ongoing, I was told that my changes were likely to be merged at some point in the near future. As such, the review process started again in spite of the previous agreement to keep large changes out of raring, because both the reviewers and I believed that these changes were now appropriate for the release. My main error here was failing to contact the distribution team to check whether there had been a misunderstanding.

In the meantime, we faced a number of high and critical regressions stemming from a small fix for Java and a workaround for a hang in VirtualBox. This didn’t do a whole lot for anybody’s confidence. It just so happened that the regressions we kept having were in parts of the code that were difficult to get any kind of automated testing around.

The review process ended again a week or two later, because the release team was unhappy with having those changes in raring: they would present a risk by breaking feature freeze for the 13.04 release, which was now going ahead. I was disheartened that this was the case and said so on the merge review.

To be honest, I don’t think my behavior here was very constructive. The better approach would have been to engage properly with the distribution team to devise proper test plans and QA strategies, and to clarify with everyone what the true status of all pending work was. That being said, things finally came to a head today. The distribution team and I have had a good long talk about everything, and we’ve come up with a plan to move forward and past this mess. The conflict and the war of words on my part were entirely unnecessary in order to come to this resolution.

I want to make it clear that these difficulties were nobody’s fault. Nobody was negligent, nobody was inconsiderate (except perhaps myself) and nobody was acting in bad faith. These difficulties were the product of circumstances which simply didn’t appear to me in that way while I was in the middle of them.

It is anticipated that the relevant changes will be slated for review and testing for 13.10 at the beginning of the cycle. The test plan is going to include the usual test suite plus a check that we don’t have any autopilot failures, as well as some extensive user testing to try to uncover any subtle bugs that may have been missed during the time this work was QA’d.

Right now, I encourage everyone to test the PPA that Trevino has put up, which contains the most up-to-date versions of these changes synced with trunk for nux, unity and compiz. Test it on both nvidia and non-nvidia hardware, since the changes affect both. File (tagged) bugs and I’ll get to them when I can.

Let’s make 13.10 rock!

Integrating vera++ with CMake

I’ve been looking into making automated code-style checks part of a broader QA plan for a project I’ve been working on, while it’s still relatively young. Astyle seems to be good at handling lots of different languages, but it has no mode in which it will error out when it sees something in the code that it thinks is wrong. I like the idea of automated formatting, but for complicated expressions and macros automated formatters can sometimes get things wrong. I wanted something that required manual intervention but could still be run on an automated basis – and something specifically designed for C++.

I noticed that Mir was using vera++ to do something like this, but there wasn’t any kind of automated build rule for running the checks on a continuous basis. I wanted to ensure that we ran checks on the files that were actually built, so that style failures became actionable errors when the target was built. I also wanted to ensure that we didn’t run style checks on things like autogenerated or imported code, which would be difficult to exclude using a find rule.

Finally, I realized that having these checks integrated with the buildsystem could be useful for lots of other projects too, so I wanted to make it something separate from what I was working on already.

As such, I’ve just posted the first bits of veracpp-cmake on my GitHub profile. veracpp-cmake is basically just two CMake modules to integrate vera++ into your build. First of all, it provides a FindVeraPP module so that you can use vera++ with find_package. It also provides some useful commands:

verapp_import_default_rules_into_subdirectory_on_target
verapp_import_default_transformations_into_subdirectory_on_target
verapp_import_default_profiles_into_subdirectory_on_target

These commands will import the default rules, transformations and profiles installed with vera++ into a project subdirectory (usually in the build dir) just before some target is executed. This means that you don’t have to fork the default vera++ rules and can instead import them dynamically from the user’s installation. All three of these commands are just wrappers around:

verapp_copy_files_in_dir_to_subdir_on_target

This command provides a rule to copy files from one subdirectory to another just before a target is executed. Using it, you can copy your own rules, profiles or transformations into the same subdirectory as the default ones at build time and then use both, depending on which profile is set. Vera++ requires that all the rules, transformations and profiles live in subdirectories of the same directory tree (e.g. scripts/rules/, scripts/transformations/ and scripts/profiles/).

verapp_profile_check_source_files_conformance

This function does a lot of the grunt work of checking source files for conformance. You just provide it with a target, the path to the vera++ scripts, a profile and whether style-check failures should be fatal or nonfatal errors. Then, just before your target is linked, its source files will be scanned for style errors. That way, it integrates nicely with IDEs which parse the output of make.

Hopefully this will be useful to other projects. You can import it as a git submodule, adjust your CMAKE_MODULE_PATH and use it right away. The only caveat is that you need to both add it as a subdirectory with add_subdirectory () and include it with include (), because we need to build a small tool to set the exit status correctly when vera++ exits. I’ve filed a bug about this and hopefully I can get rid of this requirement soon.
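
For illustration only, the wiring might look something like the sketch below. The submodule path, the included module name and the argument order of the conformance command are assumptions on my part rather than the project’s documented interface – check the veracpp-cmake README for the real details.

# Minimal sketch of a consuming CMakeLists.txt.
cmake_minimum_required (VERSION 2.8)
project (example)

# Make the FindVeraPP module visible to find_package
# (vendor/veracpp-cmake is an assumed submodule location).
list (APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/vendor/veracpp-cmake)

# Both are needed: add_subdirectory builds the small helper tool that
# propagates vera++'s exit status, include pulls in the verapp_* commands
# (the module name here is illustrative).
add_subdirectory (vendor/veracpp-cmake)
include (VeraPPUtilities)

find_package (VeraPP REQUIRED)

add_executable (example main.cpp)

# Hypothetical invocation: scan example's sources against the default
# profile just before it links, treating style failures as fatal errors.
# The real argument order may differ.
verapp_profile_check_source_files_conformance (${CMAKE_BINARY_DIR}/scripts
                                               default
                                               example
                                               ERROR)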

On Publishing Private Correspondence

There’s been a fairly proximate drama (which I won’t identify) related to the graphics world, and a lot of it has developed into posting logs of correspondence in public forums.

Initially, it started as posting log snippets of public IRC channels in public fora such as mailing lists, news forums and publicly accessible webpages. This irked me a little, because while I think that the people who were posting that data were making an important point and needed evidence to do so, publishing chat logs is frowned upon by the Freenode channel guidelines. Those guidelines read:

If you’re considering publishing channel logs, think it through. The freenode network is an interactive environment. Even on public channels, most users don’t weigh their comments with the idea that they’ll be enshrined in perpetuity. For that reason, few participants publish logs.

If you just want to publish a single conversation, be careful to get permission from each participant. Provide as much context as you can. Avoid the temptation to publish or distribute logs without permission in order to portray someone in a bad light. The reputation you save will most likely be your own.

There isn’t anything legally wrong with publishing information obtained from a public forum, just like there isn’t anything legally wrong with recording or photographing public places and publishing the result. However, it’s an ethically dubious area. Freenode is not logged by default, and the channel guidelines recommend that where channels are logged, participants should be informed of this fact:

If you’re publishing logs on an ongoing basis, your channel topic should reflect that fact.

The difference between a public forum that is archived and one that is not is subtle but important. In the former, it is assumed that participants are interacting in a way that makes their interaction publicly accessible even to those not in the forum. In the latter, the assumption only goes as far as making the interaction accessible to those present. People are comfortable communicating different kinds of things in the two kinds of forum. Publishing logs after the fact breaks the assumption in the latter case.

I think the situation got worse once members took to forums to publish logs of private IRC conversations that hadn’t happened in any public channel. To me, this was certainly crossing a line. While I understand that the participants publishing the logs were again making a perfectly valid point, the point in question had already been made and could have been made with hearsay evidence alone. I should have spoken up about it earlier, but I kept quiet, partially because I’m guilty of doing this myself (though only in a private context).

What really motivated me to write this post is that a prominent journalist within the community published a series of private email correspondence relating to this ongoing drama. While the journalist in question was illustrating the procurement of a scheme which I believe is unethical, they really should have known better than to publish that private correspondence. Both participants in that conversation had an expectation that it was private and off the record. The journalist broke that expectation.

Publishing private and off-the-record information is sometimes justifiable. This might be the case where publishing the information would serve a great public benefit, raising awareness of a situation in which the public has been made vulnerable by another person’s actions. However, what distinguishes this case from those is that no actions had been taken to put the public at a special disadvantage. The public didn’t have a right to know, because knowing would not have put them in a better position.

To those who are thinking of posting any more private correspondence in public: just don’t. Don’t do it. The point that you wish to make can be made in so many other ways, and it can easily be made by other reputable sources backing up your claim. In addition, by posting this information in public, especially where it was conveyed with a reasonable expectation of confidentiality, you are exposing yourself to liability in Tort or Equity, depending on your jurisdiction. That means that a person can bring an action against you for publishing the information and, in some cases, can be awarded monetary damages.

To the journalist who posted private correspondence in public – Good luck getting anybody to talk to you off the record again. Your credibility is damaged, and it will take a substantial amount of justification for it to be repaired.