
Buffer your IO on CMake!

A small CMake optimisation win that’s probably worth sharing.

Calling

file (APPEND

can massively impact your configure-time performance. This is especially true on Windows, where filesystem writes appear to be unbuffered (I'm not sure whether that's down to CMake on Windows or to Windows generally).

BufferedIO.cmake

# Build the whole content up in a variable first, then write it out in one go
set (MY_FILE_CONTENTS)
foreach (VAR RANGE 500)
    list (APPEND MY_FILE_CONTENTS "${VAR}\n")
endforeach ()
file (WRITE output ${MY_FILE_CONTENTS})

AppendEverything.cmake

# Append to the file on every iteration - one separate filesystem write each time
foreach (VAR RANGE 500)
    file (APPEND output "${VAR}\n")
endforeach ()
$ time cmake -P BufferedIO.cmake

real 0m0.032s
user 0m0.000s
sys 0m0.000s

$ time cmake -P AppendEverything.cmake

real 0m0.657s
user 0m0.000s
sys 0m0.000s

Ouch.

Using debuggers in everyday life

One of the things that I teach the students I mentor is that an understanding of how computers work, at least at the lower levels, gives you an inexplicable edge. It's like being a doctor, a lawyer or a mechanic: your field of specialty (which is really something you interact with in everyday life) stops being a black box of magic and becomes something you are empowered to interact with in a meaningful way.

Recently, someone close to me and I were drafting up some important documents that we had spent hours and hours working on. Every detail was critical. Unfortunately, in our intense concentration we forgot to save the document, and this happened after attempting to insert a new paragraph:

Screengrab

The application had hung completely and looked like it was stuck in an infinite loop. It seemed we'd have to throw away three or four hours of work, as there was no way we were going to get the contents of that (now multi-page) document back.

Or was there?

It turns out that having been a hobbyist programmer for years, I knew a thing or two about debugging. I also knew that when you type something into a document, the contents of that document are usually somewhere in memory – and most simple text editors would probably just store the document’s contents in memory as plain or marked up text. That’s what I’d do if I wrote a simple text editor.

If we could just search the program’s memory for the contents that we can still see on-screen, then we might be able to retrieve a fair bit of the document.

It turns out that debuggers are well suited to that task.

LLDB Start

 

The debugger I'm using is called LLDB and it comes with Xcode. It's part of the LLVM suite of tools.

LLDB, like most debuggers, has scripting capability, and a script shipped by default provides a command called cstr_refs that lets you search heap memory for a matching string.

cstr_refs

 

Just by searching for the very first string, we found an allocation of 320 bytes.

If it's just a simple null-terminated string all the way to the end, we can just cast the pointer to a string and print out the rest of it.

Printing out string

 

There’s our recovered contents!

Sometimes we’re not so lucky and we don’t get a null-terminated string. Or the memory ends up being fragmented across a few different pools.

When that happens we can still read memory directly, so long as it's within the program's mapped range, with the memory command.

memory read
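For reference, the session looks roughly like this. The searched string and the address are placeholders for illustration, and the exact name of the heap script may vary between Xcode versions:

(lldb) command script import lldb.macosx.heap
(lldb) cstr_refs "Every detail was critical"
(lldb) p (char *) 0x0000000101234560
(lldb) memory read -c 320 0x0000000101234560

cstr_refs prints the matching heap allocations along with their sizes; from there the address can be cast to a string or dumped directly as above.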

You can use debuggers for all kinds of cool stuff, including, every once in a while, getting back a few hours of lost work.

HasNoExceptMemFun type trait

When creating policy-based classes, we can use std::enable_if with SFINAE and type traits as a mostly-works substitute for real concepts in order to enforce certain things about those policies. This is somewhat like the override keyword, which clients can use to ensure that they conform exactly to part of an interface, including const-ness and noexcept.

There does not appear to be a type trait to determine whether a type's member function is const or noexcept. This matters if you want to call a template argument's member function and need to maintain a const or noexcept guarantee. noexcept in particular is easy to get wrong, as one could inadvertently call a function which throws an exception inside another noexcept function, risking a call to std::terminate.

By mucking about with std::declval and perfect forwarding, we can test a member function to see if it is noexcept.

template <class Object, class MemFun, class... Args>
struct HasNoExceptMemFun
{
    static constexpr bool value = noexcept (((std::declval <Object> ()).*(MemFun ()))(std::declval <Args> ()...));
};

template <class Object, class MemFun>
struct EnableIfHasNoExceptMemFun :
    public std::enable_if <HasNoExceptMemFun <Object, MemFun>::value>
{
};

template <class T,
          typename = typename EnableIfHasNoExceptMemFun <T, decltype (&T::MyFunc)>::type>
struct MyTemplate
{
    void MyFunc () noexcept
    {
        T t;
        t.MyFunc ();
    }
};

One might be tempted to say gesundheit after seeing something like that, but it can be decomposed fairly easily:

class... Args

is what’s known as a parameter pack and effectively allows you to have an arbitrary number of parameters to a template. We use this to grab the types of the arguments to the member function.

noexcept ()

Yields a constant expression indicating whether or not a particular expression can throw. If any operation inside noexcept () is potentially throwing, then noexcept () evaluates to false at compile time.
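As a standalone illustration (MightThrow and WontThrow are hypothetical functions, not part of the trait above):

void MightThrow ();          /* no exception specification */
void WontThrow () noexcept;  /* promises never to throw */

/* noexcept (expr) never evaluates expr - it only inspects it */
static_assert (!noexcept (MightThrow ()), "a call to MightThrow may throw");
static_assert (noexcept (WontThrow ()), "a call to WontThrow cannot throw");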

std::declval <Object>

Is the magic that allows us to skip having to "construct" Object in order to "call" MemFun inside the noexcept block. It simply turns a type into an rvalue reference. One might be inclined to ask why such a seemingly unsafe construct (turning nothing into an rvalue reference?!) made its way into the standard, but it may only be used in unevaluated contexts, so it is effectively a function that is never really called and exists only to help the compiler.
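As an aside, here is std::declval on its own, used to ask a question about a type that cannot even be default-constructed (NoDefault and its Value member are hypothetical):

#include <type_traits>
#include <utility>

struct NoDefault
{
    NoDefault (int);              /* no default constructor */
    int Value () const noexcept;
};

/* We never construct a NoDefault, yet we can still ask what calling
 * Value () on one would return */
using ValueType = decltype (std::declval <NoDefault> ().Value ());
static_assert (std::is_same <ValueType, int>::value, "Value () returns int");

With that rvalue reference in hand, we can go ahead and "call" MemFun: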

((std::declval <Object> ()).*(MemFun ()))

Since the rvalue reference returned by std::declval is a temporary, we need to wrap it in parentheses. .* is the pointer-to-member operator, which lets us access a member through a pointer to it. Since MemFun is a type and not a pointer, we need to "construct" a (null) pointer of that type and wrap it in parentheses too; this is fine, because nothing inside noexcept () is ever actually evaluated. Finally, the whole thing needs another set of parentheses because the function call operator () binds more tightly than operator .*.

(std::declval <Args> ()...)

It turns out that std::declval can be used with parameter packs too, so we get "free" rvalue references to all of MemFun's parameter types.

Now that we've "called" MemFun on Object, the result is stored in the static constexpr bool member value.

From there we can use it with std::enable_if.

std::enable_if <HasNoExceptMemFun <Object, MemFun>::value>::type

std::enable_if is basically one of the closest things we have to concepts at the moment. If its first (boolean) argument is true, it provides a nested typedef type; if it is false, there is no such typedef, so any attempt to use ::type becomes a substitution failure. The second argument, if specified, is the type that the nested typedef type will be (it defaults to void).
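In isolation it can be used like this (twice is a hypothetical function, restricted here to integral types):

#include <type_traits>

template <typename T,
          typename = typename std::enable_if <std::is_integral <T>::value>::type>
T twice (T value)
{
    return value * 2;
}

/* twice (21) compiles; twice (1.5) does not, because substituting the
 * default template argument fails */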

Remember that HasNoExceptMemFun::value will be true if MemFun names a noexcept member function of Object and false if it doesn't. Thus, it is only in the true situation that the nested typedef type exists for us to use.

In order to use this, we define a small helper as follows (helps to avoid really long lines):

template <class Object, class MemFun>
struct EnableIfHasNoExceptMemFun :
    public std::enable_if <HasNoExceptMemFun <Object, MemFun>::value>
{
};

And then we use that helper like so

template <class T,
          typename = typename EnableIfHasNoExceptMemFun <T, decltype (&T::MyFunc)>::type>

Remember that the MemFun template parameter is a type and not a pointer, so we need the type of the actual member function pointer we wish to check. We can get that using decltype.

From there, we can finally be sure that calling that member function of the template parameter won't risk an exception being thrown, because we have a guarantee that the function is noexcept.
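To make that concrete, here is a sketch with two hypothetical policies. (One caveat: before C++17 the noexcept specification was not part of the function type, so decltype (&T::MyFunc) may lose it on older compilers.)

struct Good
{
    void MyFunc () noexcept {}
};

struct Bad
{
    void MyFunc () {}   /* no noexcept guarantee */
};

MyTemplate <Good> allowed;       /* substitution succeeds */
/* MyTemplate <Bad> rejected;       fails: no nested ::type to substitute */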

Feel free to play with it here.

Iterator Invalidation

You might be fooled into thinking that by using std::vector instead of bare arrays created with new T[] you'll avoid all the old-fashioned pointer-misuse bugs of the C era for that variable: null pointer dereferences, out-of-bounds access, memory corruption, and so on. Not so – if you keep iterators and references to elements of a vector around and then do anything to change its size, the standard says that those iterators and references may be invalidated by that operation (*). Worse, you don't get a warning about this, and static analysis tools like cppcheck might not detect it in all cases (for instance, when the collection is modified by a function further down the call stack).
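A minimal sketch of the problem:

#include <vector>

std::vector <int> values;

void Example ()
{
    values.push_back (1);
    auto it = values.begin ();

    /* This may reallocate the underlying storage ... */
    values.push_back (2);

    /* ... at which point dereferencing "it" is undefined behaviour */
    /* int oops = *it; */
}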

At least for operations like push_back, which don't disturb the position of existing elements, there's a simple trick you can use to save and restore a group of iterators across such an operation.

#include <cstddef>    /* size_t */
#include <iterator>   /* std::begin, std::distance */
#include <tuple>      /* std::tuple_size, std::get */

namespace detail
{
    /* Main implementation of KeepIteratorsAlive -
     * counts down through all the iterators through
     * a recursive call until we get to N = 0, at which
     * point call the function in question */
    template <typename Function,
              typename Collection,
              typename IteratorTuple,
              size_t   N>
    struct PreserveIterator
    {
        IteratorTuple
        operator () (Function      const &function,
                     Collection          &collection,
                     IteratorTuple const &tuple)
        {
            /* We need to call std::begin twice as the value of it
             * may change after we've called our iterator-modifying
             * function */
            auto distance = std::distance (std::begin (collection),
                                           std::get <N - 1> (tuple));
            auto result (PreserveIterator <Function,
                                           Collection,
                                           IteratorTuple,
                                           N - 1> () (function,
                                                      collection,
                                                      tuple));
            std::get <N - 1> (result) = std::begin (collection) +
                                        distance;
            return result;
        }
    };

    /* Specialisation for N == 0, at this point
     * we call the function and return an empty tuple
     * to fill up with the correct iterators again */
    template <typename Function,
              typename Collection,
              typename IteratorTuple>
    struct PreserveIterator <Function, Collection, IteratorTuple, 0>
    {
        IteratorTuple
        operator () (Function      const &function,
                     Collection          &collection,
                     IteratorTuple const &tuple)
        {
            function ();
            return IteratorTuple ();
        }
    };
}

template <typename Function,
          typename Collection,
          typename IteratorTuple>
IteratorTuple
KeepIteratorsAlive (Function      const &function,
                    Collection          &collection,
                    IteratorTuple const &tuple)
{
    typedef Function F;
    typedef Collection C;
    typedef IteratorTuple T;
    constexpr size_t const S = std::tuple_size <IteratorTuple>::value;

    return detail::PreserveIterator <F, C, T, S> () (function,
                                                     collection,
                                                     tuple);
}

You can use it like this:

std::vector <MyType> vector;
vector.push_back (MyType ());
auto it = vector.begin ();
std::tie (it) = KeepIteratorsAlive ([&vector]() {
                    vector.push_back (MyType ());
                },
                vector,
                std::make_tuple (it));

As for reference and pointer stability – this is more difficult. You can either wrap the object type in std::unique_ptr (and take a reference to it) or std::shared_ptr (and make a copy of the std::shared_ptr), which means you'll pay the cost of another allocation and indirection on access, or you can assign each object a unique identifier which you can use to look it up in the std::vector later.
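For example, a minimal sketch of the shared_ptr approach (reusing MyType from the example above):

#include <memory>
#include <vector>

std::vector <std::shared_ptr <MyType>> objects;

void Example ()
{
    objects.push_back (std::make_shared <MyType> ());

    /* Keep a copy of the shared_ptr rather than an iterator or reference */
    std::shared_ptr <MyType> stable = objects.front ();

    /* Even if this reallocates the vector's storage, "stable" still
     * points at the same object */
    objects.push_back (std::make_shared <MyType> ());
}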

If you want to avoid messing around with iterators, extra allocations or identifiers, and in general iterator, reference and pointer stability is more important to you than performance, then you can use boost::stable_vector.

The main difference between the two is that boost::stable_vector is not contiguous, so if you have a lot of internal fragmentation (for instance, in these tests, two random erases for every three insertions) it can be a lot slower than std::vector. If array access performance is more important than insert/delete, then the iterator-adjustment and identifier-tracking approaches above can be used.

[smspillaz@平和猫とケーキ build (work)] $ ./src/benchmark 2000
2000 of boost::stable_vector (cold)
 0.000143s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
2000 of boost::stable_vector (warm)
 0.000111s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
2000 of std::vector (cold)
 0.000048s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
2000 of std::vector (warm)
 0.000040s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
2000 of std::list (cold)
 0.000104s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
2000 of std::list (warm)
 0.000069s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)

(*) Detailed by Louis Feng here.

Happenings

Bom dia!

University has all finished up this year. I’m really happy with the grades I’ve managed to achieve this semester – two of the highest grades I’ve ever achieved in law units. Bombed one exam, but still managed to do okay in the unit overall. Somehow managed a High Distinction for a Communications unit I thought I did awfully in. The mysteries of scaling.

I’ve been a little absent from lots of communities lately, so people might be wondering what I’m up to.

Well – I’m currently living here in Brazil for two months, involved with an organisation which I think is going to be a real gamechanger. I can’t say who they are or exactly what they’re doing – but they’ve got a very, very, very big goal and a very talented and enthusiastic team to bring that vision about.

I hope to get back (and report back) on what I've been doing with polysquare over this year as well. Unfortunately, it hasn't been much, due to the sheer study load I've had recently.

Tchau!

Server Grabs and getting stuck in _XReply

Here's a quick reference for anyone in the future who gets stuck debugging some external library that is hanging in _XReply:

#0 _XReply
#1 ???
#2 some_library_do_thing
#3 my_app_do_thing

Check if your app is using server grabs (XGrabServer, XUngrabServer).

Some third party libraries open up new X Server connections in the same process as yours. They shouldn't do this, but they do anyway (graphics drivers are notorious culprits).

If you are holding a server grab and then call into a third party library which makes a synchronous call on its own X connection, you will get a deadlock. This is because of how server grabs work – they prevent the server from processing requests on any other connection until your connection has released the grab.

When using server grabs (please don't unless you really need atomic semantics around a series of X calls), always try to limit them to the smallest possible set of protocol requests that actually needs atomic semantics. Don't just wrap an entire function in a server grab, especially not if you plan to call callbacks you don't control or to call into libraries that are subject to change.

Also remember that XUngrabServer is not a synchronous operation – you will need to call XSync after using it in order to completely release the grab for other connections.
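As a rough sketch of the shape this should take (display is an already-open Display *, and the middle section stands in for whatever requests actually need to be atomic):

XGrabServer (display);

/* Only the requests that genuinely need atomic semantics go here -
 * no callbacks, no calls into third party libraries */

XUngrabServer (display);

/* XUngrabServer only queues the request; flush it so that other
 * connections (including ones hidden inside libraries and drivers)
 * can make progress again */
XSync (display, False);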