Thoughts on graduating

I graduated from university a few months ago.

I didn’t blog about it immediately because I didn’t really know how to feel about it. The emotions that I have around university are certainly complex. Decomposing them helps:

  • Happy: That I got to go to university, learned a great deal and came out a largely changed person. There’s something about six years of taking notes on stuff every day, creating hundreds of pages of study notes, writing tens of thousands of words’ worth of assignments, meeting new people, organising things and participating everywhere you can that widens your horizons and shows you that whatever you know only scratches the surface. Depending on how you count it, fewer than 10% of people in the world go to university, for various reasons, and so I’m very privileged to have had that experience.
  • Regretful: When I started university in 2010, I started full of ideals. I got involved in a fantastic organisation called UN Youth, went to the debating club’s social debates every week, volunteered everywhere I could, participated in free software, ran for the Guild Council (student union) in the Guild Elections, started an amazing part time job and studied things I was really passionate about. My plans for the next year were even bolder. Then something happened. I think I was Icarus and flew too close to the sun. I started to burn out. I made some mistakes that upset some people and really took the pain that I caused to heart. I felt like a monster. I thought nobody would ever want to talk to me again and so I withdrew socially. I resigned from all my positions, stopped going to events and closed my Facebook account. It’s a miracle that I even passed some of my classes for the next two years. It’s a miracle that my grades are even halfway decent, though they’re nowhere near as good as they could have been. Over the next five years I found it difficult to get involved with anything and I had a huge difficulty trusting myself not to hurt others. I regret not finding a way past that, because it meant that I couldn’t be as involved as I truly wanted to be.
  • Anxious: University provides a safety net. People seem to give it this intrinsic value where it comes first above all other things. I could use it to escape from commitments people were forcing on to me that I didn’t want. Now that it’s gone, I need to learn how to be accountable for my own time and how to let other people down when I can’t give them what they want. It’s a scary thought and a difficult transition to make.
  • Experienced: Perhaps experienced is the wrong word because university isn’t really a place where you go to get real-world experience. But I think I’m certainly more experienced than I am innocent. My experience at university has taught me about the ways that people can try to manipulate you and what the signs are that you’re ending up in a codependent situation. I’m starting to learn that only you are responsible for setting the direction you want in life and you have to follow your own feelings and not what other people tell you to do. I started out studying a Law degree because I was good at the feeder subjects at school, had the grades to get in and, most importantly, it’s what other people told me to do. I finished my Law degree because that’s what other people told me to do. I didn’t want to disappoint those people, so I ended up disappointing myself. I always wanted to study something like Software Engineering or Computer Science but I rationalised myself out of it.
  • Frustrated: I’m lucky. I never failed a course and I completed my degree ahead of schedule. But I don’t think I appreciated just how long it would take me when I signed up to do it. I started in 2010 and graduated in 2016. That’s six years’ worth of study. I took a total of 53 courses – 49 from my main degree programme and 4 in Math and Computer Science out of stream. Each course runs for half a year and I’d typically take four or five per semester. I tried working in law practices for a little while, but I’m not sure if I’m at the right point in my life where I want to do that. I want to make stuff, not facilitate transactions. I’m 24 now and I feel like I’m at the point in my life where I should have had my story straight by now. I’m also wondering where the last three years went.

I’ve actually tried to write a lot of posts where I get these feelings down in writing, but I’ve struggled because I feel like it has to come to some sort of symphonic climax or moment of catharsis. There isn’t one. I’m sure there are lots of other components to the complex feeling that I have about graduating that I haven’t quite identified yet, but I’ll keep trying.

I think I’m scared about posting this too, even though I want to. I’m scared about who might read it and what they might think of me. I’m worried that I’m not supposed to be feeling the way that I’m feeling and that I should be feeling happy and optimistic like everyone says I should. I don’t though. Maybe I just need to admit that.

I’m also not confident that writing this post and pushing “publish” is going to give me a deep sense of relief or new purpose. The only thing I can do is continue to move forward. I’ll make new commitments, shed the old ones and reflect on my progress in the next year.


Moving from biicode to conan.io

Today I moved a bunch of projects over from biicode to conan.io. It was certainly an interesting experience and I think it is worth talking about conan and what they are doing for the C++ community.

Most established programming languages and runtimes these days have their own de-facto package managers. Node has npm, Python has PyPI. CPAN is apparently the most important thing to have happened to Perl, and so the list goes on.

C and C++ have gone without for quite some time. Some might argue that your distribution’s package manager is really the “true” package manager for systems-level languages. That’s true to some extent, but it is a solution with difficulties in its design. Distribution packages are typically installed systemwide. They require super-user access to be installed. Usually you can only install one version of a package at a time, unless the package is re-named and installed into separate directories such that two installations can co-exist. It is typically also not the maintainer of the software who maintains the package in each distribution, which generally leads to a fragmentation in update frequency and overall slowdown in getting new versions of code out to users.

Language based packaging systems take the opposite approach. The software maintainer maintains the packaging information, which is usually built right into the build system. For most modern languages, it’s entirely feasible to run development versions of an application in a “virtual environment”, where packages can be installed isolated from the rest of the system. Node takes this approach by default. Python and Ruby have got virtualenv and bundler respectively. As a part of your build, you can update all your dependencies at once and lock dependencies to particular versions on a per-app basis.

Creating such a system for C++ has long been known to be fraught with difficulty. For one, there’s no standard build system for C++ and attempts to create the one true build system have all failed. That means that there’s no way to simply build a package manager into a build system that has a wealth of information about every project. Every platform has its own preferred compiler and usually a preferred build system too. There are binary compatibility nightmares. Compiling C++ code takes a long time and it looks like that won’t be fixed until we have modules. Most C and C++ projects were written during the time when we expected distributions to package everything and so many projects will dynamically link to libraries already installed on the system, systemwide.

Conan is here to try and tackle what seems like an insurmountable problem and they have an approach that is seriously worth checking out. It provides a model that is a reasonable hybrid of what we’ve come to expect from distribution packaging systems and language based package managers. It doesn’t depend on any build system in particular and tries to support all the major ones.

It works by having either the maintainer or someone else write a “conanfile.” A conanfile can be either an ini or python file that describes briefly what the package is about, what its dependencies are and how it is built. One of the really nice things about it is that you don’t have to upload the entire package source code or binaries to the conan servers if they’re already hosted somewhere – just provide a URL to a zip file and some information on how to deal with it. For instance, on each release of my CMake modules, I upload a new package description which links to a download for a tarball of the git tag of that version.

Conan will try to fetch any uploaded binaries that match your system configuration if it can (reducing the binary compatibility problem), but if not, it will rebuild a package from source upon installation. All of a package’s dependencies, whether binary or otherwise, are pulled in for your project’s use upon running conan install. Nothing gets installed systemwide. Once conan install is done, it generates a file that can be used by your build system. In the case of cmake, that file sets all of the include, library and cmake paths so that a dependency can be used in a project. Just include and link to it as you usually would and it should all just work.

conan.io runs their own package registry, but you can also host your own since the server software is open source. Creating and uploading a package is a relatively straightforward procedure. Each version of a package is treated as a unique entry, so an upload of a newer version will not overwrite an older version in case anybody else needs to depend on an older version of a package. A package descriptor in conan might look something like “my-package/version@user/channel.” Everything after the “@” allows for multiple copies of the same package to be maintained by different users if there are modifications those users would like to apply. The channel allows each user to maintain a separate copy of each version of a package if there is a need to subdivide further.

To upload a package, you first need to register it with your “local store” using conan export inside the package directory where the conanfile is located like so:

conan export smspillaz/my-package

After that, you can upload the specified version to conan, which depending on your exports setting, might upload just the conanfile or some other files if there’s no need to fetch the source code from another location.

conan upload my-package/master@smspillaz/my-package

For most of my projects, I only needed to maintain one copy, so it was as simple as having a version called “master” (which pointed to the most up-to-date tarball) and numerical versions where appropriate. Everything was just under the “smspillaz/my-package” stream.

A dependency can be re-used within a project by specifying its full descriptor (e.g., my-package/master@smspillaz/my-package) in the dependencies section of the conanfile.
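
As a rough sketch of the consuming side, assuming the ini-style conanfile with a [requires] section and the cmake generator, a project could declare the dependency as below. The descriptor is the placeholder used above rather than a real package, and the demo directory is purely hypothetical.

```sh
# Write a minimal ini-style conanfile for a project that consumes the package.
# "my-package" is the placeholder descriptor from the text, not a real package,
# and the demo directory is only there for illustration.
project_dir="${TMPDIR:-/tmp}/conan-demo"
mkdir -p "$project_dir"

cat > "$project_dir/conanfile.txt" <<'EOF'
[requires]
my-package/master@smspillaz/my-package

[generators]
cmake
EOF

# With conan installed, running "conan install" for this project would then
# fetch the dependency and generate the file consumed by the build system.
cat "$project_dir/conanfile.txt"
```

Keeping the descriptor in one small file like this means the version and channel a project depends on are visible at a glance and can be changed in one place.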

Overall, I would really recommend checking out conan and looking into making your software available as a dependency, if you’re developing a C++ module that you want others to use. Modules like catch, boost and sfml are already available. There’s no lock in, in the sense that your build process doesn’t have to depend on conan if you start using it, though there’s certainly very little disadvantage in doing so. Hopefully with conan we’ll start seeing a greater proliferation of small C++ modules so that developers can focus on making great applications as opposed to choosing between re-inventing the wheel or managing another dependency across several platforms.

A unit testing framework for CMake

The first question that might pop into your head is why. The answer to that is pretty straightforward – CMake code can get quite complex very quickly. There can be a lot of edge cases based on different configuration options and different platforms.

One popular CMake module, cotire, is about 3900 lines long at this count. Cotire provides a simple layer to use precompiled headers across the three main compilers. It has about 75 functions and 13 macros to handle all sorts of stuff, from getting compiler definitions to parsing include trees. Getting that stuff right is hard. Getting it wrong on just one set of options or system definition can cause no end of annoyance for users of your library, especially for those users left to debug the problem who are not familiar with the details of the language.

Over the last year I’ve been working on a unit testing framework for CMake so that module authors can catch these kinds of bugs before they happen. Note that I don’t propose that people start testing their project build definitions as found in the CMakeLists.txt. Those definitions are typically written to be as declarative as possible. Your continuous integration process which builds the project should catch any relevant problems in those build definition files. I’m more interested in testing modules that ship with libraries, or just modules that provide useful functionality to CMake, of which there has been a great proliferation over the last few years.

The framework is called, somewhat unimaginatively, cmake-unit. It supports everything that you’d expect in a typical xUnit-like framework, including:

  • Multiple test definitions per file.
  • A generic cmake_unit_assert_that function which can take pre-defined or user-defined matcher functions to verify that a value matches certain criteria.
  • Automatic test discovery and execution.
  • Suppression of output messages except on failure.
  • Conditional enabling of test cases.
  • XML output of test results.
  • Clean execution slate between tests.
  • Code coverage reports.

There’s currently no support for test fixtures, though in my own testing, I’ve found that they haven’t really been necessary. CMake doesn’t have the concept of resources that need to be managed manually. If shared setup needs to be done for a set of tests, it can be refactored into a separate function and called from the test definition.

CMake presents some interesting problems in terms of implementing a test framework, which cmake-unit tries to accommodate:

  • Multiple Phases: Configuring, building and testing a CMake build-specification is separated into multiple phases, with the state at the end of each phase available only ephemerally before the execution of the next one. The framework allows for custom cmake code to be run for each phase, all contained within the same test. It also allows for variables to propagate across phases of a test.
  • No support for first class functions: The language doesn’t provide a mechanism to call a function by a name specified in a variable. The framework provides a work-around and calling convention encapsulated in cmake_call_function to provide this functionality. This is what makes custom matchers and test-case auto discovery possible.
  • Build system commands operate on source files: Most CMake commands that would directly affect Makefile generation are not available in CMake’s script mode. Hand writing source files for each test case can be frustrating. The framework provides a mechanism to create a minimal build environment for supported source types and functions to declaratively generate source files.
  • Location of output binaries varies by platform: On some platforms, binaries are nested within a directory specified by CMAKE_CFG_INTDIR. The value of this directory varies by platform and is not readable in script mode. The framework provides a mechanism to obtain the true location of a binary and transfer that value between phases.

cmake-unit’s own test suite provides a great many examples of what tests can look like. The simplest test, which generates a library and executable, then links the two together, looks as follows:

function (namespace_test_one)

    function (_namespace_configure)

        cmake_unit_create_simple_library (library SHARED FUNCTIONS function)
        cmake_unit_create_simple_executable (executable)
        target_link_libraries (executable library)

        cmake_unit_assert_that (executable is_linked_to library)

    endfunction ()

    function (_namespace_verify)

        cmake_unit_get_log_for (INVOKE_BUILD OUTPUT BUILD_OUTPUT)

        cmake_unit_assert_that ("${BUILD_OUTPUT}"
                                file_contents any_line
                                matches_regex
                                "^.*executable.*$")

    endfunction ()

    cmake_unit_configure_test (INVOKE_CONFIGURE LANGUAGES C CXX
                               CONFIGURE COMMAND _namespace_configure
                               VERIFY COMMAND _namespace_verify)

endfunction ()

The entire test is encapsulated inside the namespace_test_one function. There are two phases that we’re interested in – the configure and verify phases. These are also the only two phases you’ll need in most tests.

The configure phase looks exactly like how a user would use your library in a CMakeLists.txt file. It runs in project-generation mode, so you have complete access to the Makefile generating functions. Since CMakeUnit.cmake has already been included, you can start asserting things right away, for instance, checking before the build even happens whether executable is set up to be linked to library.

The verify phase runs in script mode after both cmake --build and ctest have been run on the project. A utility function, cmake_unit_get_log_for, provides a way to get the full output of both the standard output and standard error of any phase. From there, you can make assertions, either about the state of the build tree or about what was found in the build log.

The final command, cmake_unit_configure_test, is a function with metadata about the test. It tells cmake-unit what functions will be used to configure and verify the build process and whether support for particular programming languages should be enabled. It is worth noting that support for all programming languages on each test is turned off by default, since the overhead for some generators to initialise support for those languages can be quite high.

Finally, in your test file, you will need to call cmake_unit_init to start the test auto-discovery process and register files for coverage reports. For example:

cmake_unit_init (NAMESPACE namespace)

The NAMESPACE option tells cmake-unit to look for any functions in the current file which start with ${NAMESPACE}_test and add them to the execution list. Any files specified in COVERAGE_FILES will have coverage information recorded about them if CMAKE_UNIT_LOG_COVERAGE is enabled.

From there, testing a CMake module is as easy as building a CMake project. Just create a build directory, use cmake to configure the project and discover all the tests, then use ctest to run the tests.


I’ve waited quite some time before publishing this framework, mainly because I actually started it in early 2014 and re-wrote it in early 2015. Since then, I’ve been using it in about ten or so of my own modules and it’s reached a state of relative stability. I’d like to get some feedback from other module maintainers to see if this project is useful.

You can find the project on biicode on the smspillaz/cmake-unit block. I’ll eventually move everything over to conan once I get a chance. If you need to include it in a non-bii project, you’ll need to copy the dependencies into the bii/deps directory manually.

I’ve been working on some other cool development-related projects in the last year, so I’ll be blogging about them soon. Stay tuned!

Bash substitution and ssh-keygen

Here’s something to note after almost being locked out of an account.

Be careful about bash variable substitution when using ssh-keygen -N. Or better yet, don’t use ssh-keygen -N at all, preferring ssh-keygen -p -f PRIVATE_KEY_FILE, which prompts for the new passphrase instead of taking it from the command line.

The reason is that the passphrase provided to -N can be modified by variable substitution in bash. For instance, if you had the characters $? in your passphrase as provided to -N, either unquoted or inside double quotes, they’ll be replaced with the last command’s exit status – good luck finding out what that was after trying to unlock your private key a few times.
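
To see the substitution happen without touching a real key, here is a small sketch. The passphrase is made up for illustration, and the script only prints what the shell would hand over as the argument to -N.

```sh
# Run a command that succeeds, so that $? is 0 on the next line.
true

# Double quotes: the shell expands $? to the previous command's exit status
# before the argument would ever reach ssh-keygen.
expanded="pass$?word"

# Single quotes: the characters $? are passed through literally.
literal='pass$?word'

printf 'double-quoted: %s\n' "$expanded"
printf 'single-quoted: %s\n' "$literal"
```

The double-quoted value depends on whatever command happened to run beforehand, which is exactly why the resulting passphrase is so hard to reconstruct later. Single quotes avoid the expansion, but letting ssh-keygen prompt for the passphrase sidesteps the shell entirely.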

Performance and cmake_parse_arguments

The only variable “type” that exists in the CMake language is the humble string. The language uses some library code on top of this fundamental type to weakly implement other types, like numbers and lists.

Lists in CMake are implemented as semicolon separated strings. If you wanted to iterate or find something in a list, then you’d tokenise it and work with the tokens. That’s what the built-in list family of functions do under the hood.

Function call arguments in CMake are implemented as a list as well. The runtime sets a variable called ARGV in the function’s scope. It also helpfully maps values from that list sequentially to the names passed to the function command when the function was defined. Excess list items in the “call arguments” are put in ARGN. Most of the time you’ll only ever deal with named arguments, but if you want to have a function call with variadic arguments you’ll need to deal with ARGN.

Things start to break down when you want to pass lists to functions. If you want to pass a value directly to a function, so that one of its arguments contains the value you just passed, then usually you would dereference the variable in the function call, like so:

function_call (${MY_VARIABLE})

It gets more complicated when the variable holds a list. CMake parses space-separated identifiers as a “list”. If you dereference two list-containing variables next to each other, you get a single list. This makes cases like the following (which are perfectly reasonable) work the way you expect:

set (MY_LIST
     ${FIRST_LIST}
     ${SECOND_LIST})

When this code runs, CMake sees something like this:

set (MY_LIST
     "ITEM_ONE;ITEM_TWO")

Unfortunately, this makes life hard when you want to call a function:

function (my_function LIST_ARGUMENT STRING_ARGUMENT)
endfunction ()

my_function (${MY_LIST} ${MY_STRING})

When MY_LIST and MY_STRING get expanded, CMake sees a single list, as follows:

my_function ("ITEM_ONE;ITEM_TWO;STRING")

And when CMake maps everything to variable names:

LIST_ARGUMENT: ITEM_ONE
STRING_ARGUMENT: ITEM_TWO
ARGN: STRING

This is almost certainly not what you would expect. After all, the two variable dereferences were space separated and looked like they were intended to fill two separate arguments. Alas, that’s not how CMake sees things. It’s just one big flattened list.

There are a few solutions to this problem, but they all require the caller to keep track of when the intention is to pass a list as opposed to a single item of that list.

The first option is to quote the variable dereference at the call-site.

my_function ("${MY_LIST}" "${MY_STRING}")

The second option is to pass the name of the list as opposed to its value. This works because scopes have runtime lifetime as opposed to structural lifetime, so any live variables on the stack prior to the function call will also be available in that function’s body:

my_function (MY_LIST ${MY_STRING})

The third option, which appears to be the most prevalent, is to use a system of keyword arguments to denote which values map to which names:

my_function (LIST_VALUES ${MY_LIST}
             STRING_VALUE ${MY_STRING})

The idea at this point would be to loop through all the items in ARGN and use the “markers” to determine where to set or append values. That’s exactly what cmake_parse_arguments does. However, as with most things, it’s always a question of trading usability for performance, and the performance implications can get very scary very quickly.

cmake_parse_arguments has a concept of “option arguments”, “single value arguments” and “multi value arguments”. If I were to use a table to summarise:

option arguments: Set to `ON` or `OFF` depending on whether name is present.
single value arguments: Set as “active” when encountered. Active variable is overwritten with subsequent values until another variable becomes “active”.
multi value arguments: Set as “active” when encountered. Subsequent values appended until another variable becomes “active”.

In order to implement this, you need to iterate over all the values in ARGN (N) and check whether each one matches a marker in the option (M), single value (O) or multi value (P) arguments. Since every value is compared against every marker, it’s O(N(M + O + P)). It gets really slow when you start passing the contents of long lists as the “value” to a multi-value token.
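
A rough sketch of that loop shows where the time goes. It is written in shell rather than CMake and is not taken from the actual cmake_parse_arguments implementation; the function name and the marker names are invented for illustration. The point is that every value which is not a marker gets compared against every marker name.

```sh
# Count how many string comparisons a naive keyword-argument parser performs.
# $1: space-separated marker names (option, single value and multi value
#     markers lumped together); remaining arguments: the call arguments.
count_comparisons() {
    markers=$1
    shift
    comparisons=0
    for argument in "$@"; do
        # Word splitting of $markers is intentional here.
        for marker in $markers; do
            comparisons=$((comparisons + 1))
            if [ "$argument" = "$marker" ]; then
                break
            fi
        done
    done
    echo "$comparisons"
}

# Three markers and five call arguments, two of which are markers themselves.
count_comparisons "VERBOSE NAME SOURCES" SOURCES a.c b.c c.c NAME
```

Each plain value such as a.c costs one comparison per marker, so a file with thousands of lines passed as values multiplies the work by the number of markers the function declares.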

As an example, I just finished doing some profiling on a project I was working on, where CMake was taking a long time to run. Profiling indicated that cmake_parse_arguments was taking 38 seconds to run, which is absurdly long. I was calling cmake_parse_arguments to pass each line from a file I had just read using file (STRINGS ...). It so happened that this file can be quite lengthy in some circumstances, which meant that cmake_parse_arguments had to do a lot of needless parsing. It was just faster to pass the filename in the end and open it in the local function. Making that change cut runtime to a few milliseconds.

As a general guideline, I now think that cmake_parse_arguments should probably be used sparingly, when you don’t expect callers to give you a huge number of arguments. The way it works was always inherently going to be quite CPU-intense. If you’ve got a slow-running project, then passing too much stuff to cmake_parse_arguments may well be the culprit.

Creating mini-distribution containers with “fake” root access on travis-ci

For most of the projects I’ve started in the last two years, I’ve been using a service called Travis CI. Travis CI is a continuous integration service which runs in the cloud and is free for open source projects. It’s also really easy to set up – just specify a list of steps (as shell commands) in a .travis.yml in the root directory of your project. If they all return with a successful exit code (i.e., 0), then your build is considered passing. Travis CI runs the same script on every new revision of your project and on all its pull-requests. It ties into GitHub’s status notification API and is just all-round super useful.

What makes Travis CI so easy to use for all kinds of projects is that for each build you get an OpenVZ virtual machine with full root access, based on Ubuntu 12.04 LTS. You can use apt to install all your dependencies, add new repositories to your heart’s content, download arbitrary files, execute arbitrary code, etc.

Moving to Containers

One of the big downsides of virtual machines though is that they impose a considerable amount of overhead. In order to provision them on demand, you need to have a bunch of running instances in memory that you can dispatch jobs to. Once a job is completed, you need to roll back the disk image to an earlier state, kill the instance and clone a new one from an existing “clean” image. Keeping all these virtual machines around consumes a non-trivial amount of resources and in the cloud that costs money. We’ve recognised that virtual machines are not really the way to go for the future of the cloud for a little while now, and more lightweight “container” solutions like Docker and LXD are seeing increased adoption.

Container based solutions are kind of like a “chroot-on-steroids”, in that they provide a way to run a (more or less) isolated user-space on top of an existing kernel, just like any other process. There’s very little overhead involved. Travis CI recently started rolling out infrastructure based on Docker, where builds can be provisioned in seconds as opposed to minutes. I’ve tested this on some of my own projects and it really is true – in non-peak times builds have been provisioned within five to ten seconds of pushing code, and in peak times, about 30 seconds. That is an insanely good turnaround time for continuous integration.

Problems with Containers

The caveat with container based solutions, however, is that everything runs as a much more restricted user. You don’t have access to sudo and as such you don’t have access to tools like apt. This makes doing a lot of the build tasks which were easy to do on the OpenVZ based infrastructure almost impossible on containers. Travis CI has suggested using precompiled binaries uploaded to S3 and downloaded as part of your build process as a replacement for apt for the time being. That’s not really an ideal solution, especially when you want to track a rolling package release cycle.


I was quite keen on switching over as many of my builds to the container based infrastructure as possible. But the lack of root access was going to be a bit of a problem as most of my builds require the installation of packages outside the default install set.

I initially had the idea of using debootstrap to create a chroot inside the container where I could install my own stuff, just like how pbuilder works. Unfortunately both chroot and debootstrap require root access.

I did, however, come across another interesting project which could fill the niche quite well. PRoot (short for ptrace-root) is a project that uses the Linux ptrace system call to hook system calls and effectively pretend to be the root user operating on another root filesystem. This works quite well in the majority of cases – applications think that they are running as the root user and also believe that the directory you pass to the proot command is the root directory.

Most Linux distributions ship a “minimal” or “core” version, usually a few megabytes in size, which contains the bare necessities to bootstrap and install packages, but is otherwise a fully-functioning, booting filesystem. This can be extracted to a subdirectory and used directly with proot. An added bonus is that the proot authors have added support for Qemu user space binary translation, which means that you can download a distribution root filesystem for another CPU architecture and have its code dynamically translated to run on the host architecture directly.

Using proot, it is possible to create a mini-distribution where apt can be used to install whatever packages you want, and to run and link to the resulting packages inside the mini-distribution. This was perfect for use with travis-ci’s containers.
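
As a minimal sketch of the idea, assuming a distribution’s minimal root filesystem has already been extracted into a directory, a small wrapper around proot could look like the following. The function name, the PROOT override and the ./rootfs path are all hypothetical, and this is not how polysquare-travis-container itself is implemented.

```sh
# Run a command inside an extracted root filesystem using proot.
# -S is proot's shorthand for "pretend to be root (-0) inside this rootfs (-r)".
# PROOT can be overridden (for example with "echo") to inspect the command
# line that would be run without having proot installed.
run_in_rootfs() {
    rootfs=$1
    shift
    "${PROOT:-proot}" -S "$rootfs" "$@"
}

# Hypothetical usage, assuming a minimal Ubuntu root filesystem was
# extracted to ./rootfs beforehand:
#   run_in_rootfs ./rootfs apt-get update
#   run_in_rootfs ./rootfs apt-get install -y cmake
```

Because the command after the rootfs path is resolved inside the mini-distribution, apt-get here would be the guest’s copy and the packages would land under ./rootfs rather than on the host.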

Incidentally, Travis CI also enabled build caching for projects running on the container based infrastructure. This means that you can cache the directory the mini-distribution was created in between builds to avoid having to download and install packages in it all the time.

Introducing Polysquare Travis Container

I wanted to make this functionality easy to use for people looking to move to the container based infrastructure, so I’ve created a project called polysquare-travis-container on GitHub. It isn’t available on PyPI, but you can install it with the following:

pip install git+

Two commands are available. The first, psq-travis-container-create, allows you to create a mini-distribution in a specified directory. It automatically downloads proot and qemu for your CPU architecture. The --distro option and CONTAINER_DISTRO environment variable allow you to specify the name of a Linux distribution to use (Ubuntu, Fedora, Debian). The --release option and CONTAINER_RELEASE environment variable allow you to specify the name of the release to use. The --arch option and CONTAINER_ARCH environment variable are used to specify a target CPU architecture. You can also use --repositories PATH_TO_FILE and --packages PATH_TO_FILE to specify files containing lists of repositories and packages to be installed inside that mini-distribution.

If a container exists in the specified directory with that configuration, it will be retained and nothing will be re-downloaded. This allows you to seamlessly integrate the mini-distribution with the caching system.

psq-travis-container-exec can be used to execute commands inside a container. It reads the same options and environment variables as psq-travis-container-create as well as an optional --cmd to specify the command to run. The command is looked up in the mini-distribution’s PATH, so --cmd bash would run the mini-distribution’s version of bash and not the host’s.

This is what it looks like on your build output:

✓ Using pre-existing proot distribution
Configured Distribution:
 - Distribution Name: Ubuntu
 - Release: precise
 - Architecture: amd64
 - Package System: Dpkg
✓ Using pre-existing folder for distro Ubuntu precise (amd64)
✓ Container has been set up in /home/travis/container

Concluding Notes

I hope this project is useful to anyone who was thinking about moving to the new container based infrastructure after it was announced late last year. I’ve already started using it for one of my own projects (which I’ll post about later) and I plan to move many more to it in future.