New Old Buffers

The GLX_EXT_buffer_age extension is gaining some traction. Now that a spec has been published for GLX (and EGL), implementations have started to appear, first in the NVIDIA driver and now as patches for Mesa. In short, these patches allow compositing window managers to retain high performance with low frame latency and zero tearing.

edit: I’m advised that the mesa patchset is EGL-on-wayland/KMS only at the moment.

For the curious, here’s an explanation of how all of this is done.

OpenGL wasn’t written with compositing window managers in mind. Instead, it was designed for 3D graphics software (think CAD) and games. Typically, these applications ran full-screen and used 3D models which needed to be redrawn completely on every frame.

A common problem in graphics software is tearing, which happens when the process stuffing pixels into memory isn’t synchronised with another process that pulls those pixels out of memory and on to a display device of some sort. A race condition can occur where the display device shows a half-finished image because the first process hasn’t finished writing it.

Double buffering was mostly designed to solve this problem. But it comes at a cost, either in memory usage and performance, or in latency. How? Let’s see:

The Problem:

The problem that double buffering presents is quite simple: you have two buffers, and a monitor running at X Hz. The application fills buffer A, marks A as the “front” buffer and then fills up buffer B, all before the monitor even has a chance to display A. What now?

Solution 1: Wait around for the monitor

This introduces latency into your application. It essentially means you have to block until the next vertical blank period, and then start rendering into buffer A again. One of the advantages of this approach is that you already know what’s in buffer A, so if only a little bit changed, you can just update that little part. Unfortunately OpenGL never provided a deterministic method of knowing that buffer A actually has what you might expect in it (the specification says it is “undefined”, and we’ll see why soon).

As such, most applications tend to emulate this approach by never actually doing buffer “flipping”, but instead copying data from the back-buffer into the front-buffer just after the monitor has displayed the last frame. So it might look something like this:

glFinish (); // Wait until the backbuffer is completely filled; this blocks the CPU
unsigned int oldCount = counter; // Frame counter

// Wait until the monitor has displayed the last frame. glWaitVideoSync is
// shorthand here for glXWaitVideoSyncSGI from GLX_SGI_video_sync.
do
    glWaitVideoSync (1, 0, &counter);
while (oldCount == counter);

// draw directly to front buffer
glDrawBuffer (GL_FRONT);
glReadBuffer (GL_BACK);
// copy from backbuffer to front buffer, either using hardware blitting with glXCopySubBufferMESA / eglPostSubBufferOES or texturing with glCopyPixels

// start drawing next frame to backbuffer
glDrawBuffer (GL_BACK);

This is what compiz did until earlier this year. As far as I’m aware, this is how mutter and kwin also do it, and for good reasons.

Unfortunately this approach has its problems:

  • Doesn’t offer true tear-free rendering: glXCopySubBufferMESA, eglPostSubBuffer and the fallback involving glCopyPixels are not atomic operations. What happens if someone interrupts your process halfway between these? Then you get tearing.
  • Have to block the CPU: There are two places we have to block the CPU here: in the first glFinish call and also in glWaitVideoSync. In the worst case scenario, you might just miss the next frame and have to wait a whole other frame for glWaitVideoSync. That’s no good.
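
To put a number on that worst case, here’s a minimal sketch of how long a blocking wait lasts depending on when rendering finished relative to the previous vertical blank. The function name and the use of microseconds are my own choices for illustration, not anything from compiz or GLX, and it assumes a perfectly regular refresh period.

```c
#include <assert.h>

/* How long we block, in microseconds, if rendering finished `elapsed_us`
 * after the previous vertical blank on a display that refreshes every
 * `period_us` microseconds. Hypothetical helper, for illustration only. */
long wait_for_vblank_us (long elapsed_us, long period_us)
{
    assert (period_us > 0 && elapsed_us >= 0);

    /* Time remaining until the next multiple of the refresh period */
    return period_us - (elapsed_us % period_us);
}
```

At 60 Hz the period is roughly 16667 microseconds. Finish at 16000 and you wait about 0.7 ms; finish at 16700, just after the vblank, and you sit there for almost a whole frame.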

Solution 2: Render lots of frames, and let the GPU handle them

As I mentioned earlier, glXSwapBuffers does not give you a guarantee of a defined backbuffer. That means that you risk garbage on screen unless you touch every single pixel in that backbuffer.

The reason for this is twofold.

First of all, for frames A, B and C, you might render A and B before the monitor has even gotten a chance to display A. Now in a double buffered scenario, you have two choices: wait around until A is available, or grab a new buffer C and start using that. The committee that designed OpenGL recognized that waiting around for A is a sub-optimal solution, so they allow implementations to just give you “an available frame”, whether that be A or C. Of course, C does not contain A’s contents, so it is “undefined”.

Secondly, if your window is resized, then you get entirely new buffers to render into, so it doesn’t make any sense to rely on the backbuffer being defined after glXSwapBuffers in this case.

That being said, the true-double-buffering approach has two key advantages:

  • Zero tearing: The front-to-back swap is truly atomic – changing a pointer. The monitor gets either the old frame or the new frame, and nothing in between.
  • No waiting around: Because the GPU is catching up with the CPU sending it commands, it can just put frame A as the “front buffer” when ready, and then frame B, and then frame C, since it intimately knows the monitor vertical blank timings.
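
The “changing a pointer” part can be modelled in a few lines. This is only a toy sketch in plain C: the SwapChain type and its functions are made up for illustration, and a real driver does the flip in hardware at vertical blank rather than in application code.

```c
#include <assert.h>
#include <stdint.h>

enum { WIDTH = 4, HEIGHT = 4 };   /* tiny "screen", just for the sketch */

typedef struct {
    uint32_t pixels[WIDTH * HEIGHT];
} Buffer;

typedef struct {
    Buffer  storage[2];
    Buffer *front;   /* what the display scans out */
    Buffer *back;    /* what the application draws into */
} SwapChain;

void swap_chain_init (SwapChain *sc)
{
    assert (sc != 0);
    sc->front = &sc->storage[0];
    sc->back  = &sc->storage[1];
}

/* Fill the whole back buffer with one colour */
void draw_frame (SwapChain *sc, uint32_t colour)
{
    for (int i = 0; i < WIDTH * HEIGHT; i++)
        sc->back->pixels[i] = colour;
}

/* The flip: two pointers are exchanged and no pixel is copied, so the
 * display only ever sees a complete old frame or a complete new frame. */
void swap_buffers (SwapChain *sc)
{
    Buffer *tmp = sc->front;
    sc->front = sc->back;
    sc->back  = tmp;
}
```

Note that after swap_buffers the new back buffer still holds whatever was drawn two frames ago, which is exactly why its contents can’t be assumed to match the last frame.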

After some consideration, this is the approach we decided to go with in compiz earlier this year.

But what about the undefined backbuffer problem? How do we avoid redrawing everything on every frame?

Well, there’s a clever way around that, albeit at a slight performance hit: you can “emulate” a defined backbuffer by having ownership over the buffer that you render into, e.g. a framebuffer object.

// we're going to read out of it later
glBindFramebufferEXT (GL_READ_FRAMEBUFFER, someFBHandle);
glBindFramebufferEXT (GL_DRAW_FRAMEBUFFER, someFBHandle);
// render scene, with the knowledge that the framebuffer contains exactly the same contents as the last frame

// we're done here, render to the backbuffer again
glBindFramebufferEXT (GL_DRAW_FRAMEBUFFER, 0);
// Do a fast "blit" operation from the framebuffer object to the backbuffer and redraw every pixel there
glBlitFramebufferEXT (0, 0, screenW, screenH, 0, 0, screenW, screenH, GL_COLOR_BUFFER_BIT, GL_LINEAR);
// This frame is done, ask the GPU for a new one (dpy and drawable stand for your X display and GLX drawable)
glXSwapBuffers (dpy, drawable);

This approach is one of many ways to handle this problem. Another involves doing the buffer swap and then scraping the contents of the frontbuffer on to the backbuffer, but I haven’t shown it because it requires the CPU to block and may involve copies into system memory. Framebuffer operations all happen on the GPU.

Of course, the problem here becomes fill rate. GPUs can only touch a certain number of pixels on every frame, and doing it this way means that you have to touch WxH pixels on every frame, no matter how little actually changed. Once you exceed the fill rate, things slow down in a linear fashion and you start missing frames.
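
As a rough back-of-the-envelope sketch of what WxH on every frame costs, the helper below (a name I made up) just multiplies out the pixels written per second. The resolutions are example figures, not measurements from compiz.

```c
#include <assert.h>

/* Pixels written per second when a width x height area is redrawn on
 * every frame at the given refresh rate. Hypothetical helper. */
unsigned long long pixels_per_second (unsigned width, unsigned height,
                                      unsigned refresh_hz)
{
    assert (refresh_hz > 0);
    return (unsigned long long) width * height * refresh_hz;
}
```

A full 1920x1080 blit at 60 Hz comes to 124,416,000 pixels per second, while redrawing only a 2x16 blinking cursor at the same rate is 1,920. The blit cost is paid whether or not anything on screen changed.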

Enter Old Buffers

What GLX_EXT_buffer_age allows us to do is query the age of the backbuffer in terms of frames. This essentially means the old problem, where the backbuffer was undefined after glXSwapBuffers, is now gone. It’s done by querying the GLX_BACK_BUFFER_AGE_EXT attribute of the drawable with glXQueryDrawable after a swap, and doesn’t involve stalling the CPU because it’s all stored in the OpenGL context on the client side. An age of 0 means the contents are undefined, so everything has to be redrawn.

But what does knowing the age of a backbuffer mean for us?

Let’s say that we have a window with four equal sections, A, B, C and D. The render loop for this window is really simple: we just draw into A on frame 1, then B on frame 2, C on frame 3, D on frame 4 and then back to A on frame 5 and so on.

We want the effect here to be cumulative. So frame 1 should have A rendered, then frame 2 A and B and so on.

If the backbuffer was completely undefined, on frame 1 we’d render A, (nothing), (nothing), (nothing) and then on frame 2 we’d render A, B, (nothing), (nothing) etc.

If the buffer age is 1 (e.g. last frame) on frame 2, we already know that A, (nothing), (nothing), (nothing) is there, so we just render B.

If the buffer age is 2 on frame 3, then we know that A is there, but B is not yet there, so we render B and C.

If the buffer age is 2 on frame 4, then we know that A and B are there, but C isn’t, so we do C and D.

If it’s 3 on frame 4, then we render B, C and D, since only A is there.
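
The cases above can be sketched as one small function. This is an illustration of the four-section example only, not compiz code: the names are made up, frames are numbered from 1, sections are returned as a bitmask (bit 0 = A, bit 3 = D), and an age of 0 is treated as “contents undefined”.

```c
#include <assert.h>

enum { NUM_SECTIONS = 4 };   /* A, B, C, D */

/* Section drawn on a 1-based frame: frame 1 -> A (0), 2 -> B (1), 5 -> A */
int section_for_frame (int frame)
{
    assert (frame >= 1);
    return (frame - 1) % NUM_SECTIONS;
}

/* Bitmask of the sections that must be rendered on `frame` when the
 * backbuffer we were handed has the given age. */
unsigned sections_to_render (int frame, int buffer_age)
{
    assert (frame >= 1 && buffer_age >= 0);

    /* A buffer of age n holds the contents of frame (frame - n), so every
     * section drawn since then is missing. Age 0 (undefined), or an age
     * reaching back before frame 1, means nothing useful is in there. */
    int first = (buffer_age == 0 || buffer_age >= frame)
                    ? 1
                    : frame - buffer_age + 1;

    /* Never need to look back further than one full A..D cycle */
    if (frame - first + 1 > NUM_SECTIONS)
        first = frame - NUM_SECTIONS + 1;

    unsigned mask = 0;
    for (int f = first; f <= frame; f++)
        mask |= 1u << section_for_frame (f);

    return mask;
}
```

For example, sections_to_render (4, 3) gives 0xE, which is B, C and D, matching the last case above.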

Drawing individual regions of the screen works in a similar way in compiz. Ideally when a blinking cursor changes, we only want to redraw just that cursor and nothing else.

Having buffer_age means that we can now get rid of those costly framebuffer binds, and that full-screen glBlitFramebufferEXT on every frame, whose cost scales with the size of the screen rather than with what actually changed. Problem solved.

Other odds and ends

I’ve been continuing work on making this work well with compiz and unity, since I believe that so far, it’s our biggest performance bottleneck. Part of making sure that works is that when drawing stuff into the backbuffer, you actually have to mark it as “damaged”, so we can redraw it properly if we get a frame that’s older. Unfortunately, there were some plugins *cough*unity*cough* that didn’t respect this, and just redrew everything to the backbuffer regardless of the damage region. So I’ve been busy hacking support for partial redraws back into nux and unity (it used to be there in previous versions, but was removed once we switched to method 2 earlier, because this problem would cause bleeding and other nasty artefacts).
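
To show why the damage marking matters, here’s a rough sketch of keeping a per-frame damage history and working out what to repaint for a backbuffer of a given age. It is not the compiz implementation: all names are invented, the history depth of 4 is an arbitrary assumption, and damage is simplified to one bounding rectangle per frame where a real compositor would use proper regions.

```c
#include <assert.h>

enum { HISTORY = 4 };   /* past frames of damage remembered (an assumption) */

typedef struct { int x1, y1, x2, y2; } Rect;   /* empty if x1 >= x2 or y1 >= y2 */

typedef struct {
    Rect damage[HISTORY];   /* [0] = current frame, [1] = previous frame, ... */
    int  screen_w, screen_h;
} DamageHistory;

int rect_is_empty (Rect r)
{
    return r.x1 >= r.x2 || r.y1 >= r.y2;
}

/* Bounding box of two rectangles; an empty one contributes nothing */
Rect rect_union (Rect a, Rect b)
{
    if (rect_is_empty (a)) return b;
    if (rect_is_empty (b)) return a;

    Rect r = { a.x1 < b.x1 ? a.x1 : b.x1, a.y1 < b.y1 ? a.y1 : b.y1,
               a.x2 > b.x2 ? a.x2 : b.x2, a.y2 > b.y2 ? a.y2 : b.y2 };
    return r;
}

/* Anything drawn must be reported here, or older buffers go stale */
void damage_history_add (DamageHistory *h, Rect r)
{
    h->damage[0] = rect_union (h->damage[0], r);
}

/* Call once after each buffer swap: age every entry by one frame */
void damage_history_advance (DamageHistory *h)
{
    for (int i = HISTORY - 1; i > 0; i--)
        h->damage[i] = h->damage[i - 1];
    h->damage[0] = (Rect) { 0, 0, 0, 0 };
}

/* Area to repaint into a backbuffer of the given age */
Rect damage_history_repaint (const DamageHistory *h, int buffer_age)
{
    assert (buffer_age >= 0);

    Rect out = { 0, 0, 0, 0 };
    Rect full = { 0, 0, h->screen_w, h->screen_h };

    /* Undefined contents, or older than we remember: redraw everything */
    if (buffer_age == 0 || buffer_age > HISTORY)
        return full;

    /* Age n is missing the damage of this frame and the n - 1 before it */
    for (int i = 0; i < buffer_age; i++)
        out = rect_union (out, h->damage[i]);

    return out;
}
```

A plugin that draws without calling something like damage_history_add leaves a hole in the history, which is where the bleeding and artefacts come from when an older buffer comes back around.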

I’ve mostly got it fixed up now, so I’d appreciate some more testing. And I’m sure you’ll appreciate the FPS increase 🙂


37 thoughts on “New Old Buffers”

    1. It’s raring-only at this point.

      There are old packages in there for Q, and that’s a bit misleading, so I’ll delete them now.

      edit: I’ve rebuilt the packages for Q, hopefully they’ll install this time.

      1. Sam, the ppa still doesn’t work for quantal:

        The following packages have unmet dependencies:
        libunity-core-6.0-5 : Depends: libunity-protocol-private0 (>= 6.90.2bzr205pkg0quantal25) but 6.12.0-0ubuntu0.1 is installed.
        unity : Depends: libbamf3-1 (>= 0.4.0) which is a virtual package.
        Depends: libunity-protocol-private0 (>= 6.90.2bzr205pkg0quantal25) but 6.12.0-0ubuntu0.1 is installed.

        Can you fix this please? Thanks for your hard work!

        1. Hmm. Do you have the unity-staging ppa enabled? (ppa:unity-team/staging)

          compiz-experimental depends on staging. Unfortunately I don’t have a quantal installation to check it on right now.

          The output of your apt-cache policy libunity-protocol-private0 would be useful too.

          1. I didn’t have the unity-staging ppa. I enabled it and I can install. I still have dependency problems with libbamf3-0, libbamf3-1 and bamfdaemon, but I think this has something to do with the unity ppa and not yours. Anyway, I can install compiz if I remove the indicator-appmenu package. I can live without global menus for a few days while I test compiz.

            Great job Sam!

  1. I have been testing your ppa very intensively and I experience a few things I cannot explain. I would be glad if you could tell me what could be the reason for this strange behavior. I was not even able to determine exactly under which circumstances it happens:
    Sometimes I still experience tearing effects in videos. I even booted Windows 7 to determine if the same effects occur there and they did not. First I thought it was a flash issue, but then I saw the same effect for html5 videos. I often quit full screen and afterwards watched it again fullscreen and suddenly I was not able to see the tearing effects on the same time in the video. But I also experienced some tearing effects which always occurred at the same time in the videos.
    According to your explanation tearing should not be possible. How to find an explanation for those effects?
    It does not matter which applications I play the videos with, with your true-double-buffering approach tearing should not be possible, or am I wrong?
    Another issue I found was, that I was not always able to get and leave fullscreen with google chrome browser (I have not tested firefox though) for both flash and html5 videos. Sometime the launcher and the panel are still visible. But same as tearing effects it does not always happen. If for example I minimize and afterwards again maximize the app then fullscreen suddenly works.
    The same goes for closing the system settings window. Sometimes while I close it I can see black areas around the window, but I am not able to determine a clear instruction to reproduce it.
    I experience these problems on the latest Nvidia 313.x drivers.

    1. I’m not sure about the black areas around the decorations upon closure. I suspect that might be a driver problem, but I’d have to look into it in more detail. Thanks for the feedback.

  2. In fact, if I use Firefox I am not able to see any tearing effects (Though, as you can see by the time elapsed since my first comment, I have not tested it heavily).
    So the main question stays: Could tearing be still possible with the new approach? Does it depend on the application used to play the videos?

    1. Are those videos fullscreen?

      If the window is unredirected then the compositor doesn’t have any control over it. We should be redirecting fullscreen video windows but we don’t cover all the cases yet.

      Try disabling “unredirect fullscreen windows” in the composite plugin.

      1. That being said, there could be a bug or two in the fullscreen unredirection implementation which was caused by a change I did in this branch. I’ll have a look into it – thanks.

      2. 1. Yes the windows were fullscreen. I will try to disable the unredirect fullscreen windows and report back.
        2. The black area problem only occurs for certain windows, namely every gnome control center window, meaning gnome control center and all its subcategory windows such as sound settings etc. All other windows have closed correctly so far.
        If I ask myself what makes control center and all its subcategories unique, I notice that they are not resizeable. So I am assuming this behavior happens for all windows which are not resizeable, but I have not tested this.
        I suppose if it was a driver issue, it would happen for every window.

        Another bug I can confirm for the ppa is :
        and :

        I tested placing windows very intensively and also had a look at bugs tagged with compiz-experimental-ppa and among them one issue is worth mentioning:
        which seems to exist since unity was invented (this bug annoys me since 11.04).
        I can also confirm the comment added by “Esokrates”.
        If my programming skills keep getting better I hope to be able to help with fixing bugs some day.
        Thank you for your responses!

        1. I think I saw bug 1037164 the other day.

          I’ll have a look into it. This ppa touches that area of the code, so I wouldn’t be surprised by regressions happening there.

          1. I have taken videos demonstrating the most annoying bugs every user will experience after a short time (Note: no shortcuts were used except in video 1, changing workspaces):

            The first video demonstrates bug 1037164:


            I have to add something to the bug report: If the window is placed in a corner and I use the workspace switcher in the launcher, the window gets placed on the workspace I switch to, as shown in the video. However, if I switch by shortcut, and then access the window with the launcher, the window gets moved to the workspace it was accessed from, so the behavior is not exactly the same as for the semi maximized windows.
            I noticed that the bug also occurs for “up and down”, but only in a special case as shown in the video: if I open the application and semi-maximize it, the bug does not occur, but if I close the application and reopen it, so that it has the same position, and I access it from the upper workspace, the same behaviour is shown as for the right-left issue. (Note: the window has to have exactly the same size after reopening, otherwise the bug does not seem to appear. You cannot reproduce it with a terminal window.)

            The second video demonstrates bug 1093757:


            This bug consists of two issues:
            1. The space issue is only true for specific applications such as the terminal.
            2. The maximize issue is true for all applications. It only occurs for windows snapped into corners and semi-maximized windows (but here in a form similar to the one shown in the next video, bug 1093767).

            The third video demonstrates bug 1093767:


            The fourth video shows another bug I noticed for your ppa:


            Here you can see the overlay scrollbar from the dash (it occurs in exactly the same position) if you access the workspace switcher after you opened the dash in a category where you can scroll.

            The fifth issue is a blur issue. The video was taken after a system startup:


            You can see, on a freshly booted system, blur works, but after opening the dash, the blur only shows how it looked when the dash button was clicked.

            The sixth video demonstrates the black area issue mentioned by David:


            In order to make the experience for 13.04 optimal, it would be nice if the first three bugs would get fixed, especially the first mentioned problem, as it interferes with effective working.
            I have noticed another issue, but I have to determine exactly when it happens:
            Sometimes a window gets black during the minimize animation:


            1. Thanks for the feedback.

              Can you put this commentary and video links in each of the relevant bugs you mentioned? Makes my life a bit easier. I’ll get on to them as soon as I have time.

              As for the blur thing – what hardware are you running on?

      3. What do you mean by saying:
        “edit: I’m advised that the mesa patchset is EGL-on-wayland/KMS only at the moment.”?
        Is the zero tearing approach now implemented for compiz or not? Sorry to ask, but English is not my first language and I am not so familiar with compositor stuff.

        I tried to disable unredirect fullscreen windows as you advised me and tearing completely disappeared for the videos using chrome. Great work!

        “We should be redirecting fullscreen video windows but we don’t cover all the cases yet.”
        Does this mean that disabling the “unredirect fullscreen windows” is experimental right now, but definitely a long term goal?

    2. Ah, sorry, I missed the last part of your comment.

      Yes, the application used to play videos matters a lot. At the moment, we blanket unredirect every fullscreen window. There is a blacklist for windows we don’t unredirect. That includes Firefox and Flash. Unfortunately, there isn’t really a way to detect if a fullscreen window is a game or a video, though I’ve heard there’s an EWMH spec around for that.

  3. I am running on Dell Precision M4600.
    CPU: Intel(R) Core(TM) i7-2820QM
    Graphics card: GF108GLM [Quadro 1000M]
    (driver 313.09)
    Do you need further information? If so, I will link you the output of lshw.

    When I have the time I will put all that information in the bug reports.

  4. I ran a series of tests in order to reproduce and get more information about the bugs. All results were verified by reinstalling ubuntu.
    I did not open new bug reports (but added the information to existing bugs, as you asked me), if you wish me to open new bug reports, please contact me.
    NOTE: Tested drivers: (NVIDIA 304.43, 310.14, 313.09 and latest Nouveau driver included in Raring)

    Compiz bugs > 0.9.9
    Those problems are driver independent (tested for nouveau and nvidia proprietary) and are true for both, official compiz 0.9.9 version and compiz-experimental-ppa:

    Problem 1: bug 1037164

    Another problem shown in video 1 is the following (But I added it in my description to bug 1037164):

    Problem 2: bug 1093757

    Compiz bugs compiz-experimental-ppa
    Those problems are driver independent (tested for nouveau and nvidia proprietary) and are only true for compiz-experimental-ppa:

    Problem 3: bug 1093767

    Problem 4: (Overlayscrollbar is shown in workspace switcher, after a scrollable dash lens was opened)

    Another bug I have found: Watching videos fullscreen does not always work, sometimes the fullscreen is not in foreground, meaning panel and launcher do not disappear:

    Compiz bugs compiz-experimental-ppa NVIDIA
    Those problems only happen using compiz-experimental-ppa and nvidia drivers:

    Yet another bug I ran into: if I log in on one of the ttys (e.g. Ctrl + Alt + F1), run a command so that the screen gets filled up, and switch back (e.g. Ctrl + Alt + F7), then the panel looks like this:

    Compiz bugs NVIDIA general
    These problems occur under all tested nvidia drivers:

    This is maybe bug 729979:
    The minimize animation behaves differently for nvidia drivers than for nouveau. (This is the main issue, the nvidia animation is way faster.)
    The minimize animation is ALWAYS working if the window was just snapped to fullscreen:
    The video shown in the last description shows black areas, which is only true for 313.09 drivers:
    The bug is not easy to reproduce, after some time the animation started kind of working (not always and not 100% correctly, but much better than shown in the 2 videos before). How is this possible? Could this be due to cache entries etc.? I ran “cd ~ && find | grep compiz | xargs rm -rf” (to remove all cache configs etc.) but was not able to reproduce the behaviour shown in
    NOTE: The last sentence describes the experience of one (freshly installed) setup. On the second freshly installed setup (same hardware) the animation did not suddenly start working yet. If so I will report it.

    Compiz bugs NVIDIA 313.09 drivers:
    Those problems are only true for nvidia 313.09 drivers using compiz-experimental-ppa:

    Problem 5: (Addition: Blur works after logging in and out, until Dash, HUD or Alt-F2 dialog is accessed)

    Problem 6: (Closing windows sometimes shows black areas)

    see “Compiz bugs NVIDIA general” (minimize animation gets black)

    1. I’ve fixed the problems you identified (uploading to the ppa now), except for:
      * black areas: since this happens on both versions, I suspect this is a driver change related to how destroyed pixmaps are handled. I’ll have a chat to the NVIDIA developers about it.
      * overlay scrollbar: I had a look at the code and it seems like the overlay scrollbar stuff is doing things it shouldn’t be doing. I’ll talk to the author about it.
      * minimize animation: it’s not really related to the changes I’m making here.

      I fixed the other two problems you identified (#1 and #2), but they weren’t really related. I was looking for regressions particularly with these changes, but I fixed those two anyways since they were relatively straightforward.

  5. Awesome work, thank you!
    I have already tested the ppa, but unfortunately there are a few issues left:

    One Issue left from Problem 2 (bug 1037164):
    Here the behavior changed a little bit:
    After maximizing, the window does not get placed correctly in the upper corners by using the shortcut. (e.g. KP9). (This is all that is left of the problem.)

    This problem only happens using compiz-experimental-ppa and nvidia drivers (all tested versions):
    Log in on one of the ttys (e.g. Ctrl + Alt + F1), run a command so that the screen gets filled up (for example find) and switch back (e.g. Ctrl + Alt + F7); then the panel and launcher look like this:
    (Sometimes it is not even necessary to log in or run a command)
    This is one of the regressions you are looking for, as I was not able to reproduce it on other compiz builds.

    The blur issue for 313.09 drivers still remains. (I mention this because you did not include it in the exception list in your comment, but I think you are aware.)

    Bug 1095688:
    The bug description is not correct; this part of the issue is already fixed, and I will correct this.
    As for the terminal issue, I will post a detailed comment in this bug report. It is not only the spacing issue: using the shortcuts (1, 2, 3, 7, 8, 9) each time after using KP5, the windows do not get placed correctly at all (which is definitely a bug, so please wait with “Won’t fix”). However, using the maximize button it works (with the spacings, of course).

    As for the minimize problem: I mentioned it, because it is assigned to you on the launchpad page.
    If you start fixing one day, please remember one important thing: The animation works as expected if you maximize the window before (After minimizing you have to maximize again, although it is already maximized, in order to get the animation working). I will add this in the bug description.

  6. Hi, Sam. In your explanation above of how the back buffer age is used, you suppose that a window can be divided into four sections, and that we can decide which sections to render from the age queried on a given frame. That is to say, we query the age of the back buffer before rendering a frame, and render based on the age we get.
    In your example the render results are cumulative, and this is where I am puzzled: are the buffers (back buffer/front buffer) of this window also divided into four sections from the start? If so, we cannot get the whole window until all four frames are finished, which sounds hard to understand. Or are there other implementations? I would greatly appreciate an explanation.

    1. I think the example he provided is simplified to try to illustrate how things work. My understanding of how it works in more detail is as follows (although not sure I am 100% correct on this):

      – Compiz tracks which regions of the screen have changed between each buffer swap. For example, it knows arbitrary rectangular regions Q, R, S have changed between frame 1 and frame 2. Nothing changed between frame 2 and frame 3. Region T changed between frame 3 and frame 4.
      (Note I am just labelling the regions so it is easy for me to illustrate–the regions are really just stored as rectangles bounding the area that has changed).

      – Now, say the back buffer age is 3 and we are trying to draw frame 5. Also, say Region U has changed between frame 4 and 5. We know that the back buffer has all the changes up to the end of frame 3 (since its age is 3). That means to update this back buffer to be frame 5, we have to modify it to have the changes from frame 3 to 4 (which means we have to overwrite region T with region T from our framebuffer) and from 4 to 5 (which means we have to overwrite region U with region U from our framebuffer).

      And of course, if some of the regions overlap, we can optimize this by splitting/combining the rectangles appropriately.

      If you have worked with software, you can almost think of it as having a piece of source code of a certain age, and then applying diffs to it to get it up to the current version.

  7. Awesome improvement. I think I am more sensitive than most when graphics are just not quite as fluid as they should be. This has been the case for compiz for quite a while, and has turned me off from using Linux. I have to say that this change makes a remarkable difference. Moving windows around, the expose effect (Win-W) feels very responsive and fluid. Thank you!!

