On the future of Graphics Drivers

Hi,

I’m sorry I haven’t blogged in a long time. This needs to be discussed in a separate post, but I wasn’t in a good headspace for the last 7 or so months. That’s changed now, and a particular issue came to mind that I feel deserves a blog post.

Ubuntu 12.04, Compiz 0.9.8 and Unity 5.12 are fantastic. My colleagues Daniel and Alan and I have invested a significant amount of time in refactoring the compiz code so that we can split it up and test it, writing test coverage for most of the fixes that are going in, improving performance through profile-guided optimization and fixing some of the most important architectural issues within compiz. We’ve streamlined the development process by merging a large number of branches into one source repository, and moved to test-driven development practices with a stringent review process.

This has made Ubuntu 12.04 fantastic. Fantastic, it seems, unless you are using the proprietary NVIDIA graphics driver.

Now I don’t want to make this a blog post hating on the NVIDIA driver, or their authors or on NVIDIA themselves. It has excellent support for on-GPU video decoding, GPGPU operations, OpenGL 4+ and so on and so forth. Recently it received support for XRandR 1.3. This should be applauded.

That being said, I believe the continued use of both the NVIDIA and FGLRX drivers within the Linux Desktop community is now considered harmful for a number of reasons. And now that we have realistic free software drivers to replace them such as nouveau and radeon, the free software community needs to reconsider its position in its support for these drivers.

The first problem is obvious, in that we are perpetuating a norm that providing proprietary software as a means to bootstrap free software is acceptable. Ethically, many people in the free software community can see why it is important that software is free (as in freedom). More importantly, hardware drivers are very large and control a significant piece of the stack used to make your system work. The command schedulers and display modesetting code, amongst other things, run as superuser and there is no way to see or change what they might be doing to your system. I don’t suspect that either NVIDIA or AMD is interested in deploying malicious software on their users; however, they can still inadvertently enable that – recently a security hole was fixed in the NVIDIA driver which allowed attackers to read and write arbitrary memory on your system. Because the drivers are closed, the free software community is powerless to do anything but rely on NVIDIA or AMD to ensure that we run systems that are secure.

However, my name is not Richard Stallman, nor am I a member of the FSF, and as such this isn’t the most important part of my argument.

The most important part of my argument is this:

The existence of different and proprietary implementations of OpenGL promotes a culture whereby we don’t engage with problems directly, but instead special-case particular drivers, hack around others and create a sub-par system for everyone.

If you’ve ever written code with me you’ll understand one thing – I hate writing hacks. Hacks are a short term solution which makes a long term problem worse. Hacks demonstrate that you haven’t given the problem your full attention because you don’t understand what the problem is. Hacks are like trying to shove the incorrect puzzle piece into the puzzle. If you build your reality based on that, you’ll end up with something that’s very fuzzy, not rigid and crumbles very easily.

Writing against proprietary drivers requires writing hacks, simply because there is a point where you can research no further into what the problem really is. For example, in one graphics driver, we found that changing the name of the program was necessary to ensure the driver used direct rendering. On other drivers, I’ve found that all future texture binding will fail silently if you release a pixmap on the server before releasing a texture on the driver and flushing the pipeline. Or on some drivers, if you don’t re-bind an offscreen pixmap before binding it to a texture unit, the driver will never flush changes to that pixmap. Or on some drivers if you resume from suspend, you lose framebuffer object contents. Or on other drivers, if you use glPushAttrib with GL_CURRENT_BIT, the previous values on the attributes stack don’t get carried over.

That isn’t even an exhaustive list, and every item on it has cost a lot of engineer time, wasted hunting down bugs we’re not allowed to see and trying to guess what’s going on so that workarounds or fixes can be applied.
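
To make the shape of these workarounds concrete, here is a minimal sketch of the pattern that tends to result: a table of per-driver quirks that the compositor consults at startup. Everything in it is a hypothetical illustration, assuming the driver is identified by matching its OpenGL vendor string. The vendor names and workaround names are made up for the example and are not taken from Compiz or from any real driver.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Quirk:
    # Substring matched (case-insensitively) against the driver's vendor string.
    vendor_substring: str
    # Name of the hack the compositor would switch on for that driver.
    workaround: str


# Hypothetical table: both the vendors and the workaround names are
# illustrative assumptions, loosely modelled on the bugs listed above.
QUIRKS: Tuple[Quirk, ...] = (
    Quirk("vendor a", "rebind_pixmap_before_texture_bind"),
    Quirk("vendor a", "release_texture_before_pixmap"),
    Quirk("vendor b", "restore_fbo_contents_after_resume"),
)


def workarounds_for(gl_vendor: str) -> List[str]:
    """Return the names of every hack that applies to the given vendor string."""
    vendor = gl_vendor.lower()
    return [q.workaround for q in QUIRKS if q.vendor_substring in vendor]
```

A driver that shares a common, open OpenGL implementation would simply get an empty list back. Every entry in a table like this is a bug that was guessed at from the outside rather than fixed at its source.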

Now, while I’m actually on leave for study, we’re at it again – the NVIDIA driver has removed support for GLX_MESA_copy_sub_buffer and disabled Sync to VBlank by default, which, amongst other reasons, is making rendering really slow on Precise. In addition to that, we’re fighting lots of strange quirks where texture binding is randomly failing.

The largest problem is that while I’ve owned an nvidia card for years, I’ve not run the nvidia driver in at least two. I’ve been using nouveau since early 2010. And now every time I do any work which might affect the graphics pipeline, I and every other developer have to make sure it works on nvidia and fglrx too, because it could just break without our knowledge.

The free software drivers, on the other hand, share the same libGL. The paradigm shift here is huge. Instead of “make a libGL that works on Windows for our driver and invest a bit of time making it work on Linux and Mac OS”, it is “make a libGL for Linux and make it provide a sensible abstraction layer to use many different kinds of hardware”. And the results pay off. When I write code that works on nouveau, I can be damn sure it will also work on intel and radeon too, because the OpenGL implementation is the same. And when it doesn’t work, we know that the problem isn’t in the OpenGL implementation, and we can drill right down into that driver and fix it in the right place.

In the future, when we move past X11 into a model where the drivers provide the same direct rendering APIs from the kernel to userspace, this is going to become even more important. Already we see that NVIDIA and FGLRX are not going to support EGL, KMS or Wayland.

I think it’s now time we ask ourselves this question: do we want to, in the name of short term gains in performance and higher level OpenGL support, hold back the lower levels of the free software world, which have the biggest advances to make, in order to support proprietary drivers, or do we want to eliminate our dependence on them and set ourselves free? I think that it is now becoming clearer that the latter is more and more important to the community. As such, I believe that as a community, we need to be taking greater steps towards supporting free software drivers, as they are the future.

36 thoughts on “On the future of Graphics Drivers”

  1. Thank you for this argument. I have dealt with nvidia bugs until today, but I’m switching to nouveau right now.

  2. Of course you have your reasons as a developer.
    But the free software world has been a mix of doing things that people use and coding exercises for developers.
    Unity/Gnome 3 for instance completely disregarded the then current users for their products. nouveau, I hear, doesn’t work with the FX series.
    So of course a completely free world is the ideal, but for that to happen all have to freely choose to be responsible to some standards that the free software community hasn’t reached yet.

  3. I strongly agree.
    But, unfortunately, currently we can’t discard either FGLRX or NVIDIA because of new hardware support, power-management and other critical features.
    (actually I heard that radeon devs target “0 day support” in a not so far future…)

    Who wants to setup a kickstarter to fund something like “decent power-management for radeon R500+” or similar? I would surely pay for that.

    1. I’d definitely be behind this. I don’t know exactly what the manpower requirements are in the nouveau and radeon projects (remember, more manpower or more money doesn’t exactly give you better code, it might just be that time is what’s required).

      I suspect if you ever wanted to do something like this, it would need to be in careful consultation with the developers behind those projects.

  4. “And now that we have realistic free software drivers to replace them such as nouveau”

    Dude, *if only*.

    Nouveau is most certainly NOT a “realistic” replacement. Not on the Thinkpad W510, not on two different desktops using a standard Twin Frozer 560Ti… not on a lot of systems, actually.

    Unless you consider not having power management, overheating laptops until they lock up or reboot, draining the battery at almost double the rate, and crashing the entirety of X at random intervals (killing your open programs and bringing you to a new LightDM session) as a “realistic replacement”.

    Sure, using it addresses strange, edge-case visual bugs… but at the cost of a stable system. The open ati driver has, for *years*, been much better overall than Catalyst, and the Intel drivers have always struck me as smooth and stable. And you bet your sweet bippy that if Nouveau does, one day, become stable enough to replace the binary blob, I’ll flip over in a New York minute. But right now? Right now, whenever anyone *seriously* suggests “just run Nouveau” as if it were an *actual solution* to GL problems in Linux, I just want to reach through my monitor, grab the speaker by the lips, and ask them what planet they are from.

    Because as of right now (just *yesterday*, in fact)? I fixed a system suffering horrible freezes, resets, video glitches, and constant overheats by installing the binary blob.

    (On a side note: I don’t seem to suffer poor rendering times in Unity using the “nvidia” driver. I haven’t seen the “white window” bug in a while, though that’s probably because I’m using cards with lots of memory, but I know that exists.)

    1. Sure, I don’t think “everyone should switch to nouveau or radeon right now”. That’s certainly not workable for everyone, and to be honest, I don’t think it ever will be workable for everyone. Likewise, “just use nvidia” or “just use fglrx” is not workable for everybody either.

      Incomplete power management sucks. I know, because I have to deal with a very noisy laptop ;-). And I am sure that when I build my replacement desktop in the next year or two, I’m going to be dealing with a very noisy radeon card.

      In the broader sense however, I do think that these drivers are now at a workable state where we should be carefully looking as a community about where we want to throw our support. In that sense, I would say that I don’t think we should be investing substantial time and energy in supporting the vendor provided drivers, and I would disagree, for example, with the continued recommendation that users install the nvidia or fglrx drivers “for support” in Ubuntu. I also don’t think that we should say “well, the nvidia or fglrx driver doesn’t support X or has this bug Y, so we can’t and shouldn’t do Z”.

      I guess it’s more of a “here is why we should change our policy” thing.

      1. I see where you’re going, but I just can’t agree. Or, rather, I can’t agree that now is the time to shift policy.

        Using Nouveau for a long-enough period of time could literally destroy hardware (due to heat). And if you ever want to get the gaming performance out of the gaming card you might have, there is no real choice here.

        I think the community *has to* spend substantial time and energy supporting the vendor drivers *until* things like that are worked out. I think it would be a horrible mistake for the community to say “well, if you want to run Unity without defects or freezes, you have to use this driver that might melt your lap, but on the upside: Common GL layer!”.

        (Slightly different topic: I wanted to say in my response above how fantastic Compiz and Unity are in 12.04. For a guy who, a little while ago, wasn’t sure he was the man for the job, you sure did pull out all the stops. This is seriously the most stable and slick Compiz I’ve used in a long, *long* time, and I say that from behind the nVidia binary blob.)

  5. Fair point, but pure-OSS software does have a fair bit of problems. Just look at the audio stack which is not hindered by proprietary factors – PulseAudio is still not an acceptable solution (it frequently dies for me while I’m gaming and I have to restart the game and PulseAudio to get sound back).

    It’s been several years since PA has been out, as well.

  6. Of fifteen different desktops running NVIDIA GPUs, the situation with Open Source drivers is dire. NEVER have I yet managed to get NOUVEAU to work. I have always found a binary blob that works, be it sometimes requiring much trial and error correction. Version changes in Xorg have often been deleterious to continued harmonious functioning of a specific driver blob.

  7. Some years ago I switched to intel graphics hardware because of the reasons you mentioned in your post. To my mind the overall desktop experience is simply better.
    Also I think the proprietary solutions are starting to hinder innovation. For example, the developers can’t just concentrate on wayland but have to find out how to keep the »system« running with »legacy« proprietary drivers. This is a waste of time and resources.

  8. I follow developments in Compiz, but I have never had a desktop that was enough of a ‘games’ machine to even _support_ Compiz at any level. My applications don’t require that level of graphics performance, and I have been forced to drop desktop environments which require Compiz or similar, including KDE 4+, GNOME 3+, and Unity. I believe one thing which should be stated in the hardware requirements list would be the _exact_supported_list_ of graphics hardware and/or drivers for a particular distribution, at least as an errata somewhere to enable new users and even more experienced users to avoid ‘bling’ if they don’t want to budget power and money for those environments. Granted, if you *already* are playing FPS or similar games, you can indulge the desktop environments which have large budgets for graphics engines.

    Of course my suggestion is *at least* as idealistic as your suggestions on supporting only Open Source drivers, because neither nVidia, ATI, nor Intel would wish to give away proprietary hardware information to allow really good Open Source drivers.

  9. Ok, so let’s look at this a little differently (and realistically, from the “User’s point of view”).

    1- Users who purchase fast computers and mid to high level graphics cards do it because of a need for graphics performance (else why not just stick with the oem intel card…).

    Clearly, they need the binary blobs. The open drivers are not (and may never be a viable solution)

    2- linux / Ubuntu is becoming a “game friendly platform” (lots of indies, kickstarters, soon Steam/source, unity3d and probably EA). So again the “open drivers” are not a viable solution.

    3- You mentioned wayland, etc. : , “Already we see that NVIDIA and FGLRX are not going to support EGL, KMS or Wayland…”

    That doesn’t mean it *Wont* happen. How can you be 100% certain of that? Linux is not as small in influential partners as before. So if we can get Google to use Wayland, Canonical (and other companies, even game developers) plus the Linux users to place the necessary pressure on NVIDIA / FGLRX, they will start providing the support.

    Like they say just “SHOW EM THE ROAD TO THE MONEY!”.

    Linux users are voting with their WALLETS now, and that is why we are getting games, not just because all the developers are good Samaritans. And more high end games also means more business for NVIDIA / ATI.

    We want Linux to really become a viable replacement (not just another “third class alternative”) for windows. We build the future together!

    1. > Clearly, they need the binary blobs. The open drivers are not (and may never be a viable solution)

      > 2- linux / Ubuntu is becoming a “game friendly platform” (lots of indies, kickstarters, soon Steam/source, unity3d and probably EA). So again the “open drivers” are not a viable solution.

      I’m curious as to why people think this. Significant work has already been done in improving the shader compiler infrastructure so that we can finally have optimized shaders. Game performance on nouveau for me is already quite fast – perhaps not as fast as the binary driver, but optimization is a matter of time, not necessarily of architectural roadmap.

      I think a large part of the problem here is the fact that DRI driver development is primarily engaged with the desktop usecase (eg, compositing) because this is the only usecase that engages with them. It really breaks my heart to see that game developers blacklist usage of the DRI drivers on the assumption that “they will never be good enough”. If game developers refocused on supporting the DRI drivers that we have now, that will inevitably change the course of development of those drivers. (Resourcing is also a problem here, however as I’ve outlined earlier, it is important to be careful with the assumption that more developers == more man months == better code).

      > 3- You mentioned wayland, etc. : , “Already we see that NVIDIA and FGLRX are not going to support EGL, KMS or Wayland…”

      > That doesn’t mean it *Wont* happen. How can you be 100% certain of that? Linux is not as small in influential partners as before. So if we can get Google to use Wayland, Canonical (and other companies, even game developers) plus the Linux users to place the necessary pressure on NVIDIA / FGLRX, they will start providing the support.

      My experience is that “enterprise pressure” really doesn’t work. It is a very big fallacy amongst the entire sector (and not just graphics drivers) that if big corporation A leans on big corporation B then B is going to do A’s bidding. Having been on both the “leaning” and “leaned on” sides of things I can assure you that this isn’t true. I’ve asked the developers of some of the drivers with broken behaviour to fix it (and the fix, I would estimate, is trivial) and they still haven’t done it. Likewise, I’ve seen instances where organizations I’ve worked for have been leaned on to do things and it was just impossible to find a part of our capacity to fulfill that.

      Supporting EGL and KMS is not trivial. The turnaround time in big organizations from when something goes from code to reality is very long because of the QA process involved. EGL support is certainly viable – NVIDIA already supports this on Android with their Tegra chipsets, however they have to bring an entirely different EGL stack to Linux on drivers that were written with GLX’s design in mind.

      Support for in-kernel modesetting is even hairier. It isn’t necessarily a prerequisite for Wayland support, but it is an existing stack that would be very nice to tie into. The options for NVIDIA and AMD are going to be either fighting the legal troubles with making their drivers work with KMS, or reimplementing KMS and GBM themselves and then pushing for a vendor neutral API with different implementations (sound familiar?).

      It’s not that it’s never going to happen, but rather that by the time it happens on NVIDIA and AMD’s cadence, it will already be irrelevant. It took more than five years from when XRandR 1.2/1.3 was finalized to when NVIDIA was able to support it.

      By holding ourselves to the lowest common denominator that external organizations can provide, we are holding the progress of the Linux desktop back, and Linux will never be a viable option compared to Windows, Mac OS X / iOS and Android.

  10. I am currently running an ATI card with FGLRX. I wouldn’t mind running the radeon driver, but I need the opencl support for bitcoin mining. AFAIK, the support is almost there in the open-source version so if the performance is roughly on par I’ll make the switch.

    I only buy ATI because (on top of the performance in mining) I remember ATI giving lots of docs to free software developers a while back, making it easier to write good quality open-source drivers. Is this still the case? Gotta support open-source companies if you want open-source drivers.

  11. So in essence, it’s the exact same problem as with web standards and browsers. Especially IE.
    It’s hacks all the way down in web dev. And when we get a chance to get to a mentally healthy level, with XHTML, full CSS 2.1 support, and proper DOM2, along come Chrome and WhatWG, and add idiotic shit like an unstable (in all meanings) webGL API and tons of other half-baked shit that only works in certain browsers, and only partially.

    Then again, this shows that it’s not just a closed-source problem. It’s just worse with closed source.

  12. Just a VERY IMPORTANT tip on how to deal with such bugs:
    DO NOT EVER write workarounds for things where the API doesn’t act as documented.
    If the driver fails at acting out your commands, LET IT FAIL!
    And make it extremely clear to the end user, that it’s the driver, and that they should GTFO of that driver.

    In essence, by adding hacks, you’re acting as an enabler. You hide the driver’s failures, thereby weakening your own argument, and letting them get away with it!

    Have a spine! Say NO. If somebody complains about bugs that are known to be the driver’s fault, *tell them*. If they continue to complain like idiots, tell them to GTFO. Never care what the loud idiots say. They are called idiots for a reason.

    Otherwise you’re in for a world of hurt and pain. I can tell you that based on a decade of web development experience. There is no way around it. Face the shitstorm now, or suffer for the rest of your life.

    So in that aspect, your post is exactly right. We have to end this insanity.

  13. WOW I was wondering what happened to you! I never saw another post from you! I was starting to get worried… True, you are totally right! But I can’t even do hybrid graphics mode switching with ati (radeon) and intel. By the way, I think radeon is working very well on my laptop but it definitely needs more power management. We need to support open source graphics more! Good to see you active again!

    Best regards,

    Celso

  14. I flipped from proprietary nVidia to Nouveau-git for a bit a while ago and was actually really impressed, but it still wasn’t stable enough, still wasn’t fast enough. It was better than I remember it being, but I do a lot of gaming and graphics development, and Nouveau just didn’t cut it there for me, it just didn’t have enough extension support, just sucked too much power, just didn’t provide fast enough Acceleration.

    I would love to switch, because of these reasons and others, but it’s not viable today, maybe in a year or two.

    BTW, fantastic work on Compiz for Ubuntu 12.04, really slick and solid.

  15. So I guess while everyone is “waiting” for NV and ATI to get their act together; probably China will come along and -make- a graphics card that actually is designed to work on bsd/linux systems and have their little security hook in it besides. Ah, can’t beat old American greed to mess up things; or better yet, our ability to trust. Compared to 20 yrs ago, looks like slowly but surely, maybe they are actually zeroing in on making it just too hard to be on linux/bsd at all. Read an article the other day about ATI drivers and bsd. Pitiful. Considering MS is wanting to put their “brand” on everything for ‘security’ reasons…I can quite easily see everything going that way and leaving linux/bsd in the dust. No card, too bad so sad.

  16. Well… this is kind of disappointing.

    I went from a generally rock-solid experience on 10.10 / 11.04 to something that could best be described as an absolute trainwreck on 12.04. I guess the NVIDIA drivers shoulder some of the blame if what I’m reading here is correct… but in the end I can’t help but feel anything other than extreme discouragement over the state of the Linux Desktop.

    For some of us, nouveau is pretty much a non-starter. If there’s going to be a disengagement from proprietary drivers in compiz then I guess those of us who use them will be forced out and have to limit our options. I can’t say that we’re entitled to support from a free project after all.

    The folks at Valve are going to be in for a world of surprises I do believe.

  17. For some reason gmail was sending notifications for this blog to my spam folder :\ but now it is working again. I am glad to see you take up the blog again, Sam. It was always nice to know where compiz was at, along with interesting bits of information and your perspectives. Glad to hear things are better.

    As far as Nvidia goes, I agree the blobs do need to be phased out, but I think right now is premature for that to realistically happen, for a variety of reasons; everything from nouveau not being very performant, lack of support for many features, poor power management, etc, etc… Is Nvidia performance really that bad in Ubuntu?? I don’t find it to be bad on vanilla Compiz + Arch.

    Maybe canonical/ubuntu community could put some pressure on Nvidia to fix some of the workarounds, in their driver. – this would require a little organization (guide/wiki) , as many users wouldn’t know how to report the bug properly / which bug they should be verifying (more specifically). Nvidia does provide a bug reporting tool – so it wouldn’t be that hard for even a newbie to collect that info. ~ i sort of got this idea from the maintainer of FSThost, right in his readme he has a link to a wine bug report that affects VSTs – to add yourself to, if you run into the same issue. and he also provides a workaround for now… smart stuff :)

    obviously, it’s not a solution but i’m guessing nvidia might fix some of these issues with enough noise, especially given the PR mess they have had recently.

    I know a few people who have reported bugs and gotten them fixed. Recently an RT related bug was fixed too, which i had previously been affected by, briefly.

    I think in maybe 1-2 years Nouveau might be viable for most people, but it may still be the case (even then) that others require nvidia because of more GFX intense apps, OpenCL/CUDA, etc.

    I think realistically no blobs could still be 4-5years away, (or more) if AMD/Nvidia decide to support EGL, Wayland, etc…

  18. Hi, I’m curious.
    Did things get better after Valve started to work on Steam? I’m talking about the need to write hacks to make the drivers work correctly. Bye

  19. The latest nvidia 310.19 driver is a significant improvement for me in terms of animation speeds, especially when using unity; the hide/unhide animations and the minimize animations are finally at a state that gives me an option to replace Windows. There are still the odd visual bugs like glitches when resuming and the blank screen on maximize bug, but it’s nearly there.

    I agree that the open source nouveau driver is too slow to be of use.

  20. Hmm is anyone else encountering problems with the images on this blog loading?
    I’m trying to find out if its a problem on my end or if it’s the blog.
    Any feed-back would be greatly appreciated.
