Vulkan and the "Ideal" Graphics API
Khronos had put out murmurs for quite some time about unifying OpenGL and OpenGL ES into one new API, which until recently was called "glNext". This past GDC, the wraps came off along with a new name -- Vulkan. Presumably, the name is an allusion to Vulcan, the Roman god of fire and the forge, and his devotion to a single tool (his hammer) -- a nicely subtle suggestion that we should all consider Vulkan to be the "one tool" for all graphics work. But let us not forget the old warning about hammers, either: to someone who carries only a hammer, everything ends up looking like nails. Will that hold true in this case as well?
For a long time, I more or less favored OpenGL for all realtime graphics work simply on its strengths: it was simple to set up and work with, all-encompassing in that it included interfaces for managing common operations, had an abundance of available toolkits (e.g. GLUT), and worked on pretty much every platform out there. At the time I held this position, DirectX was still very young and rather clunky on many levels, painful to debug, the extent to which it fell into the pitfalls of COM was rather horrible (imo), and it was ultimately locked to a single platform. Sure, Glide had an interface that pretty well resembled OpenGL, but we all more or less knew that 3dfx was not long for this world. Unfortunately, OpenGL then stagnated for years: Microsoft actively invested in advancing DirectX while OpenGL pretty much stood still until Khronos took the reins. By then, there was a lot of catch-up to play, and Microsoft had gained major clout. As things moved along, GL caught up, but also started to lose a lot of its beauty. What used to be an API completely orthogonal to the working environment became mired in a tangled web of vendor-specific extensions, bifurcated into different versions for certain platform categories, and drifted ever further from unifying the industry. Instead, we saw more graphics APIs show up that meant separate ways of handling the same things on different platforms, different development environments, and different hardware categories. More divergence than ever. We have OpenGL, DirectX, Metal, three.js, WebGL, GL ES, Stage3D... each filling a niche (some larger than others), and never the twain shall meet but for a few exceptions.
Over the years, though, we also saw quite clearly that GPU scaling -- both in features and in raw computing power -- outpaced CPUs by far, and that made the interface to them a limiting factor in efficiency. A lot of game developers especially, who had a strong need to push more, more, more through the pipe, were disappointed by the efficiency gap between the PC and the more dedicated one-off kernel-level layers on consoles. So AMD figured they'd work with a few developers to stir the pot a bit with Mantle. The idea behind Mantle was to simplify, simplify, simplify. The API, and the driver that interfaces through it to the hardware, had to do as little as possible. The less overhead there is in the driver, the less latency there is between the software and the hardware, and in principle, the more efficiently you can utilize the GPU. Likewise, you can issue extremely long and costly command buffers to the GPU without stalling the CPU waiting for queries to come back. This was the model to which a lot of console game developers were already accustomed. Xbox notwithstanding, the Sony platforms all followed a model that fits this pattern: although the PS3 had OpenGL support, the predominant mode of access (once it matured) was the much lower-level LibGCM. Sure enough, this is what Apple's Metal API for iOS devices is doing, and also what DirectX 12 is doing. And while Mantle itself probably is not going to live much longer in its standalone state, it has been folded in as a sort of bootstrap for the current reference implementations of Vulkan. More than anything, AMD gets to claim that the echoes of Mantle have reverberated throughout the industry. How true that actually is remains debatable, but from the end consumer's perspective, it's convincing. Driver and API overhead has been a known issue for some time, and this simplicity-first approach appears to be the direction everybody is taking to solve it.
The flipside of this is that a simpler API layer exposes the complexity of the hardware to developers and leaves it up to them to manage all the trappings the driver would otherwise handle. On the face of it, this may seem like it isn't really helping anything, because we've just moved the load from the driver layer to the application layer. But what changes is simply that app developers get to decide which particular complexities they need to tackle. It also means that a lot of the safeguarding measures that are a given with a higher-level API can be stripped out of a release build. If you actually look at an OpenGL implementation, you'll see not just the work related to issuing commands to the GPU, but a huge load of rule validation, corner-case protections, what-to-do-if-it-fails handling, etc. The driver has to do this work both to meet the language spec and to avoid crashing the GPU. It has to be this way because it must be a catch-all for any application -- it can't make any assumptions about the context in which a routine is being called.
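To make that cost concrete, here's a tiny C++ sketch of the difference between a driver that must validate every call and an application that can compile its checks out of release builds. Everything here is a hypothetical stand-in -- `CommandSink` and `record_draw` are illustrative names, not part of any real API:

```cpp
#include <cassert>

// Hypothetical stand-in for a command recorder -- not any real API.
struct CommandSink {
    int draws_recorded = 0;
};

// A GL-style driver runs checks like these on *every* call, for *every*
// application, because it can't assume the caller is well-behaved. With a
// thin API, the same checks can live in the application (or a debug layer)
// and be compiled out of release builds entirely.
inline void record_draw(CommandSink& sink, int vertex_count, int instance_count) {
#ifndef NDEBUG
    assert(vertex_count > 0 && "draw call with no vertices");
    assert(instance_count > 0 && "draw call with no instances");
#endif
    sink.draws_recorded++;  // release build: just the work, no checking
}
```

The point isn't that the checks disappear -- it's that the application, which knows its own calling context, decides which ones are worth keeping.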
Sure enough, with Vulkan, we expect application complexity to rise. We'll have to start writing our own GPU memory managers, manage state far more explicitly than before, and exercise more specific control over the command buffers. The upside is that we can implement these things exactly the way we want, and optimize them for the application at hand, because we'll have explicit knowledge of the context in which things are running (and can therefore make commensurate assumptions). It also means that because we as application developers have more work to do, we have a better idea of what is happening, and that gives us a more tangible body of information when debugging. It won't hurt turnaround time on driver development, either.
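As a rough illustration of what "writing our own GPU memory manager" can mean at its simplest, here's a C++ sketch of a bump sub-allocator parceling out aligned offsets from one large block. The names and the strategy are my own assumptions, not anything from a spec; a production allocator would also handle freeing, memory types, and mapping:

```cpp
#include <cstddef>
#include <cstdint>

// Minimal bump allocator over one large (imagined) device-memory block.
// Fast and trivial, at the cost of only supporting whole-heap resets --
// a fit for per-frame transient data, which is exactly the kind of policy
// decision a driver can't make for you but an application can.
class BumpAllocator {
public:
    explicit BumpAllocator(std::size_t capacity) : capacity_(capacity) {}

    // Returns the offset of a suballocation, or SIZE_MAX on exhaustion.
    // `alignment` must be a power of two.
    std::size_t alloc(std::size_t size, std::size_t alignment) {
        std::size_t offset = (cursor_ + alignment - 1) & ~(alignment - 1);
        if (offset + size > capacity_) return SIZE_MAX;
        cursor_ = offset + size;
        return offset;
    }

    void reset() { cursor_ = 0; }  // e.g. at end-of-frame

private:
    std::size_t capacity_;
    std::size_t cursor_ = 0;
};
```

A real engine would likely keep several of these with different lifetimes (per-frame, per-level, persistent) rather than one general-purpose heap -- the kind of assumption a driver could never safely make.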
The baseline material presented in the GDC talk leads down the path of a hypothetical setup in which construction of command buffers can happen in parallel, with a separate thread for actually issuing those buffers to the GPU -- much improved over single-threading everything that has anything to do with the GPU. Instead, we can multi-thread the creation of command buffers and issue them from a single (per-GPU) master thread. This is still a little behind DX12, which actually does support multi-threaded submission through multiple command queues; Khronos is currently debating whether they want to have more than one command queue per GPU.
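The shape of that setup can be sketched without any real GPU API at all. In this toy C++ version, everything is a stand-in (a "command" is just a string): worker threads each record into their own list with no shared state, and a single master thread gathers and "submits" them in a fixed order:

```cpp
#include <string>
#include <thread>
#include <vector>

using CommandList = std::vector<std::string>;

// Each worker records its chunk of the scene into its own list, so no
// synchronization is needed during recording.
inline void record_scene_chunk(CommandList& cl, int chunk) {
    cl.push_back("draw:chunk" + std::to_string(chunk));
}

// One submission point per GPU: join the workers, then hand the per-thread
// lists off in a deterministic order (here, just concatenate them).
inline CommandList record_and_submit(int num_threads) {
    std::vector<CommandList> lists(num_threads);
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i)
        workers.emplace_back(record_scene_chunk, std::ref(lists[i]), i);
    for (auto& t : workers) t.join();

    CommandList submitted;
    for (const auto& cl : lists)
        submitted.insert(submitted.end(), cl.begin(), cl.end());
    return submitted;
}
```

Note that submission order stays deterministic even though recording order is not -- the master thread imposes the ordering, which is the property that makes the single-queue model workable.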
All that said, the real reason I'm at all excited for Vulkan is that it brings all of this to a unified API. Direct3D is Windows-only. Metal is Apple-platforms-only. Mantle is AMD-hardware-only. Vulkan is hardware-agnostic, platform-agnostic, language-agnostic (to the extent that bindings for any language could easily be written), and class-agnostic. By ripping away complexity, they've also reduced the API to the lowest common denominator and left the developer to figure out how to work with the hardware's capabilities. In addition to ripping complexity out of the driver, Khronos have also ripped away GLSL. Vulkan consumes shaders in an LLVM-IR-like intermediate representation called SPIR-V, and shaders in any old language (or a new one, if you prefer) need only be compiled to SPIR-V. That means developers can use whatever they like. There's a play here for an easier path to compute shaders as well, because it's the same SPIR-V whether we're talking vertex and fragment shaders or compute shaders. If you can write a SPIR-V compiler for RSL/OSL/MetaSL (or compile to an intermediate language which already has a SPIR-V compiler)... you've got a free path to GPU-accelerated shading, and that's not a bad thing at all.
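One concrete detail that is already public in the preliminary spec: a SPIR-V module is a stream of 32-bit words whose first word is the magic number 0x07230203, which doubles as an endianness marker. A small C++ sketch of the kind of sanity check a toolchain might perform (the function name is mine, not from the spec):

```cpp
#include <cstdint>
#include <vector>

// Per the preliminary SPIR-V spec, a module begins with a five-word header:
// magic number, version, generator, id bound, and a reserved word.
constexpr std::uint32_t kSpirVMagic = 0x07230203u;

inline bool looks_like_spirv(const std::vector<std::uint32_t>& words) {
    if (words.size() < 5) return false;  // too short to hold the header
    // A byte-swapped magic means the module was encoded on a machine of the
    // opposite endianness; a consumer would swap every word before decoding.
    return words[0] == kSpirVMagic || words[0] == 0x03022307u;
}
```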
Well, that's not to say the old ways are dead. OpenGL and GL ES aren't going away any time soon. This model is an easy sell for game developers, and for developers of apps that would strongly benefit from the extra performance. For these people, performance is everything and then some. Game devs, in particular, are even willing to put up with losing cross-compatibility and maintaining multiple versions of multiple codebases for the same project just to push a few percentage points more polygons down the pipe. Vulkan/Mantle/DX12 all present an API where leveraging more parallelism here and there buys a lot more than the trifling gains we saw with DX11 and GL 4.3. At the same time, it puts developers in the driver's seat (pun intended). There are still places where that is not an easy sell. For instance, CAD/CAE developers aren't so interested in performance at the expense of robustness. Sure, you can make a Vulkan-based app extremely robust, but it's now your own responsibility. For these people, reliability is job 1, and OpenGL gives them a certain baseline guarantee precisely because of the extent to which it is weighed down by rule verification and error-checking. But then, Vulkan isn't made for them anyway. Nobody at SDRC is worried about driver overhead... as far as I know, anyway.
Is the Vulkan model really the best answer for GPU-based anything? Well, for the types of things that demand performance first, I'd say yes... though I'll qualify that by saying it's only the best answer we can put forth at this time, given the current climate. It would be nice to reach a point where these bottlenecks could exist as they are and still be insignificant overall, but that can't really happen. The barriers here aren't strictly technical, either; even physics itself is against us. So as long as that's the case, anything that helps keep the GPU better fed is hugely valuable and worth the price. Down the line, I don't think this will necessarily remain the best possible solution. I'm still keeping my fingers crossed for the day that Monte Carlo ray tracing (MCRT) supplants rasterization in the realtime space. It has already happened (or is happening) in the offline CG realm, but when it does happen in realtime, the very idea of the GPU will have to be quite different. For now, having a single API for all the platforms you'd like to use is great in principle, but we'll see how happily the transition goes. All said, I'm still happy to see it. Details are still loose, with only a preliminary SPIR-V spec (in which a lot of vital information is still marked TBD) out in the open and nothing detailed on Vulkan itself. But when those details are out, my sleeves will be rolled up.