I’ve used the term “unified” rendering pipeline a few times throughout the website so I figured I should write a little about it here. I’ve had quite a while to work with this type of architecture on Zombpocalypse and I hope to pass on a little of that knowledge.
History Lesson
Unified rendering pipelines aren’t exactly a new thing. In fact, they’ve been around for a very long time. Before shaders existed, fixed-function pipelines were the standard way to rasterize triangles. We didn’t have the option of animating characters on the GPU or performing complicated vertex and fragment operations. We weren’t even able to create smooth skinned characters and had to resort to articulated animations. This meant that the final geometry sent to the game’s rendering manager was often in world space before any part of the rendering code got access to it. Some crafty matrix stack operations could be used to avoid vertex transformations on the CPU, but this was fairly primitive and didn’t give nearly the depth of control that we get from today’s shading pipeline.
When programmable vertex shaders came onto the scene, one of the first useful things shown was how we could animate and render smooth skinned characters with GPU assistance. This was a huge advantage since the trend around this time (late ’90s or early 2000s, maybe) was to have a PC with a modest CPU and a goliath video card. There was an imbalance of power at the time: GPUs had horsepower to spare while CPUs were suffocating under the rise of complex AI and physics in games. Rendering engines at this time were still very simple; characters were rendered based on a lighting approximation using just an ambient term and a directional vector, and the environments were still heavily dependent on light maps. Because geometry only needed to be rendered once per frame, dynamically transforming a character model on a clearly superior piece of hardware wasn’t even up for debate. The disadvantages of this approach started to creep up in the newer generation of graphics, where every pixel on the screen is the result of a series of complex operations.
Traditional Pipeline on Modern Engines
What I am calling the “traditional” pipeline is one that involves maximizing the GPU at all costs and minimizing CPU intervention within the rendering system. This approach often requires an array of unique shaders that explode into countless permutations based on their input parameters. The idea here is to have a shader that perfectly maps to your needs for each surface type (or material) you have in your scene. In doing this, you avoid even one extra clock cycle by performing the exact number of operations required to properly render a surface. Now, just because something is rendered properly does not mean it is rendered accurately. Traditional pipelines are designed for speed, not accuracy; they often still resort to light maps to render the environment and voxel-based lighting approximations for characters. As a result, characters can sometimes appear too dark or too bright in certain circumstances unless a strong ambient term is used. Unfortunately, a strong ambient term will wash out the colors in the scene, so it’s best avoided. Lighting is actually not the straw that breaks this camel’s back; it’s the absence of light, our shadows. Traditional pipelines often rely on dim shadows, another approximation based on the direction of the nearest (or most intense) light source.
Just because this technique uses a hefty number of approximations does not necessarily mean that its final result is going to be ugly. Some of the most popular games of the year still rely on this approach because of its superior ability to render large numbers of characters and highly detailed environments. Here’s a list of a few games you might not have realized still rely on traditional approaches of light maps and/or lighting approximations for dynamic entities.
- Half-Life 2
- Gears of War 1 & 2
- Call of Duty: Modern Warfare
- …the list goes on
Advantages
- Maximizing GPU (on what it was designed to do)
- Render more dense crowds of dynamic objects
- Pre-compute complex lighting for lightning fast environment rendering
- Supported by much more modest PC rigs
Disadvantages
- Visuals are a compilation of multiple lighting approximations, which can cause artifacts
- Multi-pass algorithms require dynamic geometry to be re-animated per pass on the GPU
- Without a distributed solution, pre-computing light maps may take hours
- Light map texture coordinates may cause unnatural breaks in the geometry if not properly mapped
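To put the multi-pass point from the list above in rough numbers, here is a minimal sketch of a cost model that counts only vertex skinning operations per frame. The `FrameCost` struct and both function names are made up for illustration, and the model ignores everything else a pass costs.

```cpp
#include <cstddef>

// Hypothetical cost model: counts vertex skinning operations only.
struct FrameCost {
    std::size_t skinnedVertices;  // vertices across all animated characters
    std::size_t passes;           // e.g. one depth pass per shadow-casting light plus the color pass
};

// GPU skinning: the vertex shader re-skins every vertex in every pass.
constexpr std::size_t gpuSkinningOps(const FrameCost& c) {
    return c.skinnedVertices * c.passes;
}

// CPU skinning into a world-space copy: skinned once, reused by every pass.
constexpr std::size_t cpuSkinningOps(const FrameCost& c) {
    return c.skinnedVertices;
}
```

With 10,000 skinned vertices and 4 passes, this model gives 40,000 skinning operations on the GPU path against 10,000 on the CPU path. Whether that trade is worth it depends on how much spare CPU time you actually have.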
Unified Pipeline on Modern Engines
It is true that the traditional approach of custom shaders for every type of object you are rendering is still in use today. It may well run much faster than a unified pipeline. The goal of a unified pipeline, however, is not necessarily speed but accuracy. When an object is rendered with this approach, it is sent down a single code path. This means that every triangle in a visible scene receives identical treatment, including both the lighting and shadowing of that triangle. The end result is an image of near-perfect quality, down to every pixel on the screen. Characters will move seamlessly in and out of shadows and will be accurately lit, even if the character is only partially in cover. The cost of this accuracy is the burden placed on the CPU, which is responsible for preparing the data to be sent down this single code path. Characters need to be transformed to their world space representation, which can be costly for multi-weighted skinned meshes. One of the primary benefits of this approach becomes apparent when you want to render shadows. If you want to render shadow maps, you have nice clean world space geometry; there is no need to re-animate a character on the GPU using a special shader just to obtain its depth information. If you want to render shadows using stencil volumes, you have a single code path that is designed to accept a world space object and produce volumes from that object. This same code is used for both environment geometry and dynamic characters.

Entities are animated on the CPU and treated no differently than environment geometry.
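As a rough illustration of the CPU-side step described above, here is a minimal sketch of linear blend skinning that writes vertices straight into world space. All of the types and names (`Vec3`, `Mat3x4`, `SkinnedVertex`, `skinToWorld`) are hypothetical, and the sketch assumes each bone matrix already combines the inverse bind pose, the animated pose, and the entity's model-to-world transform.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not taken from any particular engine.
struct Vec3 { float x, y, z; };

// Row-major 3x4 affine matrix: rotation/scale in the left 3x3, translation in column 3.
struct Mat3x4 {
    float m[3][4];
    Vec3 transformPoint(const Vec3& p) const {
        return {
            m[0][0] * p.x + m[0][1] * p.y + m[0][2] * p.z + m[0][3],
            m[1][0] * p.x + m[1][1] * p.y + m[1][2] * p.z + m[1][3],
            m[2][0] * p.x + m[2][1] * p.y + m[2][2] * p.z + m[2][3],
        };
    }
};

constexpr std::size_t kMaxInfluences = 4;

struct SkinnedVertex {
    Vec3 bindPosition;                         // position in the bind pose
    std::array<int, kMaxInfluences> bone;      // indices into the bone palette
    std::array<float, kMaxInfluences> weight;  // expected to sum to 1
};

// Assumption: bonePalette[i] = modelToWorld * animatedPose[i] * inverseBind[i],
// so the blended result is already in world space.
std::vector<Vec3> skinToWorld(const std::vector<SkinnedVertex>& verts,
                              const std::vector<Mat3x4>& bonePalette) {
    std::vector<Vec3> out;
    out.reserve(verts.size());
    for (const SkinnedVertex& v : verts) {
        Vec3 sum{0.0f, 0.0f, 0.0f};
        for (std::size_t i = 0; i < kMaxInfluences; ++i) {
            if (v.weight[i] == 0.0f) continue;  // skip unused influence slots
            assert(v.bone[i] >= 0 &&
                   static_cast<std::size_t>(v.bone[i]) < bonePalette.size());
            const Vec3 p = bonePalette[v.bone[i]].transformPoint(v.bindPosition);
            sum.x += v.weight[i] * p.x;
            sum.y += v.weight[i] * p.y;
            sum.z += v.weight[i] * p.z;
        }
        out.push_back(sum);
    }
    return out;
}
```

The output is a plain list of world space positions, so shadow map rendering, stencil volume generation, and the lighting passes can all consume it exactly as they would environment geometry.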
Using a unified pipeline is not all gum drops and candy canes; there are some serious setbacks that go beyond the CPU limitations. To perform accurate lighting and shadows, each light must be rendered independently against each surface in the scene. The results are accumulated per pixel, since multiple lights could easily overlap on the same pixels of the screen. In addition to a powerful CPU, you’ll need an even more powerful video card to handle the sheer number of instructions to be executed on each pixel of the screen. Despite this setback I really fell in love with this approach; it’s the closest real-time implementation I’ve seen to an offline rendering system. Sadly, this approach doesn’t appear to have gained much steam; the only games I know of that use it are idTech4-based games. Even idTech5 seems to be reverting to some of the traditional techniques, though not much is known about this tech at the moment.
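The per-pixel accumulation can be sketched on the CPU as below. This is only a toy model under stated assumptions: the `PointLight` struct, the linear falloff, and the plain diffuse term are illustrative choices, not the lighting model of any specific engine, and a real renderer would do this work in fragment shaders with additive blending.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

inline Vec3 operator-(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}
inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hypothetical light description for illustration.
struct PointLight {
    Vec3 position;
    Vec3 color;
    float radius;  // the light has no effect at or beyond this distance
};

// Contribution of one light to one surface point: one "pass" for that pixel.
Vec3 lightContribution(const PointLight& light, const Vec3& point, const Vec3& normal) {
    const Vec3 toLight = light.position - point;
    const float dist = std::sqrt(dot(toLight, toLight));
    if (dist <= 0.0f || dist >= light.radius) return {0.0f, 0.0f, 0.0f};
    const float nDotL = std::max(0.0f, dot(normal, toLight) / dist);  // diffuse term
    const float falloff = 1.0f - dist / light.radius;                 // assumed linear falloff
    const float k = nDotL * falloff;
    return {light.color.x * k, light.color.y * k, light.color.z * k};
}

// Every light is evaluated independently and the results are summed.
Vec3 shadePixel(const std::vector<PointLight>& lights, const Vec3& point, const Vec3& normal) {
    Vec3 sum{0.0f, 0.0f, 0.0f};
    for (const PointLight& light : lights) {
        const Vec3 c = lightContribution(light, point, normal);
        sum.x += c.x;
        sum.y += c.y;
        sum.z += c.z;
    }
    return sum;
}
```

The loop makes the cost visible: the work per pixel grows linearly with the number of lights touching it, which is why overlapping lights get expensive so quickly.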
Advantages
- Lighting is not an approximation; it is accurate down to each screen pixel
- Multi-pass algorithms can be faster since the geometry is animated only once per frame
- Exporting and compiling environments is very fast (no light pre-calculations)
- Does not suffer from unnatural breaks in geometry, meaning a reduction in vertex counts
- Cached models can be kept across multiple frames. Non-animating characters will not re-compute their caches. This means that corpses or other inanimate objects will render as fast as environment geometry until the moment they begin to move around.
- A single shader (vertex & fragment) can be used to render the entire scene uniformly
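The caching advantage in the list above could look something like the following sketch. The `WorldSpaceCache` class and the idea of a `poseVersion` counter that the animation system bumps whenever the pose changes are both assumptions made for illustration.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical per-entity cache of world-space vertices.
class WorldSpaceCache {
public:
    // Returns the cached vertices, calling `rebuild` only when poseVersion
    // differs from the version the cache was last built for.
    const std::vector<Vec3>& get(std::uint64_t poseVersion,
                                 const std::function<std::vector<Vec3>()>& rebuild) {
        if (!valid_ || poseVersion != cachedVersion_) {
            vertices_ = rebuild();  // the expensive CPU skinning step
            cachedVersion_ = poseVersion;
            valid_ = true;
            ++rebuildCount_;
        }
        return vertices_;
    }

    // Exposed only so the behaviour is easy to observe.
    int rebuildCount() const { return rebuildCount_; }

private:
    std::vector<Vec3> vertices_;
    std::uint64_t cachedVersion_ = 0;
    bool valid_ = false;
    int rebuildCount_ = 0;
};
```

A corpse that stops animating keeps the same pose version frame after frame, so `get` hands back the stored vertices without touching the skinning code again. The price is the extra memory for the stored copy.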
Disadvantages
- CPU and GPU share rendering responsibility
- Rendering performance may be limited by an already starved CPU (thankfully multi-core will help)
- More than 2 or 3 overlapping lights means dozens of fragment operations per pixel on screen
- Non-gaming PCs will likely cry for the sweet release of death under the strain of this technique
- Additional memory is required to store cached world-space models for each entity on the map (or at least the visible scene)
Let’s Wrap this Up
So in the end, it really depends on your focus. The traditional pipeline is still the most flexible, even if it is the clumsiest approach, because it gives the developer the option to balance the use of GPU- and CPU-based algorithms. The unified pipeline is often the best choice if you are trying to produce a high quality image with the least code and the fewest shader permutations.
