Performance and Framerate

At this point you’ve got a giant map chock full of crazy designs and clever enemy setups, the whole thing peaks at 12fps, and you don’t know why. Now what?

In Quake 3, keeping map performance up was pretty simple: if r_speeds was too high, cut back on the detail or find out what’s wrong with vis, maybe a clever hint brush here or there, and that’s it. The Doom engine is much more complicated, and with the myriad of new toys it gives developers comes a dazzling new array of ways to make the game run slow if you’re not careful. The trick comes in identifying which of these new tools is causing the problem. This page provides an overview of how framerate is impacted by lighting, shadows, draw calls, portalling, game frame time, and memory limits.

New Console Commands

For Quake4 we added a number of new debugging tools to help the wayward designer identify specific issues in low framerate areas.

com_limits – set to 1 to enable. This is a simple catch-all we put in as a way of quickly picking out trouble spots. It’s hard-coded with the limits we used on Quake4 for triangle count, sound memory, texture memory, and active AI. With com_limits on, as you run through the game a box will appear on the screen whenever a certain limit is exceeded, showing you how far over the limit you’ve gone in the spot you’re standing.



g_showdebughud – values range from 1 to 11. This will replace the HUD with specific readouts about sound, networking, physics, and so on, depending on the value you set. (Note: You will need to download the DebugHud available on this site, since the debug guis weren’t included with the shipped game.) The value we’re concerned with is 5.



On the left is a readout of timing information in milliseconds.



These three meters are indispensable for determining what is causing low framerates, to help you avoid having to cut detail or lights if your problem is caused by something like the physics engine.


On the right is a readout of render data being sent to the card.

Lighting

Lighting is the major statistic to keep an eye on, which many are already familiar with from Doom3 mapping. For the uninitiated, the Doom3 engine makes several separate render passes for each light source. This means geometry is redrawn for every light volume touching it. Therefore, an ideal way to maximize performance is to get the brightest light on a surface with as few overlapping volumes as possible.
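To make the cost of overlapping volumes concrete, here is a rough sketch in Python. It is not engine code: it simplifies the renderer to one redraw of a surface per light volume touching it, and the function name and the room numbers are made up for illustration.

```python
def triangles_drawn(surfaces):
    """Return the total triangles submitted across all light passes.

    `surfaces` is a list of (triangle_count, lights_touching) pairs.
    Simplifying assumption: each surface is redrawn once per light
    volume that touches it.
    """
    return sum(tris * lights for tris, lights in surfaces)


# Hypothetical room: a 200-triangle floor touched by 2 lights and a
# 300-triangle wall touched by 5 lights.
room = [(200, 2), (300, 5)]
print(triangles_drawn(room))  # 200*2 + 300*5 = 1900
```

The same 500 triangles lit by a single volume each would cost 500 in this model, which is why fewer, brighter volumes tend to win.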

To see how many passes are being made and where, set r_showlightcount to 1.



Colored regions indicate the number of passes. Red is 1, green is 2, blue is 3, cyan is 4, magenta is 5, and white is 6+. If you’re seeing a lot of white, you need to relight that area.

(You can, technically, use r_lightcount in Radiant by typing it into the editor console. Don’t expect it to be accurate, because it’s only lighting against brushes and not compiled/portalled geometry, but it can give you a very rough sense of how you’re doing if you don’t want to wait for a compile.)

There are a couple of ways to stay as close to red and green as possible. The first, and the one that’ll be the most useful to you, is to simply be reserved with your lighting. Always be aware of which of your light volumes overlap as you’re lighting a room. Don’t be afraid to light until the room looks good and is as bright as you want it, but don’t go overboard, and check r_showlightcount often.

As you move around your map with lightcount on, you’ll notice that sometimes the borders between colors will match your geometry, and sometimes you’ll get straight horizontal and vertical lines that slide around as you move. This is caused by the way light volumes are scissored on screen.

A light will never create overdraw on a face it doesn’t intersect, but if a light volume touches a face that extends much farther than the volume itself, the overdraw is limited by simply cropping it to a box the size of the volume’s screen bounds (the same way visportal scissoring works – for more info on this see the related page in the doom3 section of iddevnet). A good way to see light volumes in game is to set r_showLights to 2 or 3, which will draw a translucent box around all light volumes. It can sometimes make it difficult to see much of use through the “fog” of light boxes, but moving around for a better angle usually helps. Shadowcasting lights show up as blue, and non-shadowcasting lights show up as red.

Quickly load up one of the MP maps and find an area with a row of small lights. With lightcount enabled, you should be able to look at the row of lights from an extreme angle and make their screen bounds overlap, thus producing a block of white even though the volumes don’t overlap in space, and the geometry isn’t touched by more than one of those little lights at a time. This happens because the scissoring is done around the bounds of the volume, not the affected geometry. This will happen as well with larger lights, and can be easily reproduced by standing far away from a light volume that extends very far into the floor. The light’s scissor bounds will extend all the way down to where the bottom of the light volume would be, producing a block of overdraw on the entire surface of the floor coming towards you that isn’t actually being lit. You can view these scissor outlines in game by setting r_showLightScissors to 1.
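The row-of-lights case can be sketched as a count of overlapping screen rectangles. This is a simplified model in Python, not the engine's implementation: it assumes one large face touched by every light, so each light's pass covers its whole scissor rectangle. The Rect type, the pixel coordinates, and the "none" entry for zero passes are all hypothetical; the other colors follow the r_showlightcount legend described earlier.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned screen rectangle in pixels (x0 <= x1, y0 <= y1)."""
    x0: int
    y0: int
    x1: int
    y1: int

    def contains(self, x, y):
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


# Legend from the lightcount display; 6 or more passes show as white.
COLORS = {0: "none", 1: "red", 2: "green", 3: "blue", 4: "cyan", 5: "magenta"}


def passes_at(scissors, x, y):
    """Count how many light scissor rectangles cover pixel (x, y)."""
    return sum(1 for rect in scissors if rect.contains(x, y))


def lightcount_color(passes):
    return COLORS.get(passes, "white")


# Six small lights viewed from an extreme angle: their screen bounds
# are nearly stacked on top of each other, offset by 4 pixels each.
row = [Rect(100 + 4 * i, 200, 160 + 4 * i, 260) for i in range(6)]

print(lightcount_color(passes_at(row, 130, 230)))  # all six overlap: white
print(lightcount_color(passes_at(row, 102, 230)))  # only the first: red
```

Viewed head-on, the same six rectangles would sit side by side and no pixel would be covered more than once or twice.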

To minimize overdraw in these cases, you can split your brushes along the edges of light volumes, to ensure that the faces those volumes touch are not much bigger than the light volume itself. To keep the compiler from recombining these faces you can either separate them with architectural detail like thin strips of trim, shift the textures on alternating brushes by some unnoticeable amount (like 0.125 units), or force a split with visportals (not recommended). These will all of course add to your tri count, as well as the number of draws you’re pushing, so it’s up to you to look on a scene-by-scene basis to see which light volumes you can trim brushwork around and which ones you’re better off leaving alone.

The reverse is likewise true. When placing lights to begin with it’s a good idea to stick to a larger grid size and make the edges of the light volume match your brushwork wherever possible.

Another way to keep lightcounts low is to only allow yourself an extra pass if you’re going to get a significant amount of illumination on the affected surfaces. Almost all the light shaders in Doom3 and Q4 have a falloff produced by the images they use, meaning they’re at their brightest at the center of the volume and drop off to zero at or before the edge of the volume.

If you have a large light volume that’s only pushing 8 or 16 units deep into a wall, you’re probably getting little to no visible light on that surface. In these cases you would be better off shrinking the volume just enough that it’s tangent to the wall without clipping through (so that the pink line is just visible z-fighting with the surface).
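A quick back-of-the-envelope check shows why. The sketch below assumes a purely linear falloff from full brightness at the center to zero at the edge of the volume. Real light shaders get their falloff from images and may reach zero before the edge, so treat these numbers as an optimistic upper bound. The function names and the 256-unit radius are hypothetical.

```python
def linear_falloff(distance_from_center, radius):
    """Brightness in [0, 1], assuming linear falloff to the volume edge."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    return max(0.0, 1.0 - distance_from_center / radius)


def brightness_on_wall(radius, depth_into_wall):
    """Brightness where the wall surface cuts the volume.

    The wall sits `depth_into_wall` units inside the edge of the volume,
    i.e. `radius - depth_into_wall` units from the light's center.
    """
    return linear_falloff(radius - depth_into_wall, radius)


for depth in (8, 16, 64):
    print(depth, brightness_on_wall(256, depth))
# 8  -> 0.03125
# 16 -> 0.0625
# 64 -> 0.25
```

Under this assumption, a 256-unit volume reaching 16 units into a wall puts about 6% of its peak brightness on that wall while still costing a full extra pass.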

Pick the whitest light shaders you can visually get away with. If you need a full volume of light, like for sunlight coming through a large ceiling opening, rav_square_bevel will give you a big bright volume that only darkens near the very edges. Falloff works vertically as well, so if your floors and ceilings seem unusually dark try stretching the light volumes vertically to make them taller, or switch to a shader like rav_spot_long or rav_spot_nofall, which are intentionally “thick” on the vertical axis.

Quake4 has a lot of the aforementioned small light volumes, which we nicknamed chiclets. They’re a great way to add some color and highlights to a scene, they’re evocative of Quake2, and they don’t produce a lot of overdraw. They do, however, add to the list of light volumes the game engine has to run through to calculate interactions, so lots of them do eventually take their toll regardless of size. Higher end systems have no trouble here, but systems closer to the minimum spec will start to choke. Since these lights are more for effect than illumination, we added the detailLevel keyvalue as a way of instructing the renderer to skip less “important” lights.

A light’s detailLevel ranges from 0 to 10, and works with an accompanying cvar: r_lightDetailLevel. The engine will render all lights with detailLevel keyvalues set greater than r_lightDetailLevel. detailLevel defaults to 10, and r_lightDetailLevel defaults to 0 on all video architecture except NV20, where it defaults to 9. To help performance on the NV20, all of our maps have the non-crucial chiclet lights set to detailLevel 5 (which we picked arbitrarily because it’s between 0 and 9). If a light has an attached model, that model will still draw with the same color value as the light regardless of detailLevel, so even if the small pool of light is lost you’ll still at least get the glowing fixture/flare for color.
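The comparison described above can be written out as a small Python sketch. It mirrors the rule as stated in the text (render when detailLevel is greater than r_lightDetailLevel); the function name is hypothetical and this is not the engine's actual code.

```python
def light_is_rendered(detail_level=10, r_light_detail_level=0):
    """Apply the rule as described: draw the light only if its
    detailLevel keyvalue is greater than the r_lightDetailLevel cvar."""
    return detail_level > r_light_detail_level


# A chiclet tagged detailLevel 5:
print(light_is_rendered(5, r_light_detail_level=0))   # default cvar: True
print(light_is_rendered(5, r_light_detail_level=9))   # NV20 default: False

# An untagged light keeps the default detailLevel of 10:
print(light_is_rendered(10, r_light_detail_level=9))  # True
```

Any value from 1 to 9 would behave the same as 5 under these two cvar defaults, which matches the note that 5 was an arbitrary pick.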

In theory this allows you to rank all your lights on a scale of 0 to 10, allowing the user to set the cvar according to taste/system performance, but we never utilized it to this degree. Instead, once you’ve finished your map, just run through and set detailLevel 5 on any lights the player can still see well without. This is especially crucial in multiplayer maps, where performance is key – you don’t want to give players with higher end systems an advantage by providing them with more illumination.

One last console command that may prove useful is r_singleLight. This will limit the render to only one light in the map, specified by the value set for this cvar. Unfortunately there's no easy way to find out which light has which number, so the only way to cycle to the light you want is to try one number at a time. (Some MP aficionados are under the impression that setting this cvar to a certain secret number depending on the map will enable vertex lighting in multiplayer, which is not the case. All they've done is figure out which number corresponds to the ambient pass in each map that has one.)

Shadows

Part of the game frontend’s responsibility in setting up a render is computation of shadow volumes. When geometry casts a shadow, that shadow is handled as new triangles added to the scene.

When assessing a performance trouble spot, after checking lightcount, turn shadows off at the console by setting r_shadows to 0. You’ll see framerate increase no matter what, but if you’re seeing an unusually significant gain in performance it’s a fair bet the shadows in that scene are contributing to the slowdown.

To give you a better idea of what shadows are being cast where, enable r_showShadows.

There are a few simple ways to reduce shadowing in a scene. The first is to find all the lights in the scene that you can get away with making non-shadowcasting, and disable shadows on them. Chiclets are a very good place to start. You’ll want to make sure characters still have at least one shadow wherever they can go, but if you’ve got several fill lights in one area, taking shadows off one or two won’t be visually apparent. If you can’t easily reduce shadows by light, you can try doing it by object instead. If your scene has several func_static models in it, set noshadows wherever they won’t be missed.

This will help slice off a lot of shadow computation and extra triangles right off the bat. Another thing to watch out for in your scenery’s shadows is shadow complexity. The nickname for this at Raven was "the jailbar effect." If you have a setup where a lot of fine detail, like railings or ladders, is casting shadows across a long distance or onto a lot of complex geometry, even if you’re not adding a lot of tris to the scene the frontend has to do a lot more math intersecting these shadow volumes with geometry (characters included).

This also applies to shadows being cast at oblique angles. It’s a rare occurrence, but if you have a large light with the origin dragged far out to one side, any character in that volume will cast a shadow way down the long axis of the light. When it comes to shadows, short and simple is best.

Draws & Batch Size

Every new generation of video cards that hits the market is able to render more and more triangles per second than the generation before. The way cards are able to gain so much speed is by rendering them in parallel, taking batches of polygons and running through multiple batches at a time. Each batch, or draw, has a slight penalty in overhead, meaning that the same number of polygons can be rendered much faster in fewer, larger batches than it can in many small ones. It is similar to the difference between city and highway mileage: you’ll get much more out of your fuel by going long distances in fifth gear than you will by driving stop and go.

With the number of polygons Quake4 puts to the screen, each scene would ideally be rendered in at most 300 batches of at least 500 triangles each. Without proper precautions on the part of the designer, however, the Doom3 engine will stray very quickly towards many small batches.
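To see how quickly the draw count climbs as batches shrink, here is a small arithmetic sketch in Python. The 150,000-triangle scene is simply 300 × 500 taken from the ideal above and used as an illustrative figure; the function name and the smaller batch sizes are hypothetical.

```python
import math


def batches_needed(total_tris, avg_batch_size):
    """Number of draws needed to submit `total_tris` at a given average size."""
    return math.ceil(total_tris / avg_batch_size)


scene_tris = 300 * 500  # 150,000 triangles, the ideal case from the text

for size in (500, 100, 20):
    print(size, batches_needed(scene_tris, size))
# 500 -> 300
# 100 -> 1500
# 20  -> 7500
```

The triangle count never changes here; only the per-batch overhead is paid more often.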

The engine will split the polygons it sends to the video driver per texture, per light, per entity, and per portal area. That means that for every light volume in the scene, a batch is sent for every group of polygons sharing a texture affected by that light volume. If the same texture appears on brushes in the world and on a func_static, even within one light volume the func_static will go to the renderer in separate batches. If a func_static with the same model keyvalue is repeated sixteen times down a hallway, each one will batch separately from the other fifteen. If you have a long, highly subdivided patch mesh with four or five chiclet lights spaced out along the curve, the curve will be split into small batches for each light.
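The splitting rule above amounts to grouping triangles by a (texture, light, entity, portal area) key. The following Python sketch models that grouping under the rules as described in the text; the data layout, names, and triangle counts are hypothetical, and the real renderer is certainly more involved.

```python
from collections import defaultdict


def batch_sizes(surfaces):
    """Group surfaces into batches keyed by (texture, light, entity, area).

    Each surface is a dict with keys: texture, entity, area, tris, lights.
    A surface contributes to one batch per light volume touching it.
    """
    batches = defaultdict(int)
    for s in surfaces:
        for light in s["lights"]:
            key = (s["texture"], light, s["entity"], s["area"])
            batches[key] += s["tris"]
    return dict(batches)


# Hypothetical hallway: two world brushes plus sixteen copies of the same
# func_static model, all sharing one texture, one light, one portal area.
hallway = [
    {"texture": "metal", "entity": "world", "area": 0, "tris": 40, "lights": ["L1"]},
    {"texture": "metal", "entity": "world", "area": 0, "tris": 60, "lights": ["L1"]},
]
hallway += [
    {"texture": "metal", "entity": f"func_static_{i}", "area": 0,
     "tris": 24, "lights": ["L1"]}
    for i in range(16)
]

result = batch_sizes(hallway)
print(len(result))                         # 17 batches: 1 world + 16 entities
print(result[("metal", "L1", "world", 0)])  # the two world brushes merge: 100
```

Adding a second light that touches everything would double the count to 34 in this model.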

Furthermore, all effects batch separately from each other, even those with the same .fx, and every stage in the effect goes as its own batch. GUIs also batch separately from each other and, obeying the same laws that apply to textures in the world, every windowDef in a GUI with its own image on it goes as its own batch of two polygons.

As you can see, draws add up fast.

The ideal 300 batches of 500 quickly becomes wishful thinking, and if you tool around in Quake4 with debugHud 5 you’ll notice we usually didn’t even come close. Quake style level design usually means mid-sized rooms with interesting shapes and designs, built from a varying group of textures, lit by many small- and mid-sized lights. What video cards want from Quake4 is spaces with only a few textures and a couple of big giant light volumes covering everything.

They can, however, handle much worse, and depending on video architecture you only need to really worry if you’re pushing into the quadruple digits. Thus, if your draws hover around 1000 once the shooting starts, we’d say you were still doing well.

Spotting areas with too many draws is simple enough, but to identify where in that scene all the draws are coming from we added r_showBatchSize. It works just like r_showtris, with the same effects for values of 1, 2, and 3, but the outlines will be colored based on the size of the smallest batch they’re in. It scales from pink (batch size less than ten, meaning bad) through red, orange, yellow, and stops at green (batch size greater than 500, meaning good).



This can be a hard display to read, the main problem being that lots of pink isn’t always bad and green isn’t always good. You can have a scene with a lot of pink and red batching, but if your draws are only in the 400 range you won’t really suffer for it. On the other hand, if your scene is full of decals they’ll batch individually (one for each blood splat, say), but if you crank up the subdivisions on each decal they’ll jump up to a few hundred triangles each and voila, they’re green, which doesn’t actually solve the problem.

Another way to find small batches is to set r_limitBatchSize. This will instruct the renderer to only draw batches of a size greater than whatever value you set this to. Set it to 100 and see how much of your map disappears.
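As a sketch of what that filter does, the Python below keeps only batches larger than the limit and reports how much of the scene's triangle count would vanish. The list of batch sizes is invented for illustration, and the function names are hypothetical.

```python
def surviving_batches(batch_sizes, limit):
    """Batches still drawn: only those strictly larger than `limit`."""
    return [size for size in batch_sizes if size > limit]


def fraction_hidden(batch_sizes, limit):
    """Share of the scene's triangles that disappear at this limit."""
    total = sum(batch_sizes)
    if total == 0:
        return 0.0
    return (total - sum(surviving_batches(batch_sizes, limit))) / total


# Hypothetical scene: a few decals and trim pieces plus two large batches.
sizes = [2, 2, 8, 40, 90, 600, 1200]

print(surviving_batches(sizes, 100))          # [600, 1200]
print(round(fraction_hidden(sizes, 100), 3))  # 142 of 1942 triangles hidden
```

In this made-up scene five of the seven draws disappear while only about 7% of the triangles go with them, which is the kind of imbalance the cvar is meant to expose.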

There’s no simple answer to a scene that batches poorly, but once you become more familiar with how it works you’ll learn what steps to take to optimize a scene for it.

This is a list of many of the common draw-reducing solutions we used on Quake4. Often, getting your draw count down will mean making some sacrifices and cutting certain things back, but with enough caution you can sometimes bring your numbers down without sacrificing the visuals you’ve created.

Visportals

A very good explanation of visportals is already available here: http://www.iddevnet.com/doom3/visportals.php

There's only a short list of things we would add to this:

Game Frame & the CPU

All of the above covers renderer related slowdowns, but those aren't the only source of low framerates. The Doom3 renderer is very CPU-driven, such that any significant delay in the game frame time will set the renderer back as well. If debugHud 5 reveals a high game frame time and a short render, it's time to pay attention to render-independent things in your scene.
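The triage described here boils down to comparing two timings. The Python sketch below is only an illustration: the function names are hypothetical, and the inputs are assumed to be game frame and render times in milliseconds as read off the debug HUD.

```python
def frame_ms(fps):
    """Total time available per frame, in milliseconds, at a given framerate."""
    return 1000.0 / fps


def likely_bottleneck(game_ms, render_ms):
    """Crude triage: whichever side takes longer is where to look first."""
    return "game frame" if game_ms > render_ms else "renderer"


print(round(frame_ms(12), 1))                        # a 12fps scene: 83.3 ms per frame
print(likely_bottleneck(game_ms=30.0, render_ms=8.0))   # game frame
print(likely_bottleneck(game_ms=4.0, render_ms=25.0))   # renderer
```

If the answer is "game frame", cutting lights or detail is unlikely to buy much back.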

Things that can delay the game frame:

Memory Limits

The last major performance point to keep in mind may not affect framerates, but it has a definite impact on load times: the size your map takes up in memory.

Hit the console and type printmeminfo. This will give you a rundown on how much space in memory various bits of your map require. These numbers were regularly compared against our own in-house limits, and in some cases designers went to a great deal of trouble to keep the maps' memory footprints in check.

Tradeoffs

This document is by no means meant to give you the impression that the only thing you can do without killing framerates in Quake4 is a single textured room with no lights. The game does have to spend time doing something, and the purpose of all of the above is to make you familiar with the kinds of things to be wary of. With enough expertise you'll learn to address all of these issues in a scene at once, so that if you want to try something like a cool jailbar effect, you'll know what tradeoffs you can make to make it work.

LevelEditor Performance (last edited 2005-11-16 23:02:45 by AndrewWeldon)