Optimizing Games Using Unity Draw Call Management

As a game artist, your job is to make things look good. From the computer's perspective, though, your art is just a list of vertex and texture data. Understanding how the computer actually processes that data helps you build optimized assets and tackle complex ones in a resource-friendly way. Draw calls are one of those inner workings of a game engine. In this article, you will learn what draw calls are, how they work, and why and how to manage them. Because the rendering process is complex, we will focus only on draw calls and skip the stages before and after them.

Introduction

Render state describes how meshes are rendered. It includes information such as the vertex shader, pixel shader, textures, material, lighting and transparency. Every mesh is rendered using whatever render state is currently set, so multiple meshes can be rendered with the same render state (e.g. the same material) as long as you don't change it before rendering the next mesh. Once the render state is set, the draw call is made.

In simple words, a draw call is a command the CPU gives the GPU to render a mesh. The command only points to the mesh; information like material, texture and shader has already been defined through the render state. The mesh resides in VRAM (the GPU's memory) at this point. After the command, the GPU takes the vertex data from the mesh along with the render state values and renders the mesh on your screen.

Example

Let's take a simple example. Create 10,000 small files of 1 KB each (about 10 MB in total) and copy them from one location to another. Then create a single 10 MB file and copy it the same way. You will notice that the small files take longer than the single file, even though the total size is the same. This is because every file carries its own overhead: preparing the copy, allocating memory, reading and writing. That overhead grows with the number of files transferred. The same concept applies to draw calls: the more meshes, materials, shaders or textures you have to render, the more overhead there is and the more compute time is needed to render those assets. Too many small meshes are bad, and if they use different material parameters, it's even worse.
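The overhead argument can be made concrete with a rough cost model. The sketch below is in Python purely for illustration; the function name and the overhead and throughput figures are made-up assumptions, not measurements.

```python
def transfer_time_ms(item_count, total_kb, per_item_overhead_ms, throughput_kb_per_ms):
    """Fixed cost paid once per item, plus the time to move the payload itself."""
    return item_count * per_item_overhead_ms + total_kb / throughput_kb_per_ms

# Hypothetical numbers: 0.1 ms of overhead per file, 100 KB/ms of throughput.
many_small = transfer_time_ms(10_000, 10_000, 0.1, 100)  # 10,000 files of 1 KB
one_large = transfer_time_ms(1, 10_000, 0.1, 100)        # 1 file of 10 MB

print(f"10,000 small files: {many_small:.1f} ms")
print(f"1 large file:       {one_large:.1f} ms")
```

The payload term is identical in both cases; only the per-item term differs, and it dominates as soon as the item count is large. Replace "file" with "draw call" and the same reasoning applies.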

The problem

The GPU is designed for fast rendering. Often the GPU can render meshes much faster than the CPU can send commands. Each CPU command also carries overhead from the graphics API, OS layers, drivers and so on. So if you submit only a few vertices with each call, you will be CPU bound: the CPU becomes the bottleneck because it can't issue commands fast enough, and the GPU mostly sits idle.
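A very simplified way to picture this is to compare the time the CPU spends submitting draw calls with the time the GPU spends drawing triangles. The following Python sketch assumes both work in parallel and ignores everything else in a frame; the function name and all cost figures are hypothetical.

```python
def frame_bottleneck(draw_calls, triangles, cpu_ms_per_call, gpu_ms_per_million_tris):
    """Return which side limits the frame, plus the estimated time on each side."""
    cpu_ms = draw_calls * cpu_ms_per_call
    gpu_ms = triangles / 1_000_000 * gpu_ms_per_million_tris
    bound = "CPU" if cpu_ms > gpu_ms else "GPU"
    return bound, cpu_ms, gpu_ms

# Same 2 million triangles, submitted as many tiny calls or as a few large ones.
# Assumed costs: 0.005 ms per draw call, 2 ms per million triangles.
bound_many, cpu_many, gpu_many = frame_bottleneck(10_000, 2_000_000, 0.005, 2.0)
bound_few, cpu_few, gpu_few = frame_bottleneck(200, 2_000_000, 0.005, 2.0)

print(bound_many, cpu_many, gpu_many)
print(bound_few, cpu_few, gpu_few)
```

With these assumed numbers the GPU workload is the same in both cases, but the version with many small calls is limited by the CPU while the version with fewer, larger calls is limited by the GPU.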

Another problem arises from multiple render states. If you have different materials on different meshes, additional setup time is needed on both the CPU and the GPU. You set a render state for the first mesh and issue the command to render it, then set a new render state for the second mesh and issue the command to render it, and so on. Changing render state can be expensive, so it is best to avoid shader changes or material parameter changes when possible.
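To see why the order of draw calls matters, here is a small Python sketch that counts how often the render state has to change for a given draw order. It treats "render state" as just a material name, which is a deliberate simplification, and the function and material names are invented for the example.

```python
def count_state_changes(draw_order):
    """Count how many times the material differs from the previously drawn one."""
    changes = 0
    current = None
    for material in draw_order:
        if material != current:
            changes += 1
            current = material
    return changes

draw_order = ["wood", "metal", "wood", "metal", "wood", "glass"]

unsorted_changes = count_state_changes(draw_order)
sorted_changes = count_state_changes(sorted(draw_order))

print(unsorted_changes, sorted_changes)
```

Drawing the same six meshes grouped by material needs fewer state changes than drawing them in an arbitrary order, which is one reason engines tend to sort their draw lists.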

Multiple materials on a single mesh can also cause issues. When this happens, your mesh is split into pieces (one piece per material) before being fed to the GPU. This of course creates an additional draw call per mesh piece.
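As a quick way to estimate the effect, the Python sketch below counts one draw call per material slot on each mesh, assuming no batching at all. The scene contents and the function name are hypothetical.

```python
def draw_calls_for(meshes):
    """One draw call per material slot on each mesh (no batching assumed)."""
    return sum(len(materials) for materials in meshes.values())

# Hypothetical scene: mesh name -> list of materials assigned to it.
scene = {
    "chair": ["wood"],
    "car": ["paint", "glass", "rubber"],
    "lamp": ["metal", "glass"],
}

total = draw_calls_for(scene)
print(f"{len(scene)} meshes, {total} draw calls")
```

Three meshes end up costing twice as many draw calls because two of them carry several materials.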

The solution

Multiple meshes that use the same render state can be batched together. Batching means grouping small meshes together before calling the API to draw them. Instead of issuing one draw call per mesh, you combine the meshes and render them with a single draw call. This means you can draw multiple meshes (a chair, a table and a bed) at once as long as they have the same render state (same material setup).

Batching is usually used on static objects (houses, mountains). Dynamic objects (bullets, rockets) can also be batched, but because they move, the batch has to be rebuilt every frame in RAM and sent to VRAM again. For that reason, dynamic objects are often handled with other methods such as instancing.

Multiple small meshes that belong to the same object should be combined beforehand, and different meshes that share a render state should be batched in the engine. Keep in mind that combining meshes can go too far: if the combined mesh is very big, performance can drop because culling works per object, so the whole mesh is rendered even when only a small part of it is visible to the camera.
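The grouping idea behind batching can be sketched in a few lines of Python. This is only a conceptual model, not how any particular engine implements it: render state is reduced to a material name, and the scene and function name are made up for illustration.

```python
from collections import defaultdict

def batch_by_material(meshes):
    """Group mesh names by material; each group can be drawn with one call."""
    batches = defaultdict(list)
    for name, material in meshes:
        batches[material].append(name)
    return dict(batches)

# Hypothetical scene: (mesh name, material) pairs.
scene = [("chair", "wood"), ("table", "wood"), ("bed", "wood"), ("lamp", "metal")]

batches = batch_by_material(scene)
print(f"unbatched: {len(scene)} draw calls, batched: {len(batches)} draw calls")
```

The chair, table and bed share a material, so they collapse into one batch, while the lamp still needs its own draw call.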

For handling multiple materials on a single mesh, a layered shader can be used. This means combining two or more materials in the same shader. The materials are blended into each other using a blend texture, which is expensive on the GPU. It does, however, reduce the draw call count because the mesh is no longer split into pieces. On mobile devices, the resulting layered material is heavy enough that it is usually better to accept more draw calls than to use a layered material.

When doing engine batching, static batching incurs memory overhead, and dynamic batching incurs some CPU overhead.

Unity-specific draw call optimization

Everything above applies to most game engines. That being said, here are some Unity-specific settings related to draw call optimization.

You can view the number of draw calls in Unity in the Stats overlay of the Game view. In the picture below, Batches is 4 and Saved by batching is 0. This means the scene has 4 draw calls and batching is saving 0 draw calls.

You can toggle the batching options in Unity under Project Settings > Player > Other Settings > Static Batching / Dynamic Batching.

For static batching, tick the Static checkbox in the Inspector for meshes that never move.

Dynamic batching of meshes in Unity is done automatically and doesn't require any additional effort on your side. Moving GameObjects are batched into the same draw call if they share the same Material.

GPU instancing can also help minimize draw calls. It renders multiple copies of the same mesh using fewer draw calls. The option (Enable GPU Instancing) can be found in the material's Inspector. The checkbox is only displayed for materials that support GPU instancing.
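To get a feel for how instancing changes the draw call count, here is a rough Python estimate. It assumes every group of identical (mesh, material) pairs can be drawn with instanced calls of a fixed maximum size; the function name, the scene and the instance limit are all hypothetical and do not reflect Unity's actual limits.

```python
import math
from collections import Counter

def instanced_draw_calls(objects, max_instances_per_call):
    """Estimate draw calls when identical (mesh, material) pairs are instanced."""
    counts = Counter(objects)
    return sum(math.ceil(n / max_instances_per_call) for n in counts.values())

# Hypothetical scene: 2000 identical trees and 5 identical rocks.
scene = [("tree", "bark")] * 2000 + [("rock", "stone")] * 5

without_instancing = instanced_draw_calls(scene, 1)   # one object per call
with_instancing = instanced_draw_calls(scene, 500)    # assumed limit of 500 per call

print(without_instancing, with_instancing)
```

The more copies of the same mesh and material a scene contains, the bigger the saving, which is why instancing suits things like foliage, debris or crowds.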

Below you can see a scene with no draw call optimization.

And now the same scene with GPU Instancing enabled.

As you can see, we went from 2005 batches to 24 batches, and the CPU time per frame went down from 17 ms to 11 ms! (Note: the stats shown here include editor overhead, so use the Profiler for accurate numbers.)

In Unity, shadow casters can be dynamically batched even when their materials differ, as long as the values needed by the shadow pass are the same.

Conclusion

To summarize: avoid lots of small meshes and combine them when possible. Avoid using too many materials; a big texture atlas can help. Additionally, you can ask your graphics programmer about the poly count sweet spot (meshes below this triangle count don't render any faster), about in-game statistics so you can analyze how problematic your assets are, and about how to set up your assets better.

Thanks a lot for reading. If you have any questions, please comment below.


Manjil Thapa
