Tuesday, March 23, 2010

Software Optimization - one approach

Performance optimization comes in many flavors. My preferred approaches come in two: architecture and profiling. The architecture side means being involved with high-level design and always looking for opportunities to remove bottlenecks before they get built in. In the case of graphics, it may mean knowing when to say no to artists (not very often). As one example of where this matters, at a previous company where I worked, the artists were constantly trying to bring in the largest files possible, simply because they often had very little awareness of the cost of using large graphics or audio files. Compression helps (mp3 for audio, compressed TIFF for images), but we still ended up with one 21 MB TIFF in our pipeline (roughly 1500x1500, 24-bit) that simply made our game unrenderable. We failed on our education effort, and it continued to happen over the course of months, simply because this sort of thing is the last thing an artist is interested in. Eventually, in the Maya plugin, we popped up an error when a user had an image over 100 KB. This helped, but they found ways to work around the system. Finally, as part of the build system, we failed the build when oversized resources (audio and graphics, each judged against its own size constraint) were submitted to Perforce. Our build system always told us which checkins were the ones to 'break the build', so it became much easier to fix the art, identify the problem early rather than late in production, and educate the artist involved.
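The mechanics of that last step are simple enough to sketch. Something along these lines, run as a pre-build step, is all it takes; the directory name, file extensions, and size caps below are illustrative placeholders, not the limits we actually used.

// Minimal sketch of a build-time asset size gate (not the original studio's tool).
// The directory, extensions, and caps here are made up for illustration.
#include <cstdio>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical per-type limits: textures and audio are judged separately,
// mirroring the "different size constraints" mentioned above.
static bool TooLarge(const fs::path& file, std::uintmax_t bytes)
{
    const std::string ext = file.extension().string();
    if (ext == ".tif" || ext == ".tga" || ext == ".png")
        return bytes > 1 * 1024 * 1024;      // e.g. 1 MB cap for textures
    if (ext == ".wav" || ext == ".mp3")
        return bytes > 4 * 1024 * 1024;      // e.g. 4 MB cap for audio
    return false;
}

int main()
{
    int failures = 0;
    for (const auto& entry : fs::recursive_directory_iterator("assets"))
    {
        if (!entry.is_regular_file())
            continue;
        const std::uintmax_t bytes = entry.file_size();
        if (TooLarge(entry.path(), bytes))
        {
            std::fprintf(stderr, "Asset too large: %s (%ju bytes)\n",
                         entry.path().string().c_str(), bytes);
            ++failures;
        }
    }
    return failures ? 1 : 0;                 // non-zero exit breaks the build
}

Wiring the check into the build, rather than only into the art tool, is what made it stick: there was no way to work around it, and the offending checkin was identified immediately.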

Architecture can be more than designing clever pipelines, though. It can mean identifying the areas of the game that require LOD, if you need it at all. It means packing LOD models in a way that allows fast replacement (stored key-framed, all models in one data blob, or whatever), deciding what to compress and what to leave uncompressed, defining hot-swappable textures (usually GUI), and so on. In most cases you are trying to keep as much data as possible in RAM (far faster than the DVD) for performance, and balancing that against the cost of decompression or of moving too much data around.
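To make the "all models in one data blob" idea concrete, here is roughly what such a layout can look like. This is a sketch with made-up field names and distance thresholds, not code from any particular engine.

// All detail levels for a mesh live in one contiguous allocation, so switching
// LOD is a pointer offset rather than a new load from disc.
#include <cstdint>

struct LodEntry
{
    uint32_t vertexOffset;   // byte offset into the blob's vertex data
    uint32_t vertexCount;
    uint32_t indexOffset;    // byte offset into the blob's index data
    uint32_t indexCount;
};

struct MeshBlobHeader
{
    uint32_t lodCount;
    LodEntry lods[4];        // fixed maximum keeps the header trivially loadable
    // vertex/index payload follows immediately after the header in memory
};

// Selecting a level of detail never touches the DVD: the whole blob is already
// resident, and each entry just points at a different slice of it.
inline const LodEntry* SelectLod(const MeshBlobHeader* mesh, float distance)
{
    uint32_t level = distance < 10.0f ? 0 : distance < 40.0f ? 1 : distance < 120.0f ? 2 : 3;
    if (level >= mesh->lodCount)
        level = mesh->lodCount - 1;
    return &mesh->lods[level];
}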

Profiling is the easier of the two, and one of my favorites; I try to do it often. The rule is simple: never optimize anything until the need arises. This is in sharp contrast to architectural optimization, which tries to prevent bottlenecks up front. When performance drops, optimization is never far away. You begin with a small dose of profiling; nearly all toolchains provide some form of profiler, and when those are inadequate you can instrument the code yourself with timing blocks (a minimal sketch follows). Ultimately, after a few hours, or sometimes days, the performance bottlenecks are identified. Most of them are the usual suspects: a poor loop like a bubble sort, code called repeatedly to look up a value better stored in a hash, complex values calculated at run-time rather than at compile-time, large case statements (state trees tend to produce these), and so on. Once these are identified, fixing them is usually easy. Most of the time it becomes an algorithmic choice, trading an N^2 or worse problem for something closer to log N or linear.
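When you do have to instrument by hand, it does not need to be elaborate. A scoped timer along these lines, which is a sketch rather than code from this project, prints how long a block took the moment the scope ends.

// Minimal hand-rolled timing block for when a real profiler is unavailable
// or too coarse. The label and output format are placeholders.
#include <chrono>
#include <cstdio>

struct ScopedTimer
{
    const char* label;
    std::chrono::steady_clock::time_point start;

    explicit ScopedTimer(const char* name)
        : label(name), start(std::chrono::steady_clock::now()) {}

    ~ScopedTimer()
    {
        using namespace std::chrono;
        const long long us =
            duration_cast<microseconds>(steady_clock::now() - start).count();
        std::printf("%s: %lld us\n", label, us);   // runs when the scope ends
    }
};

void UpdateWorld()   // hypothetical function under suspicion
{
    ScopedTimer timer("UpdateWorld");
    // ... suspect code under measurement ...
}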

A neat example of the algorithmic trade is a recursive version of Fibonacci versus a looping one. The two implementations below have starkly different run-time performance purely because of the way they are written. This first one runs in linear time, so even for values of 100 or more it comes back essentially instantly.

// Iterative version: O(n) time, constant space.
// (A 32-bit int overflows past fib(46); the point here is the running time.)
int fibonacci1(int n)
{
    if (n < 1)
        return 0;

    int _1 = 0;       // fib(i - 2)
    int _2 = 1;       // fib(i - 1)
    int total = 0;

    for (int i = 2; i <= n; i++)
    {
        total = _1 + _2;
        _1 = _2;
        _2 = total;
    }

    return _2;
}


That isn't true of the second version at all. The code below is just as easy to understand and just as correct, but it is dramatically slower: every call spawns two more calls, so the running time grows exponentially with n rather than linearly.

// Recursive version: each call branches into two more, so the work roughly
// doubles as n grows. Same convention as above: fib(0) = 0, fib(1) = 1.
int fibonacci2(int n)
{
    if (n < 1)
        return 0;
    if (n < 3)
        return 1;
    return fibonacci2(n - 1) + fibonacci2(n - 2);
}


Most optimization boils down to architectural choices and algorithmic ones. Once in a while, I have the unique pleasure of writing assembly code, but it's been about four years since I last did, and I'm afraid I've grown a bit rusty. In the meantime, I'll stick to the higher-level stuff.
