Architecture, can be more than designing clever pipelines tho. It can mean identifying areas of the game which require LOD, if you need it at all. It means packing all LOD models in a way to allow fast replacement (stored key-framed, all models in in one data blob, or whatever), identifying when to compress and when to not compress, defining hot-swapable textures (GUI usually), and so on. In most cases, you are attempting to keep as much data as possible in RAM (faster than DVD) as possible to provide performance and balance that with the performance constraints of decompression or moving too much data around.
Profiling is an easier issue but one of my favorites and I try to do it often. This is a simple matter of never optimizing anything until the need arises. This is in sharp contrast to architecture optimization which tries to prevent optimization bottlenecks up front. When performance drops, optimization is never far away. You begin with a small dose of profiling and nearly all compilers provide some form of profiling. This can mean instrumenting the code with timing blocks (instrumenting the code) when profilers are inadequate. Ultimately, after a few hours, or sometimes days, performance bottlenecks are identified. Most of these are poor loops like a bubble-sort, code that is called repeatedly to look up a value better stored in a hash, complex values calculated rather than stored at compile-time, large case statements (state trees do this), and so on. Once these are identified, fixing them is usually easy. Most of the time, it becomes an algorithmic choice trading an N^2 problem for a logN type of problem. A neat example of this is a recursive version of Fibonacci versus a looping one. The following two algorithms have starkly different run-time performance based on the way that they are implemented:
This first one runs in N time meaning that even for large values of 100 or more, this runs about the same speed.
int fibonacci1 (int n)
{
if (n<1)
return 0;
int _1 = 0;
int _2 = 1;
int total = 0;
for (int i=2; i<=n; i++)
{
total = _1 + _2;
_1 = _2;
_2 = total;
}
return _2;
}
That isn't true here at all. This code is understandable, works, and is dramatically slower and it slows exponentially as N^2.
int fibonacci2 (int n)
{
if(n < 2)
return 1;
return fibonacci2 (n-1) + fibonacci2(n-2);
}
Most optimization usually boils down to architecture choices and algorithmic ones. Once in a while, I have the unique pleasure of writing assembly code, but it's been about four years since I have, and I'm afraid I've grown a bit rusty. In the mean-time, I'll stick to higher level stuff.
No comments:
Post a Comment