Tuesday, January 27, 2009

Optimal binary tree walking in C++

Binary tree walking

A binary tree is a means of storing data that usually offers a close-to-optimal search time when looking for a particular item in the tree (O log (n)). But it can become necessary to traverse the entire tree, usually for debugging, analysis, or simply to print the contents.

Typically, binary tree walking can be pretty straightforward, but there are some gotchas and a few interesting implementations that you may want to consider. Sometimes, these sneaky questions make it into job interviews and these implementations are sometimes useful to know besides, even if you only know that they exist. Let’s say that we are given the following BNode class and a tree that is arranged as follows. Not all binary trees appear like this class, most notably the boost library implementation which balances the tree. Using the Boost implementation is a little ugly (IMHO), but it is a very robust version and should be considered.

//-------------------------------------------------------
class BNode
{
public:
int Value;
BNode* left;
BNode* right;

BNode () : Value (0), left (0), right (0) {}
void Print () {cout << Value << ", ";}

};
//-------------------------------------------------------

How can we walk the tree, visit every node, and output it’s value? We’ll look at three possibilities.

Setup

The normal initialization is for the nodes to be null and in our example tree, all of the following nodes would have both left and right nodes still equal to null after initialization. Because we are discussing how to fill the tree like in our diagram, I’ll take the easy approach and do it manually just so that we can be sure that everything is setup properly to walk our binary tree. This is not normally how you might fill a binary tree.

BNode* BuildSimpleTree ()
{
BNode* head = new BNode ();
head->Value = 0;

BNode* node1 = new BNode ();
node1->Value = 1;
head->left = node1;

BNode* node2 = new BNode ();
node2->Value = 2;
head->right = node2;

BNode* node = new BNode ();
node->Value = 3;
node1->left = node;

node = new BNode ();
node->Value = 4;
node1->right = node;

node = new BNode ();
node->Value = 5;
node2->left = node;

node = new BNode ();
node->Value = 6;
node2->right = node;

return head;
}

Walking (traversing)

We can walk this tree and output the values in a few different ways. First, let me note that when walking the tree, there are three ways that are acceptable, but only one widely used. As an example, given node 1, you could output any of the following: 1-3-4, 3-1-4, or 3-4-1. These ways of viewing binary tree output (and possibly storage) is known as pre-order, in-order, and post-order sorted. [Wikipedia]

Considering only pre-order for our purposes, the proper output for this tree should be: 0-1-3-4-2-5-6. This would be a typical depth first output. The oft-answered solution is one of recursion. It looks like this:

//-------------------------------------------------------
void PrintTreeDepthFirstRecursive (BNode* head)
{
if (head != 0)
{
head->Print (); // print the current node
if (head->left) // go left
{
PrintTreeDepthFirstRecursive (head->left);
}
if (head->right) // go right
{
PrintTreeDepthFirstRecursive (head->right);
}
}
}
//-------------------------------------------------------

Recursion solves several problems for us by pushing our currently used node onto the stack. In fact, this solution is so elegant, it seems a shame to try and improve upon it. Still, there are other solutions and they do solve other problems.

Walking (v2)

Sometimes, you don’t have recursion available. An interviewer once gave me the question “what if I don’t have a very deep call stack, because Lego Mindstorms has almost no depth to their call stack, what do I do?” Lego has since adopted a deeper stack, but the problem remains: how can we walk a binary tree without recursion? Another important consideration is performance. The recursive call uses function calls to solve the stack issues and while elegant, every call has function call overhead, stack pushing and popping, and probably a data cache hit.

If we think a little deeper about why we’re using recursion, we can readily recognize that we are storing our position in the binary tree on the stack. Well, if we add a stack to local storage, we can avoid recursion and the overhead that it incurs. There is the requirement in this implementation for the stl, but if it’s not available, then nearly any stack implementation will do. NOTE: When using stl::stack, be sure to access the top and then pop it: they are separate and easily lead to bugs if you don’t perform both funtions. Top is for viewing, and pop is for removing.

Note how similar this is to the previous example. The differences are definitely worth noting however. We initialize our stack with the top-most node. This makes the loop very tidy by requirig us to have something on our stack while we are in the loop. Every time we now push the left or right, we immediately pop it and print it. Then we deal with that node independently walking it’s left and right nodes after pushing those on our stack. We’ve already printed that node, so we don’t need it on the stack anymore.

Here is an important change: see the order of the pushes? We push right first because the right should be printed after the left and stacks are fifo collections. Thus, we push the right onto the stack, expecting the left to be pushed in on top of it, and then popped off immediately. If the left doesn’t exist, the the right is popped off next and the proper output is then displayed.

//-------------------------------------------------------
void PrintTreeDepthFirstStacked (BNode* head)
{
std::stack Stack; // this is the key difference

Stack.push (head);
while (Stack.size () != 0)
{
BNode* topNode = Stack.top ();
Stack.pop ();

topNode->Print ();
if (topNode->right) // go right first because we are pushing onto a stack
{
Stack.push (topNode->right);
}
if (topNode->left) // go left
{
Stack.push (topNode->left);
}
}
cout << endl;
}
//-------------------------------------------------------

Walking (no stack and no stl)

So, we have limited memory and/or stack space and no access to stl... what then? In a similar fashion to our last implementation, we need a stack. Well, implementing a stack can be as simple as pushing data into an array with some way of tracking where the last valid item in the array is. Then we use that tracking item as the top of our stack.

This implementation tracks pointers by pushing those into an array used to track the ‘highest’ item better known as the top. This is very similar in a lot of ways to what’s come before, but there are a few differences of note. You’ll note the prepended decrement operator used for the index on line 7 of the code which must be prepended because the access must be pointing to the previous node.

We are also tracking our position relative to 0 and once we reach 0, we have stepped through every node. While this version is nowhere near as elegant as our first version, it is super-lightweight and must faster than either of the previous two versions. Walking a binary tree is not necessarily slow, but since we are walking every node, it can be: especially for a deep tree. The stackless implementation here is very fast.

//-------------------------------------------------------
void PrintTreeDepthFirstStackless (BNode* head)
{
const int MaxStackDepth = 24; // supports 2^24 nodes, very large
BNode * Stack [MaxStackDepth]; // This is temp storage. No memset.
int StackIndex = 0;

Stack [StackIndex++] = head; // grab head

while (StackIndex > 0)
{
BNode* topNode = Stack [--StackIndex];

topNode->Print ();
if (topNode->right)
{
Stack [StackIndex++] = topNode->right;
}
if (topNode->left)
{
Stack [StackIndex++] = topNode->left;
}
}
cout << endl;
}
//-------------------------------------------------------

Concluding notes

All designs are based on competing levels of quality, performance, maintenance, and cost. Most of the engineers of the world go for the easiest to implement or the fastest performance, but there are always other options.

While all considerations help when thinking about your design of a binary tree walker, overall design and hardware limitations should be your primary concerns. Then you should consider maintenance your next heavy cost. After that, then you should consider run time performance. In my limited testing I found the stackless version to be fastest, followed by the stacked version, and then the recursive version. In fact the recursive version deteriorated in performance rather quickly becoming nearly a 2x performance hit for every new layer of depth added to the binary tree. At very small tree levels (3 level or less) it outperformed the stacked version, but it never outperformed the stackless version.

Good luck.

No comments: