Thursday 14 April 2011

Balancing Deletions

Those who have known me for a long time are familiar with Rhys' First Law of Balanced Trees:

There is no such thing as a balanced tree delete algorithm.

Since it was first codified in 1989 the law has seen wide application: AVL trees, B-trees, red-black trees, you name it.  Open up any textbook on algorithms and you'll usually find something like this (Knuth, volume 3, section 6.2.4):

Exercises
...
7. [23] Design a deletion algorithm for B-trees.
"Algorithms in C++", Sedgewick, 1992 doesn't even mention deletion in the section on red-black trees.

Try it with your textbook.  Deletion is always left as an exercise for the reader.  I began to suspect that none of the authors had ever managed to implement deletion themselves.  But rather than admit ignorance they instead preferred to torture their students with mythical algorithms that were "too big to fit in the margin".

Over the years, I just gave up and added a boolean flag, "deleted", to every node and ignored the heap space cost.  Whenever someone has said "You're wrong, libfoobar implements it", I've looked and found only unreadable heuristic gobbledegook.  Nothing like a textbook treatment.  "Trust me!  It works!".  Sure it does Sonny.

Well, after 20-odd years, the law has finally been disproved.  Wikipedia's page on red-black trees actually includes code and a decent description of how deletion works.  After wading through the implementation and writing tons of unit tests, I can confirm that it does indeed work as advertised.

For the first time in my life I've actually implemented a balanced tree delete that actually works!  Woohoo!

I can now use red-black trees to do clause indexing for quickly locating a relevant clause amongst thousands (or even gabillions!) of Plang database facts.

Code here https://github.com/rweather/plang/blob/master/src/libplang/rbtree.c

References: Wikipedia, Literate Programs.

Saturday 9 April 2011

Back on the air

It's been a while since I was last hacking on pure Free Software for myself.  My last big community project was DotGNU Portable.NET.  After some years working on Qt for Trolltech and then Nokia, I am now returning to my roots in the community.

These days I am playing around with logic programming languages.  In particular, a little thing I am calling Plang (pronounced "P lang", although I might sometimes humourously refer to it as "R-Prolog").  At the moment, it is basically an experiment - can we make a more productive logic programming language that uses the same semantic core as Prolog, but a completely new C-style syntax on top?  Dunno, but should be fun to find out.

More information coming in the future as the experiment takes shape.  For now, you can access the sources from the git repository at github.