Heikki Linnakangas <hlinnakan...@vmware.com> wrote: > Here's a patch implementing this scheme.
I've read through the papers on the btree technique, read the thread, and have been reviewing the patch. It will take me another day or two to complete a close review and testing, but I wanted to give preliminary results, and just let people know that I am working on it. After reading through the papers and the README for btree, I have to say that the approach to deletion seemed like the weak link. Searches, scans, and insertions all were pretty cohesive. The logic was all based on the fact that missing *upper* levels was OK because the sibling pointers would be automatically used to search until the correct page was found at each level. The old deletion approach turned this on its head by having missing pages at the bottom, which introduces whole new classes of problems which otherwise don't exist. This patch makes interim states in the deletion process look almost exactly the same as interim states in the insertion process, which I think is A Very Good Thing. The approach to limiting the number of locks to a finite (and small) number is clever, but seems quite sound. The locks are always taken in an order which cannot cause a deadlock. The write-up lacks any explicit mention of what happens if the branch being considered for deletion reaches all the way to the root. Although an astute reader will notice that since root page will always be the only page at its level, it must always be the rightmost page at its level, and therefore is correctly handled by the logic dealing with that, I think it merits an explicit mention in the README. I agree with others that this patch is not suitable for back-patching. I wonder whether some other solution might be worth back-patching, since it is clearly a bug. The "never delete internal pages" might be safe enough to consider. It would lead to some degree of bloat, but perhaps bloat is better than errors. Of course, the argument can certainly be made that people hit the bug so rarely that we're better off just leaving it alone in stable branches. Anyway, that would be a different patch. More on *this* patch in a day or two. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers