Re: [HACKERS] Nasty btree deletion bug

2006-10-26 Thread Tom Lane
I wrote: > Right, but _bt_getstackbuf is working from a search stack created by > a standard search for the victim page's high key. If that search > descended through a page to the right of the victim page's actual > parent, _bt_getstackbuf isn't able to recover. What I'm tempted to do, at least

Re: [HACKERS] Nasty btree deletion bug

2006-10-26 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > But now that I look at the original post by Ed, I don't see how the > "failed to re-find parent key" error could result from the issue we've > been talking about. The error message is printed when _bt_getstackbuf is > unable to re-find an item i

Re: [HACKERS] Nasty btree deletion bug

2006-10-26 Thread Heikki Linnakangas
Tom Lane wrote: On further reflection, I think I understand why we've not realized the existence of this bug before: in fact, it *doesn't* lead to wrong search answers. I think the only visible consequence is exactly the "failed to re-find parent key" VACUUM error that Ed saw. The reason is tha

Re: [HACKERS] Nasty btree deletion bug

2006-10-26 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > What does the original research paper by Lanin & Shasha say about this? Nothing very useful. The connection of our code to that paper is actually a bit tenuous: their approach to deletion is to make the target page's key space move left not righ

Re: [HACKERS] Nasty btree deletion bug

2006-10-26 Thread Tom Lane
I wrote: > [ looks at that for a bit... ] Yeah, you're right. Once the deletion > is completed, the F lower-bound key will disappear from the grandparent, > which would restore consistency --- but we could have already delivered > wrong search answers, so that won't do. On further reflection, I

Re: [HACKERS] Nasty btree deletion bug

2006-10-26 Thread Heikki Linnakangas
Tom Lane wrote: In theory, given a slow-enough-moving VACUUM process, this could happen even without a crash. So I think that means we have to go over to the other plan of locking everything all the way up to the top of the deletion before we start doing it --- and also, we'll need crash recover

Re: [HACKERS] Nasty btree deletion bug

2006-10-26 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes: > I don't understand how this "in the meantime" thing works. I tried to > work out a step-by-step example, could you take a look at it? See > http://users.tkk.fi/~hlinnaka/pgsql/btree-deletion-bug/ [ looks at that for a bit... ] Yeah, you're right.

Re: [HACKERS] Nasty btree deletion bug

2006-10-26 Thread Heikki Linnakangas
Tom Lane wrote: I wrote: I've been analyzing Ed L's recent report of index corruption: http://archives.postgresql.org/pgsql-general/2006-10/msg01183.php Auch. That's nasty indeed. So I think the rule needs to be "don't delete the rightmost child unless it's the only child, in which case you

Re: [HACKERS] Nasty btree deletion bug

2006-10-25 Thread Tom Lane
I wrote: > I've been analyzing Ed L's recent report of index corruption: > http://archives.postgresql.org/pgsql-general/2006-10/msg01183.php After some thought, it still seems to me to be impractical to delete the rightmost child of a btree page while other children remain. Doing this would requi

[HACKERS] Nasty btree deletion bug

2006-10-25 Thread Tom Lane
I've been analyzing Ed L's recent report of index corruption: http://archives.postgresql.org/pgsql-general/2006-10/msg01183.php (thanks to Ed for letting me have the actual index to study). I think I've figured out what happened to it. nbtree/README says : The notion of a half-dead page means tha