Hi, Heikki!

Thanks for looking into the patch!

> 11 июля 2018 г., в 0:07, Heikki Linnakangas <hlinn...@iki.fi> написал(а):
> 
> I'm now looking at the first patch in this series, to allow completely empty 
> GiST pages to be recycled. I've got some questions:
> 
>> --- a/src/backend/access/gist/gist.c
>> +++ b/src/backend/access/gist/gist.c
>> @@ -700,6 +700,13 @@ gistdoinsert(Relation r, IndexTuple itup, Size 
>> freespace, GISTSTATE *giststate)
>>                      GISTInsertStack *item;
>>                      OffsetNumber downlinkoffnum;
>> +                    if(GistPageIsDeleted(stack->page))
>> +                    {
>> +                            UnlockReleaseBuffer(stack->buffer);
>> +                            xlocked = false;
>> +                            state.stack = stack = stack->parent;
>> +                            continue;
>> +                    }
>>                      downlinkoffnum = gistchoose(state.r, stack->page, itup, 
>> giststate);
>>                      iid = PageGetItemId(stack->page, downlinkoffnum);
>>                      idxtuple = (IndexTuple) PageGetItem(stack->page, iid);
> 
> This seems misplaced. This code deals with internal pages, and as far as I 
> can see, this patch never marks internal pages as deleted, only leaf pages. 
> However, we should have something like this in the leaf-page branch, to deal 
> with the case that an insertion lands on a page that was concurrently deleted.
That's a bug. Will fix this.

> Did you have any tests, where an insertion runs concurrently with vacuum, 
> that would exercise this?
Yes, I've tried to test this, but, obviously, not enough. I'll think more about 
how to deal with it.

> 
> The code in gistbulkdelete() seems pretty expensive. In the first phase, it 
> records the parent of every empty leaf page it encounters. In the second 
> phase, it scans every leaf page of that parent, not only those leaves that 
> were seen as empty.
Yes, in first patch there is simplest gistbulkdelete(), second patch will 
remember line pointers of downlinks to empty leafs.

> 
> I'm a bit wary of using pd_prune_xid for the checks to determine if a deleted 
> page can be recycled yet. In heap pages, pd_prune_xid is just a hint, but 
> here it's used for a critical check. This seems to be the same mechanism we 
> use in B-trees, but in B-trees, we store the XID in BTPageOpaqueData.xact, 
> not pd_prune_xid. Also, in B-trees, we use ReadNewTransactionId() to set it, 
> not GetCurrentTransactionId(). See comments in _bt_unlink_halfdead_page() for 
> explanation. This patch is missing any comments to explain how this works in 
> GiST.
Will look into this. I remember it was OK half a year ago, but now it is clear 
to me that I had to document that part when I understood it....

> 
> If you crash in the middle of gistbulkdelete(), after it has removed the 
> downlink from the parent, but before it has marked the leaf page as deleted, 
> the leaf page is "leaked". I think that's acceptable, but a comment at least 
> would be good.
I was considering doing reverse-split (page merge) concurrency like in Lanin 
and Shasha's paper, but it is just too complex for little purpose. Will add 
comments on possible orphan pages.

Many thanks! I hope to post updated patch series this week.


Best regards, Andrey Borodin.

Reply via email to