btree manages to avoid calling the index support functions during WAL
restore --- can't you do the same?  Perhaps you need to be including
more information in the initial xlog records, so that the cleanup step
has everything it needs.  Or resort to brute-force search (which is more
or less what btree does).  I don't think this operation needs to be very
efficient, since it's a corner case that should only seldom be invoked.

I've just commited WALogging for GiST. It works for online backup (normal recovery) and mostly after crash, but in this case it can't restore inclompleted inserts although it can detected and say to log thet it's needed to reindex GiST index.

The problem with incopmleted inserts is: when new entry is installed into leaf page, all chain (in a worst case) of keys from root page to leaf should be updated. Of course, case of updating all chain is rarely and usially it's updated only part. Each key on inner pages contains union of keys (bounding box in a case of rtree, for example) on page which it points. This union can formed only with a help of user-defined function of opclass, because of GiST doesn't know something about nature of keys. Returning to WAL, GiST core write xlog entry with all nessary information for restoration before write page, but in this moment it doesn't know it should update keys on parent page or key is unchanged. So GiST's WAL restoration code should remember this page's update as incompleted insert. When insert complited, GiST's core write to log that insert is completed and restore code can clean up stored incompleted insert. If it was crash, then sign of completed insert can be absent in log, and GiST's restore code should continue it. While continue, it's know which page was changed and should walk up to root. On each step of walk it should form union for page and insert it to parent.

This is absent in btree, because of btree is sorted, so parent's key simply is right on child page and xlog code doesn't need to call any support function.


Which ways I see:
1 Leave it as is. Incomplited inserts isn't often case, and code can detect it to suggest reindex. 2 In principle, its possible to collect all information about changes per insert in the tree in one WAL record. But itsn't good solution, because of it needs to ugly coding of inserts: collect all pages and write it after pushing WAL record. Second, this manner kills using LSN for page control for concurrency. But it fully solve problem with incomplete insert. 3 Construct relation by XLogOpenRelation some more full than now. In fact, restore code needs only 2 thing about index: index->rd_att and information for working index_getprocinfo. I understand why idea to calling any support functions during WAL recovery is bad. 4 Add one hook to finalyze WAL code. This hook should be called when system catalog is maded accessible. 5 There is one unused bit in IndexTupleData. GiST code can use it as mark that this tuple contains incorrect union key. In this case, GiST search algorithm should keep it mind that such tuple has incorrect value and always should go to it's child to check. During insertions in tree such tuples will be little by little changed to correct. Defect of this scheme will be a possible performance degradation after crash recovery. In other hand, it's possible to add recreation such tuples to vacuum code (in gistbulkdelete).





--
Teodor Sigaev                                   E-mail: [EMAIL PROTECTED]
                                                   WWW: http://www.sigaev.ru/

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Reply via email to