On Sun, Feb 5, 2023 at 4:45 PM Andrey Borodin <amborodi...@gmail.com> wrote: > Here's v24 == (v23 + a step for pg_amcheck). There's a lot of > shotgun-style changes, but I hope next index types will be easy to add > now.
Some feedback on the GiST patch: * You forgot to initialize GistCheckState.heaptuplespresent to 0. It might be better to allocate GistCheckState dynamically, using palloc0(). That's future proof. "Simple and obvious" is usually the most important goal for managing memory in amcheck code. It can be a little inefficient if that makes it simpler. * ISTM that gist_index_check() should allow the caller to omit a "heapallindexed" argument by specifying "DEFAULT FALSE", for consistency with bt_index_check(). (Actually there are two versions of bt_index_check(), with overloading, but that's just because of the way that the extension evolved over time). * What's the point in having a custom memory context that is never reset? I believe that gistgetadjusted() will leak memory here, so there is a need for some kind of high level strategy for managing memory. The strategy within verify_nbtree.c is to call MemoryContextReset() right after every loop iteration within bt_check_level_from_leftmost() -- which is pretty much once every call to bt_target_page_check(). That kind of approach is obviously not going to suffer any memory leaks. Again, "simple and obvious" is good for memory management in amcheck. * ISTM that it would be clearer if the per-page code within gist_check_parent_keys_consistency() was broken out into its own function -- a little like bt_target_page_check().. That way the control flow would be easier to understand when looking at the code at a high level. * ISTM that gist_refind_parent() should throw an error about corruption in the event of a parent page somehow becoming a leaf page. Obviously this is never supposed to happen, and likely never will happen, even with corruption. But it seems like a good idea to make the most conservative possible assumption by throwing an error. If it never happens anyway, then the fact that we handle it with an error won't matter -- so the error is harmless. If it does happen then we'll want to hear about it as soon as possible -- so the error is useful. * I suggest using c99 style variable declarations in loops. Especially for things like "for (OffsetNumber offset = FirstOffsetNumber; ... ; ... )". -- Peter Geoghegan