On Mon, Jan 22, 2018 at 6:07 PM, Michael Paquier <michael.paqu...@gmail.com> wrote: > Yep. I have provided the feedback I wanted for 0001 (no API change in > the bloom facility by the way :( ), but I still wanted to look at 0002 > in depths.
I don't see a point in adding complexity that no caller will actually use. It *might* become useful in the future, in which case it's no trouble at all to come up with an alternative initialization routine. Anyway, parallel CREATE INDEX added a new "scan" argument to IndexBuildHeapScan(), which caused this patch to bitrot. At a minimum, an additional NULL argument should be passed by amcheck. However, I have a better idea. ISTM that verify_nbtree.c should manage the heap scan itself, it the style of parallel CREATE INDEX. It can acquire its own MVCC snapshot for bt_index_check() (which pretends to be a CREATE INDEX CONCURRENTLY). There can be an MVCC snapshot acquired per index verified, a snapshot that is under the direct control of amcheck. The snapshot would be acquired at the start of verification on an index (when "heapallindex = true"), before the verification of the index structure even begins, and released at the very end of verification. Advantages of this include: 1. It simplifies the code, and in particular lets us remove the use of TransactionXmin. Comments already say that this TransactionXmin business is a way of approximating our own MVCC snapshot acquisition -- why not *actually do* the MVCC snapshot acquisition, now that that's possible? 2. It makes the code for bt_index_check() and bt_index_parent_check() essentially identical, except for varying IndexBuildHeapScan()/indexInfo parameters to match what CREATE INDEX itself does. The more we can outsource to IndexBuildHeapScan(), the better. 3. It ensures that transactions that heapallindexed verify many indexes in one go are at no real disadvantage to transactions that heapallindexed verify a single index, since TransactionXmin going stale won't impact verification (we won't have to skip Bloom filter probes due to the uncertainty about what should be in the Bloom filter). 4. It will make parallel verification easier in the future, which is something that we ought to make happen. Parallel verification would target a table with multiple indexes, and use a parallel heap scan. It actually looks like making this work would be fairly easy. We'd only need to copy code from nbtsort.c, and arrange for parallel workers to verify an index each ahead of the heap scan. (There would be multiple Bloom filters in shared memory, all of which parallel workers end up probing.) Thoughts? -- Peter Geoghegan