On Sun, Jan 17, 2021 at 9:18 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> After more thought, I think that ambulkdelete needs to be able to
> refer the answer to amvacuumstrategy. That way, the index can skip
> bulk-deletion when lazy vacuum doesn't vacuum heap and it also doesn't
> want to do that.
Makes sense. BTW, your patch has bitrot already. Peter E's recent
pageinspect commit happens to conflict with this patch. It might make
sense to produce a new version that just fixes the bitrot, so that
other people don't have to deal with it each time.

> I've attached the updated version patch that includes the following changes:

Looks good. I'll give this version a review now. I will do a lot more
soon. I need to come up with a good benchmark for this, one that I can
return to again and again during review as needed.

Some feedback on the first patch:

* Just so you know: I agree with you about handling
VACOPT_TERNARY_DISABLED in the index AM's amvacuumstrategy routine. I
think that it's better to do that there, even though this choice may
have some downsides.

* Can you add some "stub" sgml doc changes for this? It doesn't have to
be complete in any way. Just a placeholder for later, one that has the
correct general "shape" to orient the reader of the patch. It can just
be a FIXME comment, plus basic mechanical stuff -- details of the new
amvacuumstrategy_function routine and its signature.

Some feedback on the second patch:

* Why do you move around IndexVacuumStrategy in the second patch? It
looks like a rebasing oversight.

* Actually, do we really need the first and second patches to be
separate patches? I agree that the nbtree patch should be a separate
patch, but dividing the first two sets of changes doesn't seem like it
adds much. Did I miss something?

* Is the "MAXALIGN(sizeof(ItemIdData)))" change in the definition of
MaxHeapTuplesPerPage appropriate? Here is the relevant section from the
patch:

diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index 7c62852e7f..038e7cd580 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -563,17 +563,18 @@ do { \
 /*
  * MaxHeapTuplesPerPage is an upper bound on the number of tuples that can
  * fit on one heap page.  (Note that indexes could have more, because they
- * use a smaller tuple header.)  We arrive at the divisor because each tuple
- * must be maxaligned, and it must have an associated line pointer.
+ * use a smaller tuple header.)  We arrive at the divisor because each line
+ * pointer must be maxaligned.

*** SNIP ***

 #define MaxHeapTuplesPerPage \
-	((int) ((BLCKSZ - SizeOfPageHeaderData) / \
-			(MAXALIGN(SizeofHeapTupleHeader) + sizeof(ItemIdData))))
+	((int) ((BLCKSZ - SizeOfPageHeaderData) / (MAXALIGN(sizeof(ItemIdData)))))

It's true that ItemIdData structs (line pointers) are aligned, but
they're not MAXALIGN()'d. If they were, then the on-disk size of line
pointers would generally be 8 bytes, not 4 bytes.

* Maybe it would be better if you just changed the definition such that
"MAXALIGN(SizeofHeapTupleHeader)" became "MAXIMUM_ALIGNOF", with no
other changes? (Some variant of this suggestion might be better, not
sure.)

For some reason that feels a bit safer: we still have an "imaginary
tuple header", but it's just 1 MAXALIGN() quantum now. This is still
much less than the current 3 MAXALIGN() quantums (i.e. what
MaxHeapTuplesPerPage treats as the tuple header size). Do you think
that this alternative approach will be noticeably less effective within
vacuumlazy.c?

Note that you probably understand the issue with MaxHeapTuplesPerPage
for vacuumlazy.c better than I do currently. I'm still trying to
understand your choices, and to understand what is really important
here.

* Maybe add a #define for the value 0.7? (I refer to the value used in
choose_vacuum_strategy() to calculate a "this is the number of LP_DEAD
line pointers that we consider too many" cutoff point, which is to be
applied throughout lazy_scan_heap() processing.)

* I notice that your new lazy_vacuum_table_and_indexes() function is
the only place that calls lazy_vacuum_all_indexes().
I think that you should merge them together -- replace the only
remaining call to lazy_vacuum_all_indexes() with the body of the
function itself. Having a separate lazy_vacuum_all_indexes() function
doesn't seem useful to me -- it doesn't actually hide complexity, and
might even be harder to maintain.

* I suggest thinking about what the last item will mean for the
reporting that currently takes place in lazy_vacuum_all_indexes(), but
will now go in an expanded lazy_vacuum_table_and_indexes() -- how do we
count the total number of index scans now?

I don't actually believe that the logic needs to change, but some kind
of consolidation and streamlining seems like it might be helpful. Maybe
just a comment that says "note that all index scans might just be
no-ops because..." -- stuff like that.

* Any idea about how hard it will be to teach is_wraparound VACUUMs to
skip index cleanup automatically, based on some practical/sensible
criteria?

It would be nice to have a basic PoC for that, even if it remains a PoC
for the foreseeable future (i.e. even if it cannot be committed to
Postgres 14). This feature should definitely be something that your
patch series *enables*. I'd feel good about having covered that
question as part of this basic design work if there was a PoC. That
alone should make it 100% clear that it's easy to do (or no harder than
it should be -- it should ideally be compatible with your basic
design).

The exact criteria that we use for deciding whether or not to skip
index cleanup (which probably should not just be "this VACUUM is
is_wraparound, good enough" in the final version) may need to be
debated at length on pgsql-hackers. Even still, it is "just a detail"
in the code. Whereas being *able* to do that with your design (now or
in the future) seems essential now.

> * Store the answers to amvacuumstrategy into either the local memory
> or DSM (in parallel vacuum case) so that ambulkdelete can refer the
> answer to amvacuumstrategy.
> * Fix regression failures.
> * Update the documentation and comments.
>
> Note that 0003 patch is still PoC quality, lacking the btree meta page
> version upgrade.

This patch is not the hard part, of course -- there really isn't that
much needed here compared to vacuumlazy.c. So this patch seems like the
simplest one of the three (at least to me).

Some feedback on the third patch:

* The new btm_last_deletion_nblocks metapage field should use P_NONE
(which is 0) to indicate never having been vacuumed -- not
InvalidBlockNumber (which is 0xFFFFFFFF).

This is more idiomatic in nbtree, which is nice, but it has a very
significant practical advantage: it ensures that every heapkeyspace
nbtree index (i.e. those on recent nbtree versions) can be treated as
if it has had the new btm_last_deletion_nblocks field all along, even
when it was actually built on Postgres 12 or 13. This trick will let
you avoid dealing with the headache of bumping BTREE_VERSION, which is
a huge advantage.

Note that this is the same trick I used to avoid bumping BTREE_VERSION
when the btm_allequalimage field needed to be added (for the nbtree
deduplication feature added to Postgres 13).

* Forgot to do this in the third patch (I think I made this same
mistake once myself):

diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 8bb180bbbe..88dfea9af3 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -653,7 +653,7 @@ bt_metap(PG_FUNCTION_ARGS)
 	BTMetaPageData *metad;
 	TupleDesc	tupleDesc;
 	int			j;
-	char	   *values[9];
+	char	   *values[10];
 	Buffer		buffer;
 	Page		page;
 	HeapTuple	tuple;
@@ -734,6 +734,11 @@ bt_metap(PG_FUNCTION_ARGS)

That's all I have for now...

--
Peter Geoghegan