On Wed, Jan 20, 2021 at 7:58 AM Peter Geoghegan <p...@bowt.ie> wrote:
>
> On Sun, Jan 17, 2021 at 9:18 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > After more thought, I think that ambulkdelete needs to be able to
> > refer the answer to amvacuumstrategy. That way, the index can skip
> > bulk-deletion when lazy vacuum doesn't vacuum heap and it also doesn't
> > want to do that.
>
> Makes sense.
>
> BTW, your patch has bitrot already. Peter E's recent pageinspect
> commit happens to conflict with this patch. It might make sense to
> produce a new version that just fixes the bitrot, so that other people
> don't have to deal with it each time.
>
> > I've attached the updated version patch that includes the following changes:
>
> Looks good. I'll give this version a review now. I will do a lot more
> soon. I need to come up with a good benchmark for this, that I can
> return to again and again during review as needed.
Thank you for reviewing the patches.

> Some feedback on the first patch:
>
> * Just so you know: I agree with you about handling
> VACOPT_TERNARY_DISABLED in the index AM's amvacuumstrategy routine. I
> think that it's better to do that there, even though this choice may
> have some downsides.
>
> * Can you add some "stub" sgml doc changes for this? Doesn't have to
> be complete in any way. Just a placeholder for later, that has the
> correct general "shape" to orientate the reader of the patch. It can
> just be a FIXME comment, plus basic mechanical stuff -- details of the
> new amvacuumstrategy_function routine and its signature.
>

The 0002 patch had the doc update (I mistakenly included it in the 0002
patch). Is that update what you meant?

> Some feedback on the second patch:
>
> * Why do you move around IndexVacuumStrategy in the second patch?
> Looks like a rebasing oversight.

Check.

> * Actually, do we really need the first and second patches to be
> separate patches? I agree that the nbtree patch should be a separate
> patch, but dividing the first two sets of changes doesn't seem like it
> adds much. Did I miss something?

I separated the patches because I used to have separate patches when
adding other index AM options required by parallel vacuum. But I agree
to merging the first two patches.

> * Is the "MAXALIGN(sizeof(ItemIdData)))" change in the definition of
> MaxHeapTuplesPerPage appropriate? Here is the relevant section from
> the patch:
>
> diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
> index 7c62852e7f..038e7cd580 100644
> --- a/src/include/access/htup_details.h
> +++ b/src/include/access/htup_details.h
> @@ -563,17 +563,18 @@ do { \
>  /*
>   * MaxHeapTuplesPerPage is an upper bound on the number of tuples that can
>   * fit on one heap page.  (Note that indexes could have more, because they
> - * use a smaller tuple header.)  We arrive at the divisor because each tuple
> - * must be maxaligned, and it must have an associated line pointer.
> + * use a smaller tuple header.)  We arrive at the divisor because each line
> + * pointer must be maxaligned.
>
> *** SNIP ***
>
>  #define MaxHeapTuplesPerPage \
> -    ((int) ((BLCKSZ - SizeOfPageHeaderData) / \
> -            (MAXALIGN(SizeofHeapTupleHeader) + sizeof(ItemIdData))))
> +    ((int) ((BLCKSZ - SizeOfPageHeaderData) / (MAXALIGN(sizeof(ItemIdData)))))
>
> It's true that ItemIdData structs (line pointers) are aligned, but
> they're not MAXALIGN()'d. If they were then the on-disk size of line
> pointers would generally be 8 bytes, not 4 bytes.

You're right. Will fix.

> * Maybe it would be better if you just changed the definition such
> that "MAXALIGN(SizeofHeapTupleHeader)" became "MAXIMUM_ALIGNOF", with
> no other changes? (Some variant of this suggestion might be better,
> not sure.)
>
> For some reason that feels a bit safer: we still have an "imaginary
> tuple header", but it's just 1 MAXALIGN() quantum now. This is still
> much less than the current 3 MAXALIGN() quantums (i.e. what
> MaxHeapTuplesPerPage treats as the tuple header size). Do you think
> that this alternative approach will be noticeably less effective
> within vacuumlazy.c?
>
> Note that you probably understand the issue with MaxHeapTuplesPerPage
> for vacuumlazy.c better than I do currently. I'm still trying to
> understand your choices, and to understand what is really important
> here.

Yeah, using MAXIMUM_ALIGNOF seems better for safety. I shared my
thoughts on the issue with MaxHeapTuplesPerPage yesterday. I think we
need to discuss how to deal with that.

> * Maybe add a #define for the value 0.7? (I refer to the value used in
> choose_vacuum_strategy() to calculate a "this is the number of LP_DEAD
> line pointers that we consider too many" cut off point, which is to be
> applied throughout lazy_scan_heap() processing.)

Agreed.
> * I notice that your new lazy_vacuum_table_and_indexes() function is
> the only place that calls lazy_vacuum_table_and_indexes(). I think
> that you should merge them together -- replace the only remaining call
> to lazy_vacuum_table_and_indexes() with the body of the function
> itself. Having a separate lazy_vacuum_table_and_indexes() function
> doesn't seem useful to me -- it doesn't actually hide complexity, and
> might even be harder to maintain.

lazy_vacuum_table_and_indexes() is called from two places: after
maintenance_work_mem runs out (around L1097) and at the end of
lazy_scan_heap() (around L1726). I defined this function to pack
together the operations of choosing a strategy, vacuuming indexes, and
vacuuming the heap. Without this function, wouldn't we end up writing
the same code twice in those places?

> * I suggest thinking about what the last item will mean for the
> reporting that currently takes place in
> lazy_vacuum_table_and_indexes(), but will now go in an expanded
> lazy_vacuum_table_and_indexes() -- how do we count the total number of
> index scans now?
>
> I don't actually believe that the logic needs to change, but some kind
> of consolidation and streamlining seems like it might be helpful.
> Maybe just a comment that says "note that all index scans might just
> be no-ops because..." -- stuff like that.

Which item and which report do you mean? I think
lazy_vacuum_table_and_indexes() itself doesn't report anything, and
vacrelstats->num_index_scan counts the total number of index scans.

> * Any idea about how hard it will be to teach is_wraparound VACUUMs to
> skip index cleanup automatically, based on some practical/sensible
> criteria?

One simple idea would be to have a to-prevent-wraparound autovacuum
worker disable index cleanup (i.e., set index_cleanup to
VACOPT_TERNARY_DISABLED).
A downside (though not a common case) is that since a
to-prevent-wraparound vacuum is not necessarily an aggressive vacuum,
it could skip index cleanup even though it cannot move relfrozenxid
forward.

> It would be nice to have a basic PoC for that, even if it remains a
> PoC for the foreseeable future (i.e. even if it cannot be committed to
> Postgres 14). This feature should definitely be something that your
> patch series *enables*. I'd feel good about having covered that
> question as part of this basic design work if there was a PoC. That
> alone should make it 100% clear that it's easy to do (or no harder
> than it should be -- it should ideally be compatible with your basic
> design). The exact criteria that we use for deciding whether or not to
> skip index cleanup (which probably should not just be "this VACUUM is
> is_wraparound, good enough" in the final version) may need to be
> debated at length on pgsql-hackers. Even still, it is "just a detail"
> in the code. Whereas being *able* to do that with your design (now or
> in the future) seems essential now.

Agreed. I'll write a PoC patch for that.

> > * Store the answers to amvacuumstrategy into either the local memory
> > or DSM (in parallel vacuum case) so that ambulkdelete can refer the
> > answer to amvacuumstrategy.
> > * Fix regression failures.
> > * Update the documentation and comments.
> >
> > Note that 0003 patch is still PoC quality, lacking the btree meta page
> > version upgrade.
>
> This patch is not the hard part, of course -- there really isn't that
> much needed here compared to vacuumlazy.c. So this patch seems like
> the simplest 1 out of the 3 (at least to me).
>
> Some feedback on the third patch:
>
> * The new btm_last_deletion_nblocks metapage field should use P_NONE
> (which is 0) to indicate never having been vacuumed -- not
> InvalidBlockNumber (which is 0xFFFFFFFF).
>
> This is more idiomatic in nbtree, which is nice, but it has a very
> significant practical advantage: it ensures that every heapkeyspace
> nbtree index (i.e. those on recent nbtree versions) can be treated as
> if it has the new btm_last_deletion_nblocks field all along, even when
> it was actually built on Postgres 12 or 13. This trick will let you
> avoid dealing with the headache of bumping BTREE_VERSION, which is a
> huge advantage.
>
> Note that this is the same trick I used to avoid bumping BTREE_VERSION
> when the btm_allequalimage field needed to be added (for the nbtree
> deduplication feature added to Postgres 13).

That's a nice approach with a great advantage. I'll use P_NONE.

> * Forgot to do this in the third patch (think I made this same mistake
> once myself):
>
> diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
> index 8bb180bbbe..88dfea9af3 100644
> --- a/contrib/pageinspect/btreefuncs.c
> +++ b/contrib/pageinspect/btreefuncs.c
> @@ -653,7 +653,7 @@ bt_metap(PG_FUNCTION_ARGS)
>      BTMetaPageData *metad;
>      TupleDesc   tupleDesc;
>      int         j;
> -    char       *values[9];
> +    char       *values[10];
>      Buffer      buffer;
>      Page        page;
>      HeapTuple   tuple;
> @@ -734,6 +734,11 @@ bt_metap(PG_FUNCTION_ARGS)

Check. I'm updating and testing the patch. I'll submit the updated
version of the patches tomorrow.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/