On Mon, 10 Feb 2025 at 20:11, Burd, Greg <gregb...@amazon.com> wrote: > > On Feb 10, 2025, at 12:17 PM, Matthias van de Meent > > <boekewurm+postg...@gmail.com> wrote: > > > >> > >> I have a few concerns with the patch, things I’d greatly appreciate your > >> thoughts on: > >> > >> First, I pass an EState along the update path to enable running the checks > >> in heapam, this works but leaves me feeling as if I violated separation of > >> concerns. If there is a better way to do this let me know or if you think > >> the cost of creating one in the execIndexing.c > >> ExecIndexesRequiringUpdates() is okay that’s another possibility. > > > > I think that doesn't have to be bad. > > Meaning that the approach I’ve taken is okay with you?
If you mean "passing EState down through the AM so we can check if we really need HOT updates", then Yes, that's OK. I don't think the basic idea (executing the projections to check for differences) is bad per se, however I do think we may need more design work on the exact shape of how ExecIndexesRequiringUpdates() receives its information (which includes the EState). E.g. we could pass an opaquely typed pointer that's passed from the executor state through the table_update method into this ExecIndexesRequiringUpdates(). That opaque struct would then contain the important information for index update state checking, so that the AM can't realistically break things without bypassing the separation of concerns, and doesn't have to know about any executor nodes. >>> Third, there is overhead to this patch, it is no longer a single simple >>> bitmap test to choose HOT or not in heap_update(). >> >> Why can't it mostly be that simple in simple cases? > > It can remain that simple in the cases you mention. In relcache the hot > blocking attributes are nearly the same, only the summarizing attributes are > removed. The first test then in heap_update() is for overlap with the > modified set. When there is none, the update will proceed on the HOT path. > > The presence of a summarizing index is determined in > ExecIndexesRequiringUpdates() in execIndexing.c, so a slightly longer code > path but not much new overhead. Yes, but that's only determined at an index-by-index level, rather than determined all at once, and that's real bad when you have hundreds of indexes to go through (however unlikely it might be, I've heard of cases where there are 1000s of indexes on one table). So, I prefer limiting any O(n_indexes) operations to only the most critical cases. > > I mean, it's clear that "updated indexed column's value == non-HOT > > update". And that to determine whether an updated *projected* column's > > value (i.e., expression index column's value) was actually updated we > > need to calculate the previous and current index value, thus execute > > the projection twice. But why would we have significant additional > > overhead if there are no expression indexes, or when we can know by > > bitmap overlap that the only interesting cases are summarizing > > indexes? > > You’re right, there’s not a lot of new overhead in that case except what > happens in ExecIndexesRequiringUpdates() to scan over the list of IndexInfo. > It is really only when there are many expressions/predicates requiring > examination that there is any significant cost to this approach AFAICT (but > if you see something please point it out). See the attached approach. Evaluation of the expressions only has to happen if there are any HOT-blocking attributes which are exclusively hot-blockingly indexed through expressions, so if the updated attribute numbers are a subset of hotblockingexprattrs. (substitute hotblocking with summarizing for the summarizing approach) > > I would've implemented this with (1) two new bitmaps, one each for > > normal and summarizing indexes, each containing which columns are > > exclusively used in expression indexes (and which should thus be used > > to trigger the (comparatively) expensive recalculation). > > That was one where I started, over time that became harder to work as the > bitmaps contain the union of index attributes for the table not per-column. I think it's fairly easy to create, though. > Now there is one bitmap to cover the broadest case and then a function to > find the modified set of indexes where each is examined against bitmaps that > contain only attributes specific to the index in question. This helped in > cases where there were both expression and non-expression indexes on the same > attribute. Fair, but do we care about one expression index on (attr1->>'data')'s value *not* changing when an index on (attr1) exists and attr1 has changed? That index on att1 would block HOT updates regardless of the (lack of) changes to the (att1->>'data') index, so doing those expensive calculations seems quite wasteful. So, in my opinion, we should also keep track of those attributes only included in expressions of indexes, and that's fairly easy: see attached prototype.diff.txt (might need some work, the patch was drafted on v16's codebase, but the idea is clear). The resulting %exprattrs bitmap contains attributes that are used only in expressions of those index types. > > The "new" output of these expression > > evaluations would be stored to be used later as index datums, reducing > > the number of per-expression evaluations down to 2 at most, rather > > than 2+1 when the index needs an insertion but the expression itself > > wasn't updated. > > Not reforming the new index tuples is also an interesting optimization. I > wonder how that can be passed from within heapam’s call into a function in > execIndexing up into nodeModifiyTable and back down into execIndexing and on > to the index access method? I’ll have to think about that, ideas welcome. Note that index tuple forming happens only in the index AM, it's the Datum construction (i.e. projection from attributes/tuple to indexed value) that I'd like to deduplicate. Though looking at the current code, I don't think it's reasonable to have that as a requirement for this work. It'd be a nice-to-have for sure, but not as requirement. > > Do you have any documentation on the approaches used, and the specific > > differences between v3 and v4? I don't see much of that in your > > initial mail, and the patches themselves also don't show much of that > > in their details. I'd like at least some documentation of the new > > behaviour in src/backend/access/heap/README.HOT at some point before > > this got marked as RFC in the commitfest app, though preferably sooner > > rather than later. > > Good point, I should have updated README.HOT with the initial patchset. I’ll > jump on that and update ASAP. Thanks in advance. Kind regards, Matthias van de Meent Neon (https://neon.tech)
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 5a57702e914..bf41e9f53d3 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -5181,8 +5181,10 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind) Bitmapset *uindexattrs; /* columns in unique indexes */ Bitmapset *pkindexattrs; /* columns in the primary index */ Bitmapset *idindexattrs; /* columns in the replica identity */ - Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */ - Bitmapset *summarizedattrs; /* columns with summarizing indexes */ + Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */ + Bitmapset *hotblockingexprattrs; /* as above, but only those in expressions */ + Bitmapset *summarizedattrs; /* columns with summarizing indexes */ + Bitmapset *summarizedexprattrs; /* as above, but only those in expressions */ List *indexoidlist; List *newindexoidlist; Oid relpkindex; @@ -5248,7 +5250,9 @@ restart: pkindexattrs = NULL; idindexattrs = NULL; hotblockingattrs = NULL; + hotblockingexprattrs = NULL; summarizedattrs = NULL; + summarizedexprattrs = NULL; foreach(l, indexoidlist) { Oid indexOid = lfirst_oid(l); @@ -5262,6 +5266,7 @@ restart: bool isPK; /* primary key */ bool isIDKey; /* replica identity index */ Bitmapset **attrs; + Bitmapset **exprattrs; indexDesc = index_open(indexOid, AccessShareLock); @@ -5305,9 +5310,15 @@ restart: * decide which bitmap we'll update in the following loop. */ if (indexDesc->rd_indam->amsummarizing) + { attrs = &summarizedattrs; + exprattrs = &summarizedexprattrs; + } else + { attrs = &hotblockingattrs; + exprattrs = &hotblockingexprattrs; + } /* Collect simple attribute references */ for (i = 0; i < indexDesc->rd_index->indnatts; i++) @@ -5348,10 +5359,10 @@ restart: } /* Collect all attributes used in expressions, too */ - pull_varattnos(indexExpressions, 1, attrs); + pull_varattnos(indexExpressions, 1, exprattrs); /* Collect all attributes in the index predicate, too */ - pull_varattnos(indexPredicate, 1, attrs); + pull_varattnos(indexPredicate, 1, exprattrs); index_close(indexDesc, AccessShareLock); } @@ -5380,11 +5391,25 @@ restart: bms_free(pkindexattrs); bms_free(idindexattrs); bms_free(hotblockingattrs); + bms_free(hotblockingexprattrs); bms_free(summarizedattrs); + bms_free(summarizedexprattrs); goto restart; } + /* {expression-only columns} = {expression columns} - {direct columns} */ + hotblockingexprattrs = bms_del_members(hotblockingexprattrs, + hotblockingattrs); + /* {all columns} = {direct columns} + {expression-only columns} */ + hotblockingattrs = bms_add_members(hotblockingattrs, + hotblockingexprattrs); + + summarizedexprattrs = bms_del_members(summarizedexprattrs, + summarizedattrs); + summarizedattrs = bms_add_members(summarizedattrs, + summarizedexprattrs); + /* Don't leak the old values of these bitmaps, if any */ relation->rd_attrsvalid = false; bms_free(relation->rd_keyattr);