On Wed, May 5, 2021 at 10:57 AM Robert Haas <robertmh...@gmail.com> wrote: > One advantage of indirect indexes is that you can potentially avoid a > lot of writes to the index. If a non-HOT update is performed, but the > primary key is not updated, the index does not need to be touched. I > think that's a potentially significant savings, even if bottom-up > index deletion would have prevented the page splits. Similarly, you > can mark a dead line pointer unused without having to scan the > indirect index, because the index isn't pointing to that dead line > pointer anyway.
As I said, this is equivalent to solving the "TID is a stable identifier of logical row" issue (an exceptionally hard problem that I don't think is worth solving), except that you make the secondary indexes have potentially larger keys for no benefit. Sure, you can consistently refer to a logical row using its PK value (assuming you have this whole two-phase locking infrastructure), but why wouldn't you "just" solve the problem with TID directly instead? What does involving PK values actually buy you? I am pretty sure that the answer is "less than nothing". It is still true that I'm arguing against ever having a clustered index table AM, which would be somewhat useful to users (that much I'll own). The main reason for that is because we'd still be required to solve the "TID is a stable identifier of logical row" issue, except it's not a physiological TID/rowid (it's a fully logical row identifier). So everything seems to lead back to that hard problem seeming insoluble. > Hmm, but I guess you have another cleanup problem. What prevents > someone from inserting a new row with the same primary key as a > previously-deleted row but different values in some indirectly-indexed > column? Two-phase locking in indexes stops it. Note that this is pretty much what happens in Oracle -- it's not just SQL Server. This is why we have rich extensibility indexing -- indexes are strictly physical data structures in Postgres. > And, anyway, this whole argument is predicated on the fact that the > only table AM we have right now is heapam. If we had a table AM that > organized the data by primary key value, we'd still want to be able to > have secondary indexes, and they'd have to use the primary key value > as the TID. But Jeff has a design for the columnstore table AM where TIDs are essentially logical (not physiological like those of heapam), that nevertheless will work with the design around TIDs that I have in mind. "Logical identifiers" versus "Logical identifiers that stably identify logical rows" seems like a subtle but important distinction here. Of course I cannot yet rule out the possibility that this approach to TIDs will always be good enough. But it sure seems like it might be, and starting with the assumption that it is and working backwards seems like a good way to attack the problem as a practical matter. > > I think that global indexes are well worth having, and should be > > solved some completely different way. The partition key can be an > > additive thing. > > I agree that the partition identifier should be an additive thing, but > where would we add it? It seems to me that the obvious answer is to > make it a column of the index tuple. Right. > And if we can do that, why can't > we put whatever kind of TID-like stuff people want in the index tuple, > too? Maybe part of the problem here is that I don't actually > understand how posting lists are represented... You want to use the partition identifier for predicate push-down and stuff anyway, so making it part of the TID doesn't seem particularly natural to me. "Posting list splits" from the nbtree README will give you some idea of why I care about making TIDs integer-like and equi-sized within any given index tuple. There will be similar considerations for GIN. Though I think that nbtree deduplication is important enough on its own to try to preserve across table AMs. -- Peter Geoghegan