On Tue, Jul 17, 2018 at 10:42 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> If we knew that we were never going to do deletes/non-HOT updates from
> the table we could continue to use the existing mechanism, otherwise
> we will be better off to use sorted index entries. However, it does
> appear as if keeping entries in sorted order would be a win on
> concurrency from reduced block contention on the first few blocks of
> the index key, so it may also be a win in cases where there are heavy
> concurrent inserts but no deletes.

I think so too. I also expect a big reduction in the number of FPIs in
the event of many duplicates.

> I hope we can see a patch that just adds the sorting-by-TID property
> so we can evaluate that aspect before we try to add other more
> advanced index ideas.

I can certainly see why that's desirable. Unfortunately, it isn't that
simple. If I want to sort on heap TID as a tie-breaker, I cannot cut
any corners. That is, it has to be just another column, at least as far
as the implementation is concerned (heap TID will have a different
representation in internal pages and leaf high keys, but nonetheless
it's essentially just another column in the keyspace). This means that
if I don't have suffix truncation, I'll regress performance in many
important cases that have no compensating benefit (e.g. pgbench). There
is no point in trying to assess that.

It is true that I could opt to only "logically truncate" the heap TID
attribute during a leaf page split (i.e. there'd only be "logical
truncation", which is to say there'd only be the avoidance of adding a
heap TID to the new high key, and never true physical truncation of
user attributes). But doing only that much saves very little code,
since the logic for assessing whether or not it's safe to avoid adding
a new heap attribute (whether or not we logically truncate) still has
to involve an insertion scankey. It seems much more natural to do
everything at once. Again, the heap TID attribute is more or less just
another attribute. Also, the code for doing physical suffix truncation
already exists from the covering/INCLUDE index commit.

I'm currently improving the logic for picking a page split in light of
suffix truncation, which I've been working on for weeks now. I had
something that did quite well with the initial index sizes for TPC-C
and TPC-H, but then realized I'd totally regressed the motivating
example with many duplicates that I started this thread with.
I now have something that does both things well, which I'm trying to
simplify.

Another thing to bear in mind is that page split logic for suffix
truncation also helps space utilization on the leaf level. I can get
the main TPC-C order_line pkey about 7% smaller with true suffix
truncation, even though the internal page index tuples can never be any
smaller due to alignment, and even though there are no duplicates that
would otherwise make the implementation "get tired". Can I really fix
space utilization in a piecemeal fashion? I strongly doubt it.

--
Peter Geoghegan
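
For readers following the thread, the idea of treating heap TID as "just another column" and then truncating suffixes at split time can be modelled in a few lines of Python. This is only a toy sketch, not nbtree code: the names `separator` and `choose_split`, the `slack` parameter, and the convention that a separator satisfies lastleft < separator <= firstright are all assumptions made for illustration. Index tuples are plain Python tuples whose last element is a (block, offset) heap TID, and a truncated attribute is modelled by a shorter tuple, which Python already sorts before any longer tuple sharing its prefix.

```python
# Toy model: an index tuple is (user attribute..., heap TID), where the
# heap TID is a (block, offset) pair acting as the final key column.

def separator(lastleft, firstright):
    """Return the shortest prefix of firstright that sorts after lastleft.

    Hypothetical helper. Attributes dropped from the prefix behave like
    "minus infinity" because a shorter tuple sorts before a longer tuple
    that starts with the same values.
    """
    assert lastleft < firstright, "tuples must be distinct and in order"
    for natts in range(1, len(firstright) + 1):
        prefix = firstright[:natts]
        if lastleft < prefix:
            return prefix
    raise AssertionError("unreachable: the full tuple always qualifies")


def choose_split(page, slack=2):
    """Pick a split index near the middle that gives the shortest separator.

    Hypothetical heuristic for illustration only: candidates lie within
    `slack` positions of the midpoint, and ties go to the most even split.
    """
    mid = len(page) // 2
    low = max(1, mid - slack)
    high = min(len(page) - 1, mid + slack)
    return min(
        range(low, high + 1),
        key=lambda i: (len(separator(page[i - 1], page[i])), abs(i - mid)),
    )


# Duplicates of key 1 and key 2; sorting the whole tuple orders duplicates
# by heap TID, so every entry has exactly one place in the keyspace.
page = sorted([
    (2, (0, 7)), (1, (0, 3)), (1, (0, 1)), (2, (0, 6)),
    (1, (0, 5)), (1, (0, 2)), (2, (0, 8)), (1, (0, 4)),
])
split_at = choose_split(page)
new_high_key = separator(page[split_at - 1], page[split_at])
```

In this model the split lands on the boundary between the run of 1s and the run of 2s, so the new high key needs only the user attribute. A split inside a run of duplicates would have to keep the heap TID in the separator as well, which is the case a TID-only patch without suffix truncation would hit on every split.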