On Mon, Apr 22, 2019 at 8:36 AM Stephen Frost <sfr...@snowman.net> wrote:
> This seems like it would be helpful for global indexes as well, wouldn't
> it?

Yes, though that should probably work by reusing what we already do
with heap TID (use standard IndexTuple fields on the leaf level for
heap TID), plus an additional identifier for the partition number that
is located at the physical end of the tuple. IOW, I think that this
might benefit from a design that is halfway between what we already
do with heap TIDs and what we would be required to do to make varwidth
logical row identifiers in tables work -- the partition number is
varwidth, though often only a single byte.

> I agree with trying to avoid having padding 'in the wrong place' and if
> it makes some indexes smaller, great, even if they're unlikely to be
> interesting in the vast majority of cases, they may still exist out
> there. Of course, this is provided that it doesn't overly complicate
> the code, but it sounds like it wouldn't be too bad in this case.

Here is what it took:

* Removed the "conservative" MAXALIGN() within index_form_tuple(),
bringing it in line with heap_form_tuple(), which only MAXALIGN()s so
that the first attribute in the tuple's data area can safely be
accessed on alignment-picky platforms, but doesn't do the same with
data_len.

* Removed most of the MAXALIGN()s from nbtinsert.c, except one that
considers if a page split is required.

* Didn't change the nbtsplitloc.c code, because we need to assume
MAXALIGN()'d space quantities there. We continue to not trust the
reported tuple length to be MAXALIGN()'d, which is now essential
rather than just defensive.

* Removed MAXALIGN()s within _bt_truncate(), and SHORTALIGN()'d the
whole tuple size in the case where the new pivot tuple requires a heap
TID representation. We access TIDs as three 2-byte integers, so this is
necessary for alignment-picky platforms.

I will pursue this as a project for PostgreSQL 13. It doesn't affect
on-disk compatibility, because BTreeTupleGetHeapTID() works just as
well with either the existing scheme, or this new one.

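To make the size arithmetic in the _bt_truncate() item concrete, here is
a minimal standalone sketch. It is not PostgreSQL code: every sketch_*
name is hypothetical, the alignment constants (2 and 8) and the 6-byte
TID size are assumed values for a typical 64-bit platform, and the
"old" formula is only a rough approximation of the existing behavior.
It also assumes that the heap TID in a pivot tuple is located by
working back from the end of the tuple, which would explain why either
scheme can be read the same way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; the real ones are platform-dependent. */
#define SKETCH_ALIGNOF_SHORT   2
#define SKETCH_MAXIMUM_ALIGNOF 8
#define SKETCH_TID_SIZE        6	/* heap TID: three 2-byte integers */

/* Round len up to the next multiple of alignval (must be a power of two). */
static size_t
sketch_typealign(size_t alignval, size_t len)
{
	assert(alignval != 0 && (alignval & (alignval - 1)) == 0);
	return (len + (alignval - 1)) & ~(alignval - 1);
}

/*
 * Roughly the existing scheme: the truncated tuple is MAXALIGN()'d, and
 * a MAXALIGN()'d slot is appended to hold the heap TID.
 */
static size_t
sketch_pivot_size_maxaligned(size_t truncated_len)
{
	return sketch_typealign(SKETCH_MAXIMUM_ALIGNOF, truncated_len) +
		sketch_typealign(SKETCH_MAXIMUM_ALIGNOF, SKETCH_TID_SIZE);
}

/*
 * Roughly the proposed scheme: only SHORTALIGN() the truncated tuple, so
 * that the TID appended at the physical end starts on a 2-byte boundary.
 */
static size_t
sketch_pivot_size_shortaligned(size_t truncated_len)
{
	return sketch_typealign(SKETCH_ALIGNOF_SHORT, truncated_len) +
		SKETCH_TID_SIZE;
}

/* Offset of the heap TID, computed relative to the end of the tuple. */
static size_t
sketch_heap_tid_offset(size_t tuple_size)
{
	assert(tuple_size >= SKETCH_TID_SIZE);
	return tuple_size - SKETCH_TID_SIZE;
}
```

Under these assumptions, a truncated tuple of 13 bytes needs 24 bytes
with the MAXALIGN()'d layout and 20 bytes with the SHORTALIGN()'d one,
and in both cases the TID offset computed from the end of the tuple is
a multiple of 2.
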
Having the "real" tuple length available will make it easier to
implement "true" suffix truncation, where we truncate *within* a text
attribute (i.e. generate a new, shorter value using new opclass
infrastructure).

--
Peter Geoghegan