On Wed, Mar 28, 2018 at 2:12 PM, Andres Freund <and...@anarazel.de> wrote: > How will it break it? They'll see an invalid ctid and conclude that the > tuple is dead? Without any changes that's already something that can > happen if a later tuple in the chain has been pruned away. Sure, that > code won't realize it should error out because the tuple is now in a > different partition, but neither would a infomask bit. > > I think my big problem is that I just don't see what the worst that can > happen is. We'd potentially see a broken ctid chain, something that very > commonly happens, and consider the tuple to be invisible. That seems > pretty sane behaviour for unadapted code, and not any worse than other > potential solutions.
This is more or less my feeling as well. I think it's better to conserve our limited supply of infomask bits as much as we can, and I do think that we should try to reclaimed HEAP_MOVED_IN and HEAP_MOVED_OFF in the future instead of defining the combination of the two of them to mean something now. The only scenario in which I can see this patch really leading to disaster is if there's some previous release out there where the bit pattern chosen by this patch has some other, incompatible meaning. As far as we know, that's not the case: this bit pattern was previously unused. Code seeing that bit pattern could potentially therefore (1) barf on the valid CTID, but the whole point of this is to throw an ERROR anyway, so if that happens then we're getting basically the right behavior with the wrong error message or (2) just treat it as a broken CTID link, in which case the result should be pretty much the same as if this patch hadn't been committed in the first place. Where the multixact patch really caused us a lot of trouble is that the implications weren't just for the heap itself -- the relevant SLRUs became subject to new retention requirements which in turn affected vacuum, autovacuum, and checkpoint behavior. There is no similar problem here -- the flag indicating the problematic situation, however it ends up being stored, doesn't point to any external data. Now, that doesn't mean there can't be some other kind of problem with this patch, but I don't think that we should block the patch on the theory that it might have an undiscovered problem that destroys the entire PostgreSQL ecosystem with no theory as to what that problem might actually be. Modulo implementation quality, I think the risk level of this patch is somewhat but not vastly higher than 37484ad2aacef5ec794f4dd3d5cf814475180a78, which similarly defined a previously-unused bit pattern in the tuple header. The reason I think this one might be somewhat riskier is because AFAICS it's not so easy to make sure we've found all the code, even in core, that might care, as it was in that case; and also because updates happen more than freezing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company