Hi,

On 2020-04-01 13:27:56 -0400, Robert Haas wrote:
> Perhaps "irresponsible" is the wrong word, but it's certainly caused
> problems for multiple EnterpriseDB customers, and in my view, those
> problems weren't necessary. Either a WARNING or an ERROR would have
> shown up in the log, but an ERROR terminates VACUUM for that table and
> thus basically causes autovacuum to be completely broken. That is a
> really big problem. Perhaps you will want to argue, as Andres did,
> that the value of having ERROR rather than WARNING in the log
> justifies that outcome, but I sure don't agree.

If that had been a really viable option, I would have done so. At the
very least in the back branches, but quite possibly also in master. Or
if somebody had brought them up as an issue at the time.

What is heap_prepare_freeze_tuple/FreezeMultiXactId supposed to do after
issuing a WARNING in these cases. Without the ERROR, e.g.,
                        if (!TransactionIdDidCommit(xid))
                                ereport(ERROR,
                                                
(errcode(ERRCODE_DATA_CORRUPTED),
                                                 errmsg_internal("uncommitted 
xmin %u from before xid cutoff %u needs to be frozen",
                                                                                
 xid, cutoff_xid)));
would make a deleted tuple visible.


                if (TransactionIdPrecedes(xid, relfrozenxid))
                        ereport(ERROR,
                                        (errcode(ERRCODE_DATA_CORRUPTED),
                                         errmsg_internal("found xmin %u from 
before relfrozenxid %u",
                                                                         xid, 
relfrozenxid)));
would go on replace xmin of a potentially uncommitted tuple with
relfrozenxid, making it appear visible.


                if (TransactionIdPrecedes(xid, relfrozenxid))
                        ereport(ERROR,
                                        (errcode(ERRCODE_DATA_CORRUPTED),
                                         errmsg_internal("found xmax %u from 
before relfrozenxid %u",
                                                                         xid, 
relfrozenxid)));
would replace the xmax indicating a potentially deleted tuple with ?, either
making the tuple become, potentially wrongly, visible/invisible

or
        else if (MultiXactIdPrecedes(multi, relminmxid))
                ereport(ERROR,
                                (errcode(ERRCODE_DATA_CORRUPTED),
                                 errmsg_internal("found multixact %u from 
before relminmxid %u",
                                                                 multi, 
relminmxid)));
or ...


Just continuing is easier said than done. Especially with the background
of knowing that several users had hit the bug that allowed all of the
above to be hit, and that advancing relfrozenxid further would make it
worse.

Greetings,

Andres Freund


Reply via email to