> On Jul 30, 2020, at 1:47 PM, Andres Freund <and...@anarazel.de> wrote:
>
> Hi,
>
> On 2020-07-30 13:18:01 -0700, Mark Dilger wrote:
>> Per tuple, tuple_is_visible() potentially checks whether the xmin or xmax
>> committed via TransactionIdDidCommit. I am worried about concurrent
>> truncation of clog entries causing I/O errors on SLRU lookup when performing
>> that check. The three strategies I had for dealing with that were taking
>> the XactTruncationLock (formerly known as CLogTruncationLock, for those
>> reading this thread from the beginning), locking out vacuum, and the idea
>> upthread from Andres about setting PROC_IN_VACUUM and such. Maybe I'm being
>> dense and don't need to worry about this. But I haven't convinced myself of
>> that, yet.
>
> I think it's not at all ok to look in the procarray or clog for xids
> that are older than what you're announcing you may read. IOW I don't
> think it's OK to just ignore the problem, or try to work around it by
> holding XactTruncationLock.
The current state of the patch is that concurrent vacuums are kept out of the
table being checked by means of taking a ShareUpdateExclusive lock on the table
being checked. In response to Robert's review, I was contemplating whether
that was necessary, but you raise the interesting question of whether it is
even sufficient. The logic in verify_heapam is currently relying on the
ShareUpdateExclusive lock to prevent any of the xids in the range
relfrozenxid..nextFullXid from being invalid arguments to
TransactionIdDidCommit. Ignoring whether that is a good choice vis-a-vis
performance, is that even a valid strategy? It sounds like you are saying it
is not.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company