On Mon, Apr 4, 2022 at 4:51 AM Daniel Shelepanov <deniel1...@mail.ru> wrote:
> I’ve been working on this 
> [https://www.postgresql.org/message-id/flat/cfcca574-6967-c5ab-7dc3-2c82b6723b99%40mail.ru]
>  bug. Finally, I’ve come up with the patch you can find attached. Basically 
> what is does is rises a PROC_IN_VACUUM flag and resets it afterwards. I know 
> this seems kinda crunchy and I hope you guys will give me some hints on where 
> to continue. This 
> [https://www.postgresql.org/message-id/20220218175119.7hwv7ksamfjwijbx%40alap3.anarazel.de]
>  message contains reproduction script. Thank you very much in advance.

I noticed the CommitFest entry for this thread today and decided to
take a look. I think the general issue here can be stated in this way:
suppose a VACUUM computes an all-visible cutoff X, i.e. it thinks all
committed XIDs < X are all-visible. Then, at a later time, pg_visible
computes an all-visible cutoff Y, i.e. it thinks all committed XIDs <
Y are all-visible. If Y < X, pg_check_visible() might falsely report
corruption, because VACUUM might have marked as all-visible some page
containing tuples which pg_check_visibile() thinks aren't really
all-visible.

In reality, the oldest all-visible XID cannot move backward, but
ComputeXidHorizons() lets it move backward, because it's intended for
use by a caller who wants to mark pages all-visible, and it's only
concerned with making sure that the value is old enough to be safe.
And that's a problem for the way that pg_visibility is (mis-)using it.

To say that another way, ComputeXidHorizons() is perfectly fine with
returning a value that is older than the true answer, as long as it
never returns a value that is newer than the new answer. pg_visibility
wants the opposite. Here, a value that is newer than the true value
can't do worse than hide corruption, which is sort of OK, but a value
that's older than the true value can report corruption where none
exists, which is very bad.

I have a feeling, therefore, that this isn't really a complete fix. I
think it might address one way for the horizon reported by
ComputeXidHorizons() to move backward, but not all the ways.

Unfortunately, I am out of time for today to study this... but will
try to find more time on another day.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


Reply via email to