> On 21 Jul 2025, at 19:58, Andrey Borodin <x4...@yandex-team.ru> wrote:
>
> I'm planning to prepare tests and fixes for all supported branches
This is a status update. I've reproduced the problem on REL_13_STABLE and
verified that the proposed fix works there.
Also, I've discovered one more serious problem.
If a backend crashes just before WAL-logging a multi, any heap tuple that uses
this multi will become unreadable. Any attempt to read it will hang forever.
I've reproduced the problem and am now working on scripting this scenario.
Basically, I modify the code to hang forever after assigning multi number 2.
Then I execute in the first psql:
create table x as select i,0 v from generate_series(1,10) i;
create unique index on x(i);
\set id 1
begin;
select * from x where i = :id for no key update;
savepoint s1;
update x set v = v+1 where i = :id; -- multi 1
commit;
\set id 2
begin;
select * from x where i = :id for no key update;
savepoint s1;
update x set v = v+1 where i = :id; -- multi 2 -- will hang
commit;
Then in the second psql:
create table y as select i,0 v from generate_series(1,10) i;
create unique index on y(i);
\set id 1
begin;
select * from y where i = :id for no key update;
savepoint s1;
update y set v = v+1 where i = :id;
commit;
After this I run pkill -9 postgres. The recovered installation cannot execute
select * from x; because multi 1 cannot be read without recovering multi 2,
which was never logged.
Luckily, the fix is the same: just restore the offset of multi 2 when multi 1
is recovered.
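
To illustrate why reading multi 1 depends on multi 2, here is a toy model in
Python. It is only a sketch, not PostgreSQL code: a dict stands in for the
multixact offsets area, the function names, offsets and member counts are made
up, and it ignores the case where the multi being read is the newest one. It
assumes the length of a multi is derived from the offset of the next multi,
and that the multi created in the second psql is number 3 and was logged.

```python
# Toy model only: a dict stands in for the multixact offsets area, and a
# missing entry means "offset never written". All names are hypothetical.

def replay(records, restore_next_offset):
    """Rebuild the offsets map from logged (multi, offset, nmembers) records."""
    offsets = {}
    for multi, offset, nmembers in records:
        offsets[multi] = offset
        if restore_next_offset:
            # The fix: while recovering multi N, also restore the offset of N+1.
            offsets.setdefault(multi + 1, offset + nmembers)
    return offsets


def member_count(offsets, multi, max_attempts=3):
    """Length of `multi` = offset of the next multi minus its own offset."""
    for _ in range(max_attempts):
        if multi + 1 in offsets:
            return offsets[multi + 1] - offsets[multi]
        # A real reader would sleep and retry here, waiting for the writer.
    return None  # stands in for "hangs forever": nobody will ever write it


# Multi 2 was assigned (2 members at offset 3) but never WAL-logged,
# so recovery only sees the records for multi 1 and multi 3.
logged = [(1, 1, 2), (3, 5, 2)]

print(member_count(replay(logged, restore_next_offset=False), 1))  # None
print(member_count(replay(logged, restore_next_offset=True), 1))   # 2
```

Without the fix, the reader of multi 1 never finds the offset of multi 2 (the
model gives up after a few attempts, where a real reader would keep waiting).
With the fix, replaying multi 1 is enough to make it readable.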
Best regards, Andrey Borodin.