> On 21 Jul 2025, at 19:58, Andrey Borodin <x4...@yandex-team.ru> wrote:
> 
> I'm planning to prepare tests and fixes for all supported branches

This is a status update. I've reproduced the problem on REL_13_STABLE and 
verified that the proposed fix works there.

I've also discovered one more serious problem.
If a backend crashes just before WAL-logging a multi, any heap tuple that uses 
this multi will become unreadable. Any attempt to read it will hang forever.

I've reproduced the problem and am now working on scripting this scenario. 
Basically, I modify the code to hang forever after assigning multi number 2, 
then execute in the first psql:

create table x as select i,0 v from generate_series(1,10) i;
create unique index on x(i);

\set id 1
begin;
select * from x where i = :id for no key update;
savepoint s1;
update x set v = v+1 where i = :id; -- multi 1
commit;

\set id 2
begin;
select * from x where i = :id for no key update;
savepoint s1;
update x set v = v+1 where i = :id; -- multi 2 -- will hang
commit;

Then in the second psql:

create table y as select i,0 v from generate_series(1,10) i;
create unique index on y(i);

\set id 1
begin;
select * from y where i = :id for no key update;
savepoint s1;
update y set v = v+1 where i = :id;
commit;

After this I run pkill -9 postgres. The recovered installation cannot execute 
select * from x; because multi 1 cannot be read without recovering multi 2, 
which was never logged.
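
To make the dependency concrete, here is a toy model of the read side. It is 
only a sketch, not PostgreSQL code: the function name, the dict-based offsets 
array and the assumption that every multi has two members are mine. It relies 
on the usual scheme where the member count of a multi is derived from the 
start offset of the next multi, and a zero offset means "not written yet".

```python
# Toy model only; names and numbers are illustrative, not PostgreSQL's.
# offsets[m] is where the members of multi m start; 0 means "not written yet".

def member_count(offsets, next_multi, next_offset, multi):
    """Return the member count of `multi`, or None if a reader would have to wait."""
    start = offsets.get(multi, 0)
    if start == 0:
        return None
    if multi + 1 == next_multi:
        # The latest multi ends at the next free member offset.
        end = next_offset
    else:
        end = offsets.get(multi + 1, 0)
        if end == 0:
            # A real reader would sleep and retry here. After a crash
            # nobody will ever fill this slot, so it waits forever.
            return None
    return end - start

# State after crash recovery in the scenario above: multi 1 and multi 3
# were WAL-logged, multi 2 was assigned but never logged.
recovered = {1: 1, 3: 5}
blocked = member_count(recovered, next_multi=4, next_offset=7, multi=1)
readable = member_count(recovered, next_multi=4, next_offset=7, multi=3)
```

In this model multi 3 (from table y) is readable, while multi 1 (from table x) 
is not, because its end is defined by the missing offset of multi 2.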


Luckily, the fix is the same: just restore the offset of multi 2 when multi 1 
is recovered.
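
A rough sketch of that idea, again as a toy model and not the actual patch: 
the function names, the restore_next flag and the two-member multis are 
assumptions for illustration. When the record for a multi is replayed, the 
start offset of the following multi is filled in as well, since it is known 
from the current offset plus the member count.

```python
# Toy model only; not the actual patch. offsets[m] == 0 (or missing)
# means the start offset of multi m was never written.

def replay_create(offsets, multi, start, nmembers, restore_next):
    """Replay a 'multi created' record into the offsets map."""
    offsets[multi] = start
    if restore_next:
        # The next multi must start right after this one's members.
        offsets[multi + 1] = start + nmembers

def can_count_members(offsets, multi):
    """A reader needs both this multi's offset and the next one's."""
    return offsets.get(multi, 0) != 0 and offsets.get(multi + 1, 0) != 0

def recover(restore_next):
    """Replay the WAL of the scenario: records exist for multi 1 and 3 only."""
    offsets = {}
    replay_create(offsets, 1, 1, 2, restore_next)
    replay_create(offsets, 3, 5, 2, restore_next)
    return offsets

without_fix = recover(restore_next=False)
with_fix = recover(restore_next=True)
```

Without the extra assignment multi 1 stays unreadable after recovery. With it, 
the offset of multi 2 is restored even though multi 2 itself was never logged.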


Best regards, Andrey Borodin.
