Re: IPC/MultixactCreation on the Standby server

2025-07-21 Thread Andrey Borodin
> On 18 Jul 2025, at 18:53, Andrey Borodin wrote: > > Please find attached dirty test and a sketch of the fix. It is done against > PG 16, I wanted to ensure that problem is reproducible before 17. Here's v7 with improved comments and cross-check for correctness. Also, MultiXact wraparound is

Re: IPC/MultixactCreation on the Standby server

2025-07-18 Thread Andrey Borodin
> On 18 Jul 2025, at 16:53, Álvaro Herrera wrote: > > Hello, > > Andrey and I discussed this on IM, and after some back and forth, he > came up with a brilliant idea: modify the WAL record for multixact > creation, so that the offset of the next multixact is transmitted and > can be replayed.

Re: IPC/MultixactCreation on the Standby server

2025-07-18 Thread Álvaro Herrera
Hello, Andrey and I discussed this on IM, and after some back and forth, he came up with a brilliant idea: modify the WAL record for multixact creation, so that the offset of the next multixact is transmitted and can be replayed. (We know it when we create each multixact, because the number of me
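The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the struct and its `next_moff` field are hypothetical names, not taken from the actual patch, and the sketch assumes the next multixact's offset is simply the current offset plus the member count, ignoring any special handling of offset wraparound.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t MultiXactId;
typedef uint32_t MultiXactOffset;

/* Hypothetical, simplified shape of a multixact-creation record. */
typedef struct mxact_create_rec
{
    MultiXactId     mid;        /* the multixact being created */
    MultiXactOffset moff;       /* where its members start */
    uint32_t        nmembers;   /* how many members it has */
    MultiXactOffset next_moff;  /* assumed extra field: offset of the next multixact */
} mxact_create_rec;

/*
 * Build the record on the creating side. Because the member count is
 * known at creation time, the offset of the next multixact can be
 * computed right away and shipped along for replay.
 */
static mxact_create_rec
make_create_rec(MultiXactId mid, MultiXactOffset moff, uint32_t nmembers)
{
    mxact_create_rec rec;

    assert(nmembers > 0);       /* a multixact has at least one member */

    rec.mid = mid;
    rec.moff = moff;
    rec.nmembers = nmembers;
    rec.next_moff = moff + nmembers;    /* unsigned arithmetic wraps modulo 2^32 */
    return rec;
}
```

With such a record, a replaying side would not have to wait for the following multixact to learn where this one's members end; whether the real patch lays the record out this way is not shown in the snippet.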

Re: IPC/MultixactCreation on the Standby server

2025-07-18 Thread Álvaro Herrera
On 2025-Jul-17, Andrey Borodin wrote: > Thinking more about the problem I see 3 ways to deal with this deadlock: > 1. We check for recovery conflict even in presence of > InterruptHoldoffCount. That's what patch v4 does. > 2. Teach page_collect_tuples() to do HeapTupleSatisfiesVisibility() > witho

Re: IPC/MultixactCreation on the Standby server

2025-07-17 Thread Andrey Borodin
> On 30 Jun 2025, at 15:58, Andrey Borodin wrote: > > page_collect_tuples() holds a lock on the buffer while examining tuples > visibility, having InterruptHoldoffCount > 0. Tuple visibility check might > need WAL to go on, we have to wait until some next MX be filled in. > Which might need

Re: IPC/MultixactCreation on the Standby server

2025-06-30 Thread Andrey Borodin
> On 28 Jun 2025, at 21:24, Andrey Borodin wrote: > > This seems to be fixing issue for me. ISTM I was wrong: there is a possible recovery conflict with snapshot. REDO: frame #2: 0x00010179a0c8 postgres`pg_usleep(microsec=100) at pgsleep.c:50:10 frame #3: 0x00010144c108 post

Re: IPC/MultixactCreation on the Standby server

2025-06-28 Thread Andrey Borodin
> On 28 Jun 2025, at 00:37, Andrey Borodin wrote: > > Indeed. After some experiments I could get unstable repro on my machine. I've added some logging and that's what I've found: 2025-06-28 23:03:40.598 +05 [40887] 006_MultiXact_standby.pl WARNING: Timed out: nextMXact 415832 tmpMXact 41582

Re: IPC/MultixactCreation on the Standby server

2025-06-27 Thread Andrey Borodin
> On 27 Jun 2025, at 11:41, Dmitry wrote: > > It seems that the hypothesis has not been confirmed. Indeed. For some reason your reproduction does not work for me. I tried to create a test from your workload description. PFA patch with a very dirty prototype. To run the test, you can run: cd con

Re: IPC/MultixactCreation on the Standby server

2025-06-26 Thread Dmitry
On 26.06.2025 19:24, Andrey Borodin wrote: If my hypothesis is correct nextMXact will precede tmpMXact. It seems that the hypothesis has not been confirmed. Attempt #1 2025-06-26 23:47:24.821 MSK [220458] WARNING:  Timed out: nextMXact 24138381 tmpMXact 24138379 2025-06-26 23:47:24.822 MSK [2

Re: IPC/MultixactCreation on the Standby server

2025-06-26 Thread Andrey Borodin
> On 26 Jun 2025, at 17:59, Andrey Borodin wrote: > > hypothesis Dmitry, can you please retry your reproduction with attached patch? It must print nextMXact and tmpMXact. If my hypothesis is correct nextMXact will precede tmpMXact. Best regards, Andrey Borodin. v2-0001-Make-next-multixac
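For readers unfamiliar with what "precede" means for these counters: multixact IDs are 32-bit values that wrap around, so ordering is normally decided by the sign of the 32-bit difference rather than by a plain `<`. A minimal sketch of that comparison follows; `mxid_precedes` is a hypothetical name used for illustration, not the function from multixact.c.

```c
#include <assert.h>     /* only needed for the usage example below */
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t MultiXactId;

/*
 * Wraparound-aware ordering: a precedes b when the 32-bit difference,
 * reinterpreted as signed, is negative. The unsigned-to-signed
 * conversion relies on two's-complement behaviour, which common
 * compilers provide.
 */
static bool
mxid_precedes(MultiXactId a, MultiXactId b)
{
    int32_t diff = (int32_t) (a - b);

    return diff < 0;
}
```

For example, `assert(mxid_precedes(4294967295u, 1u))` holds because the counter has wrapped, while for the pair reported in Dmitry's follow-up (nextMXact 24138381, tmpMXact 24138379) `mxid_precedes(24138381u, 24138379u)` is false, which matches his remark that the hypothesis was not confirmed.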

Re: IPC/MultixactCreation on the Standby server

2025-06-26 Thread Andrey Borodin
> On 26 Jun 2025, at 14:33, Dmitry wrote: > > On 25.06.2025 16:44, Dmitry wrote: >> I will definitely try to reproduce the problem with your patch. > Hi Andrey! > > I checked with the patch, unfortunately the problem is also reproducible. > Client processes wake up after a second and try to g

Re: IPC/MultixactCreation on the Standby server

2025-06-26 Thread Dmitry
On 25.06.2025 16:44, Dmitry wrote: I will definitely try to reproduce the problem with your patch. Hi Andrey! I checked with the patch, unfortunately the problem is also reproducible. Client processes wake up after a second and try to get information about the members of the multixact again,

Re: IPC/MultixactCreation on the Standby server

2025-06-25 Thread Dmitry
On 25.06.2025 12:34, Andrey Borodin wrote: On 25 Jun 2025, at 11:11, Dmitry wrote: #6 GetMultiXactIdMembers (multi=45559845, members=0x7ffdaedc84b0, from_pgupgrade=<optimized out>, isLockOnly=<optimized out>) at /usr/src/postgresql-17-17.5-1.pgdg24.04+1/build/../src/backend/access/transam/multixact.c:1483

Re: IPC/MultixactCreation on the Standby server

2025-06-25 Thread Andrey Borodin
> On 25 Jun 2025, at 11:11, Dmitry wrote: > > #6 GetMultiXactIdMembers (multi=45559845, members=0x7ffdaedc84b0, > from_pgupgrade=<optimized out>, isLockOnly=<optimized out>) > at > /usr/src/postgresql-17-17.5-1.pgdg24.04+1/build/../src/backend/access/transam/multixact.c:1483 Hi Dmitry! This looks to be rela