On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fu...@oss.nttdata.com> wrote: > > > > On 2020/03/04 14:31, Masahiko Sawada wrote: > > On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fu...@oss.nttdata.com> > > wrote: > >> > >> > >> > >> On 2020/03/04 13:27, Michael Paquier wrote: > >>> On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote: > >>>> Yeah, so 0001 patch sets existing wait events to recovery conflict > >>>> resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION) > >>>> to the recovery conflict on a snapshot. 0003 patch improves these wait > >>>> events by adding the new type of wait event such as > >>>> WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch > >>>> is the fix for existing versions and 0003 patch is an improvement for > >>>> only PG13. Did you mean even 0001 patch doesn't fit for back-patching? > >> > >> Yes, it looks like a improvement rather than bug fix. > >> > > > > Okay, understand. > > > >>> I got my eyes on this patch set. The full patch set is in my opinion > >>> just a set of improvements, and not bug fixes, so I would refrain from > >>> back-backpatching. > >> > >> I think that the issue (i.e., "waiting" is reported twice needlessly > >> in PS display) that 0002 patch tries to fix is a bug. So it should be > >> fixed even in the back branches. > > > > So we need only two patches: one fixes process title issue and another > > improve wait event. I've attached updated patches. > > I started reading > v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch. > > - ProcWaitForSignal(PG_WAIT_BUFFER_PIN); > + ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN); > > Currently the wait event indicating the wait for buffer pin has already > been reported. But the above change in the patch changes the name of > wait event for buffer pin only in the startup process. Is this really useful? > Isn't the existing wait event for buffer pin enough? > > - /* Wait to be signaled by the release of the Relation Lock */ > - ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type); > + /* Wait to be signaled by the release of the Relation Lock */ > + ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK); > > Same as above. Isn't the existing wait event enough?
Yeah, we can use the existing wait events for buffer pin and lock. > > - /* > - * Progressively increase the sleep times, but not to more than 1s, > since > - * pg_usleep isn't interruptible on some platforms. > - */ > - standbyWait_us *= 2; > - if (standbyWait_us > 1000000) > - standbyWait_us = 1000000; > + WaitLatch(MyLatch, > + WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT, > + STANDBY_WAIT_MS, > + wait_event_info); > + ResetLatch(MyLatch); > > ResetLatch() should be called before WaitLatch()? Fixed. > > Could you tell me why you dropped the "increase-sleep-times" logic? I thought we can remove it because WaitLatch is interruptible but my observation was not correct. The waiting startup process is not necessarily woken up by signal. I think it's still better to not wait more than 1 sec even if it's an interruptible wait. Attached patch fixes the above and introduces only two wait events of conflict resolution: snapshot and tablespace. I also removed the wait event of conflict resolution of database since it's unlikely to become a user-visible and a long sleep as we discussed before. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
recovery_conflict_wait_event_v3.patch
Description: Binary data