On Thu, Sep 16, 2021 at 4:19 AM Bossart, Nathan <bossa...@amazon.com> wrote: > > On 9/15/21, 4:47 AM, "Amul Sul" <sula...@gmail.com> wrote: > > On Wed, Sep 15, 2021 at 12:52 AM Bossart, Nathan <bossa...@amazon.com> > > wrote: > >> It looks like ebdf5bf intentionally made sure that we hold > >> ControlFileLock while updating SharedRecoveryInProgress (now > >> SharedRecoveryState after 4e87c48). The thread for this change [0] > >> has some additional details. > >> > > > > Yeah, I saw that and ebdf5bf main intention was to minimize the gap > > between both of them which was quite big previously. The comments > > added by the same commit also describe the case that backends can > > write WAL and the control file is still referring not in > > DB_IN_PRODUCTION and IIUC, which seems to be acceptable. > > Then the question is what would be wrong if a process can see an > > inconsistent shared memory view for a small window? Might be > > wait-promoting might behave unexpectedly, that I have to test. > > For your proposed change, I would either leave out this particular > call site or add a "WithLock" version of the function. > > void > SetControlFileDBState(DBState state) > { > LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); > SetControlFileDBStateWithLock(state); > LWLockRelease(ControlFileLock); > } > > void > SetControlFileDBStateWithLock(DBState state) > { > Assert(LWLockHeldByMeInMode(ControlFileLock, LW_EXCLUSIVE)); > > ControlFile->state = state; > ControlFile->time = (pg_time_t) time(NULL); > UpdateControlFile(); > } >
+1, since skipping ControlFileLock for the DBState update is not the right thing, let's have two different functions as per your suggestion -- did the same in the attached version, thanks. > >> As far as the patch goes, I'm not sure why SetControlFileDBState() > >> needs to be exported, and TBH I don't know if this change is really a > >> worthwhile improvement. ISTM the main benefit is that it could help > >> avoid cases where we update the state but not the time. However, > >> there's still nothing preventing that, and I don't get the idea that > >> it was really a big problem to begin with. > >> > > > > Oh ok, I was working on a different patch[1] where I want to call this > > function from checkpointer, but I agree exporting function is not in > > the scope of this patch. > > Ah, I was missing this context. Perhaps this should be included in > the patch set for the other thread, especially if it will need to be > exported. > Ok, reverted those changes in the attached version. I have one additional concern about the way we update the control file, at every place where doing the update, we need to set control file update time explicitly, why can't the time update line be moved to UpdateControlFile() so that time gets automatically updated? Regards, Amul
From 4aa18feb38ec7a77ebc8215ea4207071ce81d9d5 Mon Sep 17 00:00:00 2001 From: Amul Sul <amul.sul@enterprisedb.com> Date: Mon, 20 Sep 2021 01:50:59 -0400 Subject: [PATCH v2] Deduplicate code updating ControleFile's DBState only. --- src/backend/access/transam/xlog.c | 48 ++++++++++++++++++++----------- 1 file changed, 31 insertions(+), 17 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index e51a7a749da..df1202e07da 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -931,6 +931,8 @@ static bool rescanLatestTimeLine(void); static void InitControlFile(uint64 sysidentifier); static void WriteControlFile(void); static void ReadControlFile(void); +static void SetControlFileDBStateWithLock(DBState state); +static void SetControlFileDBState(DBState state); static char *str_time(pg_time_t tnow); static void SetPromoteIsTriggered(void); static bool CheckForStandbyTrigger(void); @@ -4951,6 +4953,31 @@ UpdateControlFile(void) update_controlfile(DataDir, ControlFile, true); } +/* + * Useful to set only ControlFile's database state when a caller has already + * acquired exclusive ControlFileLock. + */ +static void +SetControlFileDBStateWithLock(DBState state) +{ + Assert(LWLockHeldByMeInMode(ControlFileLock, LW_EXCLUSIVE)); + + ControlFile->state = state; + ControlFile->time = (pg_time_t) time(NULL); + UpdateControlFile(); +} + +/* + * Useful to set only ControlFile's database state. + */ +static void +SetControlFileDBState(DBState state) +{ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); + SetControlFileDBStateWithLock(state); + LWLockRelease(ControlFileLock); +} + /* * Returns the unique system identifier from control file. */ @@ -8085,14 +8112,12 @@ StartupXLOG(void) * updates to shared memory. */ LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - ControlFile->state = DB_IN_PRODUCTION; - ControlFile->time = (pg_time_t) time(NULL); + SetControlFileDBStateWithLock(DB_IN_PRODUCTION); SpinLockAcquire(&XLogCtl->info_lck); XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE; SpinLockRelease(&XLogCtl->info_lck); - UpdateControlFile(); LWLockRelease(ControlFileLock); /* @@ -8952,13 +8977,7 @@ CreateCheckPoint(int flags) START_CRIT_SECTION(); if (shutdown) - { - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - ControlFile->state = DB_SHUTDOWNING; - ControlFile->time = (pg_time_t) time(NULL); - UpdateControlFile(); - LWLockRelease(ControlFileLock); - } + SetControlFileDBState(DB_SHUTDOWNING); /* * Let smgr prepare for checkpoint; this has to happen before we determine @@ -9507,13 +9526,8 @@ CreateRestartPoint(int flags) UpdateMinRecoveryPoint(InvalidXLogRecPtr, true); if (flags & CHECKPOINT_IS_SHUTDOWN) - { - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - ControlFile->time = (pg_time_t) time(NULL); - UpdateControlFile(); - LWLockRelease(ControlFileLock); - } + SetControlFileDBState(DB_SHUTDOWNED_IN_RECOVERY); + return false; } -- 2.18.0