> On Sep 8, 2021, at 6:44 AM, Amul Sul <sula...@gmail.com> wrote:
> 
> Here is the rebased version.

v33-0004 

This patch moves the include of "catalog/pg_control.h" from transam/xlog.c into 
access/xlog.h, making pg_control.h indirectly included from a much larger set 
of files.  Maybe that's ok.  I don't know.  But it seems you are doing this 
merely to get the symbol (not even the definition) for struct DBState.  I'd 
recommend rearranging the code so this isn't necessary, but otherwise you'd at 
least want to remove the now redundant includes of catalog/pg_control.h from 
xlogdesc.c, xloginsert.c, auth-scram.c, postmaster.c, misc/pg_controldata.c, 
and pg_controldata/pg_controldata.c.

v33-0005 

This patch makes bool XLogInsertAllowed() more complicated than before.  The 
result used to depend mostly on the value of LocalXLogInsertAllowed except that 
when that value was negative, the result was determined by 
RecoveryInProgress().  There was an arcane rule that LocalXLogInsertAllowed 
must have the non-negative values binary coercible to boolean "true" and 
"false", with the basis for that rule being the coding of XLogInsertAllowed().  
Now that the function is more complicated, this rule seems even more arcane.  
Can we change the logic to not depend on casting an integer to bool?

The code comment change in autovacuum.c introduces a non-grammatical sentence: 
"First, the system is not read only i.e. wal writes permitted".

The function comment in checkpointer.c reads more like it toggles the system 
into allowing something, rather than actually doing that same something: 
"SendSignalToCheckpointer allows a process to send a signal to the checkpoint 
process".

The new code comment in ipci.c contains a typo, but more importantly, it 
doesn't impart any knowledge beyond what a reader of the function name could 
already surmise.  Perhaps the comment can better clarify what is happening: 
"Set up wal probibit shared state"

The new code comment in sync.c copies and changes a nearby comment but drops 
part of the verb phrase:  "As in ProcessSyncRequests, we don't want to stop wal 
prohibit change requests".  The nearby comment reads "stop absorbing".  I think 
this one should read "stop processing".  This same comment is used again below. 
  Then a third comment reads "For the same reason mentioned previously for the 
wal prohibit state change request check."  That third comment is too glib.

tcop/utility.c needlessly includes "access/walprohibit.h"

wait_event.h extends enum WaitEventIO with new values 
WAIT_EVENT_WALPROHIBIT_STATE and WAIT_EVENT_WALPROHIBIT_STATE_CHANGE.  I don't 
find the difference between these two names at all clear.  Waiting for a state 
change is clear enough.  But how is waiting on a state different?

xlog.h defines a new enum.  I don't find any of it clear; not the comment, nor 
the name of the enum, nor the names of the values:

/* State of work that enables wal writes */
typedef enum XLogAcceptWritesState
{
    XLOG_ACCEPT_WRITES_PENDING = 0, /* initial state, not started */
    XLOG_ACCEPT_WRITES_SKIPPED,     /* skipped wal writes */
    XLOG_ACCEPT_WRITES_DONE         /* wal writes are enabled */
} XLogAcceptWritesState; 

This enum seems to have been written from the point of view of someone who 
already knew what it was for.  It needs to be written in a way that will be 
clear to people who have no idea what it is for.

v33-0006:

The new code comments in brin.c and elsewhere should use the verb "require" 
rather than "have", otherwise "building indexes" reads as a noun phrase rather 
than as a gerund: /* Building indexes will have an XID */

The new function CheckWALPermitted() seems to test the current state of 
variables but not lock any of them, and the new function comment says:

/*
 * In opposite to the above assertion if a transaction doesn't have valid XID
 * (e.g. VACUUM) then it won't be killed while changing the system state to WAL
 * prohibited.  Therefore, we need to explicitly error out before entering into
 * the critical section.
 */

This suggests to me that a vacuum process can check whether wal is prohibited, 
then begin a critical section which needs wal to be allowed, and concurrently 
somebody else might disable wal without killing the vacuum process.  I'm given 
to wonder what horrors await when the vacuum process does something that needs 
to be wal logged but cannot be.  Does it trigger a panic?  I don't like the 
idea that calling pg_prohibit_wal durning a vacuum might panic the cluster.  If 
there is some reason this is not a problem, I think the comment should explain 
it.  In particular, why is it sufficient to check whether wal is prohibited 
before entering the critical section and not necessary to be sure it remains 
allowed through the lifetime of that critical section?

v33-0007:

I don't really like what the documentation has to say about pg_prohibit_wal.  
Why should pg_prohibit_wal differ from other signal sending functions in 
whether it returns a boolean?  If you believe it must always succeed, you can 
still define it as returning a boolean and always return true.  That leaves the 
door open to future code changes which might need to return false for some 
reason.

But I also don't like the idea that existing transactions with xids are 
immediately killed.  Shouldn't this function take an optional timeout, perhaps 
defaulting to none, but otherwise allowing the user to put the system into 
WALPROHIBIT_STATE_GOING_READ_ONLY for a period of time before killing remaining 
transactions?

Why is this function defined to take a boolean such that pg_prohibit_wal(true) 
means to prohibit wal and pg_prohibit_wal(false) means to allow wal.  Wouldn't 
a different function named pg_allow_wal() make it more clear?  This also would 
be a better interface if taking the system read-only had a timeout as I 
suggested above, as such a timeout parameter when allowing wal is less clearly 
useful.

That's enough code review for now.  Next I will review your regression tests....

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Reply via email to