Le lundi 30 août 2021, 09:32:05 CEST Bharath Rupireddy a écrit : > Hi, > > I see there's a scope to do following improvements to pg_receivewal.c:
Thank you Bharath for this patch. > > 1) Fetch the server system identifier in the first RunIdentifySystem > call and use it to identify(via pg_receivewal's ReceiveXlogStream) any > unexpected changes that may happen in the server while pg_receivewal > is connected to it. This can be helpful in scenarios when > pg_receivewal tries to reconnect to the server (see the code around > pg_usleep with RECONNECT_SLEEP_TIME) but something unexpected has > happnend in the server that changed the its system identifier. Once > the pg_receivewal establishes the conenction to server again, then the > ReceiveXlogStream has a code chunk to compare the system identifier > that we received in the initial connection. I'm not sure what kind of problem could be happening here: if you were somewhat routed to another server ? Or if we "switched" the cluster listening on that port ? > 2) Move the RunIdentifySystem to identify timeline id and start LSN > from the server only if the pg_receivewal failed to get them from > FindStreamingStart. This way, an extra IDENTIFY_SYSTEM command is > avoided. That makes sense, even if we add another IDENTIFY_SYSTEM to check against the one set in the first place. > 3) Place the "replication connetion shouldn't have any database name > associated" error check right after RunIdentifySystem so that we can > avoid fetching wal segment size with RetrieveWalSegSize if at all we > were to fail with that error. This change is similar to what > pg_recvlogical.c does. Makes sense. > 4) Move the RetrieveWalSegSize to just before pg_receivewal.c enters > main loop to get the wal from the server. This avoids an unnecessary > query for pg_receivewal with "--create-slot" or "--drop-slot". > 5) Have an assertion after the pg_receivewal done a good amount of > work to find start timeline and LSN might be helpful: > Assert(stream.timeline != 0 && stream.startpos != InvalidXLogRecPtr); > > Attaching a patch that does take care of above improvements. Thoughts? Overall I think it is good. However, you have some formatting issues, here it mixes tabs and spaces: + /* + * No valid data can be found in the existing output directory. + * Get start LSN position and current timeline ID from the server. + */ And here it is not formatted properly: +static char *server_sysid = NULL; > > Regards, > Bharath Rupireddy. -- Ronan Dunklau