Hi
On Thursday, November 26, 2020 4:29 PM Kyotaro Horiguchi <horikyota....@gmail.com> wrote:
> At Thu, 26 Nov 2020 07:18:39 +0000, "osumi.takami...@fujitsu.com" <osumi.takami...@fujitsu.com> wrote in
> > The attached patch is intended to prevent a scenario that archive
> > recovery hits WALs which come from wal_level=minimal and the server
> > continues to work, which was discussed in the thread of [1].
> > The motivation is to protect that user ends up with both getting
> > replica that could miss data and getting the server to miss data in
> > targeted recovery mode.
> >
> > About how to modify this, we reached the consensus in the thread.
> > It is by changing the ereport's level from WARNING to FATAL in
> > CheckRequiredParameterValues().
> >
> > In order to test this fix, what I did is
> > 1 - get a base backup during wal_level is replica
> > 2 - stop the server and change the wal_level from replica to minimal
> > 3 - restart the server(to generate XLOG_PARAMETER_CHANGE)
> > 4 - stop the server and make the wal_level back to replica
> > 5 - start the server again
> > 6 - execute archive recoveries in both cases
> >     (1) by editing the postgresql.conf and
> >         touching recovery.signal in the base backup from 1th step
> >     (2) by making a replica with standby.signal
> > * During wal_level is replica, I enabled archive_mode in this test.
> >
> > First of all, I confirmed the server started up without this patch.
> > After applying this safeguard patch, I checked that the server cannot
> > start up any more in the scenario case.
> > I checked the log and got the result below with this patch.
> >
> > 2020-11-26 06:49:46.003 UTC [19715] FATAL: WAL was generated with wal_level=minimal, data may be missing
> > 2020-11-26 06:49:46.003 UTC [19715] HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
> >
> > Lastly, this should be backpatched.
> > Any comments ?
>
> Perhaps we need the TAP test that conducts the above steps.
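
For readers who want the behavior change in one place, here is a rough sketch of the check. It is a simplified Python model, not the C code in the patch: the function name, its arguments, and the RecoveryFatal exception are all made up for illustration, and I am assuming the check depends only on whether archive recovery was requested and on the wal_level recorded for the WAL being replayed.

```python
class RecoveryFatal(Exception):
    """Hypothetical stand-in for ereport(FATAL, ...): startup is aborted."""


def check_required_parameter_values(archive_recovery_requested, wal_level):
    # Assumption: the safeguard only applies during archive recovery
    # (recovery.signal or standby.signal), not during plain crash recovery.
    if archive_recovery_requested and wal_level == "minimal":
        # Before the patch this condition only produced a WARNING and the
        # server kept running; with the patch it stops the startup.
        raise RecoveryFatal(
            "WAL was generated with wal_level=minimal, data may be missing"
        )
    # Any other combination lets recovery continue.
    return None
```

In the test scenario, replaying the WAL from the period with wal_level=minimal corresponds to calling this with (True, "minimal"), which is the case that now fails.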
I added TAP tests that reproduce the problem, using case 6-(1) described above, and I share the result here.
I created a new file for them because none of the existing files in src/test/recovery matched the purpose of my tests well.
If you are dubious about this idea, please have a look at the comments in each file.

With the attached patch applied, my TAP tests run alongside the existing ones, as shown below.

t/018_wal_optimize.pl ................ ok
t/019_replslot_limit.pl .............. ok
t/020_archive_status.pl .............. ok
t/021_row_visibility.pl .............. ok
t/022_archive_recovery.pl ............ ok
All tests successful.

I also confirmed with make check-world that there is no regression.

Any comments?

Best,
	Takamichi Osumi
stronger_safeguard_for_archive_recovery_v02.patch