Hi,

On Wednesday, 2020-11-04 at 17:48 +0900, Michael Paquier wrote:
> On Fri, Oct 30, 2020 at 11:30:28AM +0900, Michael Paquier wrote:
> > Playing with dd and generating random pages, this detects random
> > corruptions, making use of a wait/retry loop if a failure is
> > detected.  As mentioned upthread, this is a double-edged sword:
> > increasing the number of retries reduces the chances of false
> > positives, at the cost of making regression tests longer.  This
> > stuff uses up to 5 retries with 100ms of sleep for each page.  (I
> > am aware of the fact that the commit message of the main patch is
> > not written yet.)
>
> So, I have done much more testing of this patch using an instance
> with a small shared buffer pool and pgbench running in parallel for
> having a large eviction rate, and I cannot convince myself to do
> that.  My laptop got easily constrained on I/O, and within a total
> of 2000 base backups or so, I have seen some 5 backup failures with
> a correct detection logic.
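If I am reading the description of the wait/retry loop correctly, it
amounts to roughly the sketch below (just my paraphrase to make sure we
are talking about the same thing; verify_page_with_retry() and the
CHECKSUM_* constants are made-up names, not taken from the actual
patch):

#include "postgres.h"

#include <unistd.h>

#include "storage/bufpage.h"
#include "storage/checksum.h"

/* hypothetical knobs matching the behavior described above */
#define CHECKSUM_RETRIES	5
#define CHECKSUM_SLEEP_MS	100

/*
 * Re-read and re-verify a page a few times before declaring it corrupt:
 * a page that is being written concurrently can be read half-updated
 * ("torn"), so a single checksum mismatch is not yet proof of corruption.
 */
static bool
verify_page_with_retry(int fd, BlockNumber blkno)
{
	PGAlignedBlock page;		/* properly aligned page buffer */

	for (int attempt = 0; attempt < CHECKSUM_RETRIES; attempt++)
	{
		/* re-read the block from disk on every attempt */
		if (pread(fd, page.data, BLCKSZ, (off_t) blkno * BLCKSZ) != BLCKSZ)
			return false;		/* short read, treated as a failure here */

		/* new (all-zero) pages have no checksum to verify */
		if (PageIsNew(page.data))
			return true;

		/* pg_checksum_page() is the in-core checksum routine */
		if (((PageHeader) page.data)->pd_checksum ==
			pg_checksum_page(page.data, blkno))
			return true;		/* checksum matches, page is fine */

		/* possibly a torn read of an in-flight write: wait and retry */
		pg_usleep(CHECKSUM_SLEEP_MS * 1000L);
	}

	return false;				/* still failing after all retries */
}

With such a loop, a transient mismatch caused by a concurrent write
should be masked by the retries, while real on-disk corruption keeps
failing on every attempt.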
I don't quite understand what you mean here: how do the base backups
fail, and what exactly is the "correct detection logic"?


Michael

-- 
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Our handling of personal data is subject to the following provisions:
https://www.credativ.de/datenschutz