On Tue, Jul 28, 2015 at 9:06 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Tue, Jul 28, 2015 at 7:06 AM, Andres Freund <and...@anarazel.de> wrote:
>
>> Hi,
>>
>> On 2015-07-19 11:49:14 -0700, Jeff Janes wrote:
>> > After applying this patch to commit fdf28853ae6a397497b79f, it has survived
>> > testing long enough to convince that this fixes the problem.
>>
>> What was the actual workload breaking with the bug? I ran a small
>> variety and I couldn't reproduce it yet. I'm not saying there's no bug,
>> I just would like to be able to test my version of the fixes...
>
> It was the torn-page fault-injection code here:
>
> https://drive.google.com/open?id=0Bzqrh1SO9FcEfkxFb05uQnJ2cWg0MEpmOXlhbFdyNEItNmpuek1zU2gySGF3Vk1oYXNNLUE
>
> It is not a minimal set, I don't know if all parts of this are necessary
> to reproduce it. The whole crash-recovery cycling might not even be
> important.

I've reproduced it again against commit b2ed8edeecd715c8a23ae462. It took
5 hours on an 8-core "Intel(R) Xeon(R) CPU E5-2650".

I also reproduced it in 3 hours on the same machine with both JJ_torn_page
and JJ_xid set to zero (i.e. turned off, no induced crashes), so the
fault-injection patch should not be necessary to trigger the issue.

Cheers,

Jeff