On Fri, 2019-10-18 at 18:10 +0200, Thomas Huth wrote:
> Peter hit a "Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared
> 'write' lock - Is another process using the image
> [TEST_DIR/t.IMGFMT]?"
> error with 130 already twice. Looks like this test is a little bit
> shaky, and currently nobody has a real clue what could be causing this
> issue, so for the time being, let's disable it from the "auto" group so
> that it does not gate the pull requests.
>
For some time I've also needed to work around issues running 130. I
either disabled it, or found that a few properly placed sleeps got it
to pass reliably.

Last week I finally got around to investigating it a bit more and
discovered that the failure was related to my using
--enable-membarrier in my configure.

I didn't investigate whether some tests rely on the block I/O tests'
_cleanup_qemu using kill -KILL, or whether that is simply a way to
speed the testing along, but I've gotten test 130 to pass reliably by
changing the test to quit properly via the monitor, and by adding
wait=1 so that _cleanup_qemu doesn't simply kill qemu. I believe 153
and 161 suffer in a similar way.

I haven't gotten around to fully understanding how qemu's use of the
kernel's sys_membarrier is adversely affected by killing qemu in this
way, but there seems to be an issue with that. Hopefully someone who
is more familiar with qemu's use of membarrier can add more details
here.

Bruce
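
P.S. To make the two shutdown paths concrete, here is a minimal
stand-alone sketch. It does not use the real iotests helpers:
fake_qemu, cleanup, run_graceful and run_killed are hypothetical
stand-ins, and cleanup is only a simplified model of the behaviour
described above (skip kill -KILL when wait is set, then reap the
process). It says nothing about membarrier itself.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for qemu: exits cleanly when it reads "quit".
fake_qemu() {
    while read -r cmd; do
        [ "$cmd" = "quit" ] && exit 0
    done
}

# Simplified model of a cleanup helper (an assumption, not the real
# _cleanup_qemu): kill the process unless $wait is set, then reap it
# and report its exit status.
cleanup() {
    local pid=$1
    if [ -z "${wait}" ]; then
        kill -KILL "$pid" 2>/dev/null
    fi
    wait "$pid" 2>/dev/null
    echo "exit status: $?"
}

# Path 1: ask the process to quit, then wait for it to exit by itself.
run_graceful() {
    local dir pid
    dir=$(mktemp -d)
    mkfifo "$dir/monitor"
    fake_qemu < "$dir/monitor" &
    pid=$!
    exec 3> "$dir/monitor"      # open the stand-in "monitor" channel
    echo quit >&3               # request an orderly shutdown
    wait=1 cleanup "$pid"       # wait is set, so no kill -KILL is sent
    exec 3>&-
    rm -rf "$dir"
}

# Path 2: never send "quit"; cleanup kills the process outright.
run_killed() {
    local dir pid
    dir=$(mktemp -d)
    mkfifo "$dir/monitor"
    fake_qemu < "$dir/monitor" &
    pid=$!
    exec 3> "$dir/monitor"
    wait= cleanup "$pid"        # wait is empty, so kill -KILL is sent
    exec 3>&-
    rm -rf "$dir"
}

run_graceful
run_killed
```

In the first path the stand-in process gets to run its own exit path;
in the second it is terminated by SIGKILL with no chance to clean up,
which is the case that seemed to cause trouble for me.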