On Mon, Mar 09, 2020 at 04:47:27PM +0900, Michael Paquier wrote:
> On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote:
> > The arbitrarily-set timeouts that exist in some of the isolation tests
> > are horrid kluges that have caused us lots of headaches in the past
> > and no doubt will again in the future.  Aside from occasionally failing
> > when a machine is particularly overloaded, they cause the tests to
> > take far longer than necessary on decently-fast machines.  So ideally
> > we'd get rid of those entirely in favor of some more-dynamic approach.
> > Admittedly, I have no proposal for what that would be.  But adding yet
> > more ways to set a (guaranteed-to-be-wrong) timeout seems like the
> > wrong direction to be going in.  What's the actual need that you're
> > trying to deal with?
>
> As a matter of fact, the buildfarm member petalura just reported a
> failure with the isolation test "timeouts", the machine being
> extremely slow:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-03-08%2011%3A20%3A05
>
> test timeouts                     ... FAILED    60330 ms
> [...]
> -step update: DELETE FROM accounts WHERE accountid = 'checking'; <waiting ...>
> -step update: <... completed>
> +step update: DELETE FROM accounts WHERE accountid = 'checking';
>  ERROR:  canceling statement due to statement timeout

Indeed.  I guess we could add some kind of environment variable facility
to isolationtester to let owners of slow machines set a much larger
timeout without making the test super slow for everyone else, but that
seems like overkill for just one test.  And given the other thread about
deploying the REL_11 buildfarm client, it wouldn't be an immediate fix
either.
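
To make the environment variable idea concrete, here is a minimal sketch of
the lookup such a facility would need.  This is not existing isolationtester
code: the function name timeout_from_env, the variable name
ISOLATION_TEST_MAX_WAIT and the 180-second default are all made up for
illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical default in seconds; not taken from isolationtester. */
#define DEFAULT_MAX_WAIT_SECS 180

/*
 * Return the timeout in seconds.  If the environment variable named by
 * envname holds a positive decimal integer, use it; otherwise (unset,
 * empty, non-numeric, out of range, or <= 0) return fallback.
 */
static long
timeout_from_env(const char *envname, long fallback)
{
	const char *val = getenv(envname);
	char	   *end;
	long		secs;

	if (val == NULL || *val == '\0')
		return fallback;

	errno = 0;
	secs = strtol(val, &end, 10);

	/* Reject overflow, trailing garbage, and non-positive values. */
	if (errno != 0 || *end != '\0' || secs <= 0)
		return fallback;

	return secs;
}
```

A slow machine would then export something like ISOLATION_TEST_MAX_WAIT=600
(again, a hypothetical name) and the tester would call
timeout_from_env("ISOLATION_TEST_MAX_WAIT", DEFAULT_MAX_WAIT_SECS), so
everyone else keeps the default.  Note that this only makes the bound
configurable; it does nothing about the underlying objection that any fixed
timeout is a guess.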