On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu) <osumi.takami...@fujitsu.com> wrote: > ... > > > > Besides, I am not sure it's a stable test to check the log. Is it possible > > that there's > > no such log on a slow machine? I modified the code to sleep 1s at the > > beginning > > of apply_dispatch(), then the new added test failed because the server log > > cannot match. > To get the log by itself is necessary to ensure > that the delay is conducted by the apply worker, because we emit the diffms > only if it's bigger than 0 in maybe_apply_delay(). If we omit the step, > we are not sure the delay is caused by other reasons or the time-delayed > feature. > > As you mentioned, it's possible that no log is emitted on slow machine. Then, > the idea to make the test safer for such machines should be to make the > delayed time longer. > But we shortened the delay time to 1 second to mitigate the long test > execution time of this TAP test. > So, I'm not sure if it's a good idea to make it longer again.
I think there are a couple of things that can be done about this problem: 1. If you need the code/test to remain as-is then at least the test message could include some comforting text like "(this can fail on slow machines when the delay time is already exceeded)" so then a test failure will not cause undue alarm. 2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so that it will *always* log the remaining wait time even if that wait time becomes negative. Then I think the test cases can be made deterministic instead of relying on good luck. This seems like the better option. ------ Kind Regards, Peter Smith. Fujitsu Australia