In case of soft lockups, it might be helpful from a root cause analysis perspective to see whether the test was still able to complete despite triggering the soft lockup warning, or whether the soft lockup appears unrecoverable without killing the test. For that to be possible, igt_runner should not kill the test too promptly when a kernel taint related to a soft lockup is detected.
On kernel taints, igt_runner currently decreases the per-test and inactivity timeouts by a factor of 10. Let it check whether the taint is caused by a soft lockup and, in that case, decrease the timeouts only by a factor of 2.

Signed-off-by: Janusz Krzysztofik <janusz.krzyszto...@linux.intel.com>
---
 runner/executor.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/runner/executor.c b/runner/executor.c
index 13180a0a46..de9d29d28d 100644
--- a/runner/executor.c
+++ b/runner/executor.c
@@ -871,10 +871,14 @@ static const char *need_to_timeout(struct settings *settings,
 	if (settings->abort_mask & ABORT_TAINT &&
 	    is_tainted(taints)) {
 		/* list of timeouts that may postpone immediate kill on taint */
-		if (settings->per_test_timeout || settings->inactivity_timeout)
-			decrease = 10;
-		else
+		if (settings->per_test_timeout || settings->inactivity_timeout) {
+			if (is_tainted(taints) == (1 << 9) && taints & (1 << 14))
+				decrease = 2; /* only warn + soft lockup */
+			else
+				decrease = 10;
+		} else {
 			return "Killing the test because the kernel is tainted.\n";
+		}
 	}
 
 	if (settings->per_test_timeout != 0 &&
@@ -1526,8 +1530,9 @@ static int monitor_output(pid_t child,
 			sigfd = -1; /* we are dying, no signal handling for now */
 		}
 
+		igt_kernel_tainted(&taints);
 		timeout_reason = need_to_timeout(settings, killed,
-						 igt_kernel_tainted(&taints),
+						 taints,
 						 igt_time_elapsed(&time_last_activity, &time_now),
 						 igt_time_elapsed(&time_last_subtest, &time_now),
 						 igt_time_elapsed(&time_killed, &time_now),
-- 
2.50.0