On Mon, 2025-02-17 at 11:06 +0100, Clement Verna wrote:
> On Sat, 15 Feb 2025 at 20:51, Adam Williamson <adamw...@fedoraproject.org>
> wrote:
>
> > On Sat, 2025-02-15 at 14:54 +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > > On Fri, Feb 14, 2025 at 02:40:29PM -0800, Adam Williamson wrote:
> > > > On Fri, 2025-02-14 at 16:31 -0500, Dusty Mabe wrote:
> > > > > IMO the bar would only need to be that high if the user had no way
> > > > > to ignore the test results.
> > > > > All gating does here (IIUC) is require them to do an extra step
> > > > > before it automatically flows
> > > > > into the next rawhide compose.
> > > >
> > > > again, technically, yes, but *please* let's not train people to have a
> > > > pavlovian reaction to waive failures, that is not the way.
> > >
> > > IMO, the bar for *gating* tests needs to be high. I think 95% true
> > > positives would be a reasonable threshold.
> >
> > Do you mean 95% of failures must be 'real' (i.e. up to 5% can be
> > 'false')? Is this after automatic retries and manual intervention by
> > the test system maintainers, or before?
> >
> > Off the top of my head, 95% seems low. I'm pretty sure we do better
> > than that with openQA and people would complain if that was all we
> > managed. We usually maintain a 0% false failure rate after auto-retries
> > and <24h manual intervention -
> >
> > https://openqa.fedoraproject.org/group_overview/2?limit_builds=100&limit_builds=400
> > has 0 false failures ATM.
> >
>
> Thanks, that's interesting. What do you call <24h manual intervention? One
> example that comes to mind would be to disable or snooze a test that
> started to trigger false failures in under 24h.
> If that's the case, I think that sounds achievable.
We've done that very occasionally in dire emergencies (though what you'd
actually want to do is disable *gating* on the test, not disable the test
itself; that is an edit to the Greenwave policy).

Usually what it means is "rerun the test if it just flaked twice, or fix
the problem if there's a specific problem causing the failure that is not
a bug in the update itself". That could mean updating one of the openQA
screenshots, for instance, or fixing a bug in the test logic, or working
with releng/infra to fix a bug that's causing tests to fail, e.g. Pagure
not responding (and then rerunning all the tests that failed).

If the failure is caused by a real bug in the update, we usually write up
an explanation of the issue as a comment in Bodhi, or file a bug report
and link to it from Bodhi.
-- 
Adam Williamson (he/him/his)
Fedora QA
Fedora Chat: @adamwill:fedora.im | Mastodon: @ad...@fosstodon.org
https://www.happyassassin.net