On Mon, 2025-02-17 at 11:06 +0100, Clement Verna wrote:
> On Sat, 15 Feb 2025 at 20:51, Adam Williamson <adamw...@fedoraproject.org>
> wrote:
> 
> > On Sat, 2025-02-15 at 14:54 +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > > On Fri, Feb 14, 2025 at 02:40:29PM -0800, Adam Williamson wrote:
> > > > On Fri, 2025-02-14 at 16:31 -0500, Dusty Mabe wrote:
> > > > > IMO the bar would only need to be that high if the user had no way
> > > > > to ignore the test results.
> > > > > All gating does here (IIUC) is require them to do an extra step
> > > > > before it automatically flows
> > > > > into the next rawhide compose.
> > > > 
> > > > again, technically, yes, but *please* let's not train people to have a
> > > > pavlovian reaction to waive failures, that is not the way.
> > > 
> > > IMO, the bar for *gating* tests needs to be high. I think 95% true
> > > positives would be a reasonable threshold.
> > 
> > Do you mean 95% of failures must be 'real' (i.e. up to 5% can be
> > 'false')? Is this after automatic retries and manual intervention by
> > the test system maintainers, or before?
> > 
> > Off the top of my head, 95% seems low. I'm pretty sure we do better
> > than that with openQA and people would complain if that was all we
> > managed. We usually maintain a 0% false failure rate after auto-retries
> > and <24h manual intervention -
> > 
> > https://openqa.fedoraproject.org/group_overview/2?limit_builds=100&limit_builds=400
> > has 0 false failures ATM.
> > 
> 
> Thanks, that's interesting. What do you call <24h manual intervention? One
> example that comes to mind would be to disable or snooze a test that
> started to trigger false failures in under 24h.
> If that's the case, I think that sounds achievable.

We've done that very occasionally in dire emergencies (though what
you'd actually want to do is disable *gating* on the test, not disable
the test itself - this is an edit to the greenwave policy).

Usually what it means is "rerun the test if it just flaked twice, or
fix the problem if there's a specific problem causing the failure that
is not a bug in the update itself". That could mean updating one of the
openQA screenshots, for instance, or fixing a bug in the test logic, or
working with releng/infra to fix a bug that's causing tests to fail,
e.g. pagure not responding (and then rerunning all the tests that
failed).

If the failure is caused by a real bug in the update, we usually write
up an explanation of the issue as a comment in Bodhi, or a bug report
with a link from Bodhi.
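
To make the triage above concrete, here is a rough sketch of the decision
in Python. Everything in it (the Cause and Action names, the triage
function) is hypothetical and only mirrors the cases described above; it
is not the openQA, Bodhi or greenwave API, just an illustration of the
logic.

```python
from enum import Enum


class Cause(Enum):
    """Hypothetical classification of why a gating test failed."""
    FLAKE = "transient failure that survived the automatic retries"
    TEST_SYSTEM = "stale screenshot, test logic bug, or infra outage"
    UPDATE_BUG = "real bug in the update under test"


class Action(Enum):
    """Hypothetical follow-up taken by the test system maintainers."""
    RERUN = "rerun the failed test"
    FIX_AND_RERUN = "fix the underlying problem, then rerun failed tests"
    REPORT = "explain the bug in a Bodhi comment or a linked bug report"


def triage(cause: Cause) -> Action:
    """Map a failure cause to the manual intervention described above."""
    if cause is Cause.FLAKE:
        return Action.RERUN
    if cause is Cause.TEST_SYSTEM:
        return Action.FIX_AND_RERUN
    # Anything else is treated as a genuine failure: the test stays
    # failed and the issue is written up for the packager.
    return Action.REPORT
```

Only the first two outcomes count as cleaning up "false" failures; the
third leaves the failure in place, since the test did its job.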
-- 
Adam Williamson (he/him/his)
Fedora QA
Fedora Chat: @adamwill:fedora.im | Mastodon: @ad...@fosstodon.org
https://www.happyassassin.net