On Sat, Feb 15, 2025 at 11:11:49AM -0500, Dusty Mabe wrote:
> On 2/15/25 9:54 AM, Zbigniew Jędrzejewski-Szmek wrote:
> > On Fri, Feb 14, 2025 at 02:40:29PM -0800, Adam Williamson wrote:
> >> On Fri, 2025-02-14 at 16:31 -0500, Dusty Mabe wrote:
> >>> IMO the bar would only need to be that high if the user had no way to
> >>> ignore the test results. All gating does here (IIUC) is require them to
> >>> do an extra step before it automatically flows into the next rawhide
> >>> compose.
> >>
> >> again, technically, yes, but *please* let's not train people to have a
> >> pavlovian reaction to waive failures, that is not the way.
> > 
> > IMO, the bar for *gating* tests needs to be high. I think 95% true
> > positives would be a reasonable threshold.
> 
> I can't promise a 95% true positive rate. These aren't unit tests. They are
> system-wide tests that try to exercise real-world scenarios as much as
> possible. That does mean pulling things from github/quay/s3/Fedora
> infra/etc., and thus flakes happen. Now, in our tests we do collect failures
> and retry them. If a retry succeeds we take it as a success and never report
> the failure at all. However, there are parts of our pipeline that might not
> be so good at retrying.
> 
> All I'm trying to say is that when you don't control everything, it's hard
> to say with confidence that something will be at 95%.
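
Understood. Just so we're talking about the same thing, the retry-on-flake
approach you describe would be roughly the sketch below (a minimal sketch
with hypothetical names only, not the actual kola or pipeline code):

    // Retry a flaky external operation (e.g. a fetch from github/quay/s3)
    // and only report a failure if the retry also fails.
    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // runWithRetry runs fn up to attempts times, sleeping between tries.
    // Earlier failures are swallowed as flakes; only the last error is kept.
    func runWithRetry(attempts int, delay time.Duration, fn func() error) error {
        var err error
        for i := 0; i < attempts; i++ {
            if err = fn(); err == nil {
                return nil // a later success hides the earlier flake
            }
            time.Sleep(delay)
        }
        return fmt.Errorf("failed after %d attempts: %w", attempts, err)
    }

    func main() {
        calls := 0
        // Simulated external fetch that flakes exactly once.
        flakyFetch := func() error {
            calls++
            if calls == 1 {
                return errors.New("transient network error")
            }
            return nil
        }
        if err := runWithRetry(3, time.Second, flakyFetch); err != nil {
            fmt.Println("FAIL:", err)
        } else {
            fmt.Println("PASS after", calls, "attempts")
        }
    }

The important property is that a transient failure only surfaces if the
retry also fails, so the failures packagers actually see are mostly the
"stubborn" ones.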

As AdamW wrote in the other part of the thread, openQA maintains a
false-positive rate close to 0%. So it seems possible, even with our
somewhat unreliable infrastructure…

I am worried about the high failure rate for the coreos tests, but it is
possible that the reliability will improve if we make them gating. I know
that in the case of systemd, there was a failure that affected quite a few
updates because it wasn't fixed immediately; if we had blocked the first
update, the overall percentage of failures would have been lower. So I
think it makes sense to try this… If, after a few months, we still have
too many updates blocked by gating, we can reevaluate.

> As I promised before, maybe just work with us on it. These tests have been
> enabled for a while and I've only seen a handful of package maintainers look
> at the failures (you, Zbyszek, being one of them; thank you!).
> 
> We do want them to be useful tests, and I promise that when a failure
> happens because of our infra or the tests themselves being flaky, we try to
> get it fixed.

One more question: are packagers able to restart the tests?

Zbyszek