On 2/17/25 5:12 AM, Clement Verna wrote:
> 
> 
> On Sun, 16 Feb 2025 at 13:52, Zbigniew Jędrzejewski-Szmek <zbys...@in.waw.pl> wrote:
> 
>     On Sat, Feb 15, 2025 at 11:11:49AM -0500, Dusty Mabe wrote:
>     > On 2/15/25 9:54 AM, Zbigniew Jędrzejewski-Szmek wrote:
>     > > On Fri, Feb 14, 2025 at 02:40:29PM -0800, Adam Williamson wrote:
>     > >> On Fri, 2025-02-14 at 16:31 -0500, Dusty Mabe wrote:
>     > >>> IMO the bar would only need to be that high if the user had no way
>     > >>> to ignore the test results. All gating does here (IIUC) is require
>     > >>> them to do an extra step before it automatically flows into the
>     > >>> next rawhide compose.
>     > >>
>     > >> Again, technically, yes, but *please* let's not train people to
>     > >> have a Pavlovian reaction to waive failures; that is not the way.
>     > >
>     > > IMO, the bar for *gating* tests needs to be high. I think 95% true
>     > > positives would be a reasonable threshold.
>     >
>     > I can't promise a 95% true positive rate. These aren't unit tests. They
>     > are system-wide tests that try to test real-world scenarios as much as
>     > possible. That does mean pulling things from github/quay/s3/Fedora
>     > infra/etc., and thus flakes happen. Now, in our tests we do collect
>     > failures and retry them (see the sketch below). If a retry succeeds we
>     > take it as success and never report the failure at all. However, there
>     > are parts of our pipeline that might not be so good at retrying.
>     >
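For the curious, the retry logic amounts to roughly this (a simplified
sketch in Python, not our actual pipeline code; the runner interface and
the retry count are illustrative assumptions):

    def run_with_retry(test, max_attempts=2):
        """Run `test`; a pass on any attempt counts as overall success."""
        for _ in range(max_attempts):
            if test():        # returns True on pass, False on fail
                return True   # a pass on retry is never reported as a failure
        return False          # failed every attempt: treated as a real failure

A pass on the second attempt is reported exactly like a first-try pass,
which is why consumers of the results never see the transient failure.
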
>     > All I'm trying to say is that when you don't control everything it's
>     > hard to say with confidence something will be 95%.
> 
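To illustrate why that's hard (back-of-the-envelope numbers of mine,
assuming independent per-test flake rates, not data from our pipeline):

    # chance that at least one of n gated tests fails spuriously,
    # given an assumed 5% false-failure rate per test per run
    p_flake = 0.05
    for n in (1, 5, 10):
        print(n, "tests ->", f"{1 - (1 - p_flake) ** n:.0%}")
    # 1 tests -> 5%, 5 tests -> 23%, 10 tests -> 40%

Even a per-test rate that sounds good compounds quickly once several
tests gate the same update.
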
>     As AdamW wrote in the other part of the thread, openQA maintains a
>     false positive rate close to 0%. So it seems possible, even with our
>     somewhat unreliable infrastructure…
> 
>     I am worried about the high failure rate for the coreos tests. But it
>     is possible that if we make them gating, the reliability will improve.
>     I know that in the case of systemd, there was a failure that affected quite
>     a few of the updates because it wasn't fixed immediately. If we blocked
>     the first update, the percentage of failures would be lower. So I
>     think it makes sense to try this… If after a few months with this
>     we still have too many updates blocked by gating, we can reevaluate.
> 
> 
> I would be happy to provide a monthly report of failures so that we can 
> measure the rate of false positives. 
> 
> 
>     > As I promised before, maybe just work with us on it. These tests have
>     > been enabled for a while and I've only seen a handful of package
>     > maintainers look at the failures (you, Zbyszek, being one of them;
>     > thank you!).
>     >
>     > We do want them to be useful tests, and I promise that when a failure
>     > happens because of our infra or the tests themselves being flaky, we
>     > try to get it fixed.
> 
>     One more question: are packagers able to restart the tests?
> 
> 
> @Dusty Mabe will know better, but I don't think packagers can restart the
> tests currently.

No. Not currently. But it is something we could look into enabling and/or 
making easier.

Dusty