Re: [Discuss] draft PIP for "Changes to GitHub Actions based Pulsar CI"

Lari Hotari Tue, 16 Mar 2021 01:17:12 -0700

On Tue, Mar 16, 2021 at 3:11 AM Sijie Guo <guosi...@gmail.com> wrote:


> > The prototype has demonstrated about 60% reduction in
> resource consumption.
>
> It is hard to quantify. Merging them into one large workflow can result in
> more failures. Re-running those failures can consume resources as well.
>

Yes, you are right.


>
> > Isn't it urgent to resolve it?
>
> I think we are in a stage that gives us breathing room to fix flaky tests
> and solve other problems, no?
>

I don't have access to the ASF infra-users mailing list where the resource
consumption problems have been discussed. I guess the problem isn't so bad
at the moment, since no complaints are reaching us.
Yes, it makes sense to focus on the flaky test problem if resource
consumption isn't a pressing problem.


> I don't mean we stop the effort here. I mean we have other enhancements
> that we can do to improve the situation.
> Once we get into a position where the flakiness is reduced, we can merge
> them into one workflow.
>

+1

Getting tests to pass with a lot of retries comes with a tradeoff. One of
the critical issues it causes is that real production bugs can pass the
tests on a retry and get dismissed as test flakiness. This causes
regressions. Tolerating test flakiness also results in more flaky tests
being added to the code base. Unless we make changes to "flaky test
handling", we won't be able to change course.
Does that make sense?

BR, Lari
