Hi Divij,

I think this proposal makes sense overall. My only suggestion, and it is a
minor one, is that if we are considering newbie as a label for flaky tests,
we also consider the newbie++ label [1]. Some of the flaky tests require
familiarity with the codebase or the test setup, so they may be difficult
for a first-time contributor. IMO newbie++ covers that aspect.

[1]
https://issues.apache.org/jira/browse/KAFKA-15406?jql=project%20%3D%20KAFKA%20AND%20labels%20%3D%20%22newbie%2B%2B%22

Let me know what you think.

Thanks!
Sagar.

On Mon, Nov 13, 2023 at 9:11 PM Divij Vaidya <divijvaidy...@gmail.com>
wrote:

> >  Please, do it.
> We can use specific labels to effectively filter those tickets.
>
> We already have a label and a way to discover flaky tests. They are tagged
> with the label "flaky-test" [1]. There is also a label "newbie" [2] meant
> for folks who are new to Apache Kafka code base.
> My suggestion is to send a broader email to the community (since many will
> miss details in this thread) and call for action for committers to
> volunteer as "shepherds" for these tickets. I can send one out once we have
> some consensus wrt next steps in this thread.
>
>
> [1]
>
> https://issues.apache.org/jira/browse/KAFKA-13421?jql=project%20%3D%20KAFKA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22)%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20flaky-test%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
>
>
> [2] https://kafka.apache.org/contributing -> Finding a project to work on
>
>
> Divij Vaidya
>
>
>
> On Mon, Nov 13, 2023 at 4:24 PM Николай Ижиков <nizhi...@apache.org>
> wrote:
>
> >
> > > To kickstart this effort, we can publish a list of such tickets in the
> > community and assign one or more committers the role of a "shepherd" for
> > each ticket.
> >
> > Please, do it.
> > We can use a specific label to effectively filter those tickets.
> >
> > > On Nov 13, 2023, at 15:16, Divij Vaidya <divijvaidy...@gmail.com>
> > wrote:
> > >
> > > Thanks for bringing this up David.
> > >
> > > My primary concern revolves around the possibility that the currently
> > > disabled tests may remain inactive indefinitely. We currently have
> > > unresolved JIRA tickets for flaky tests that have been pending for an
> > > extended period. I am inclined to support the idea of disabling these
> > tests
> > > temporarily and merging changes only when the build is successful,
> > provided
> > > there is a clear plan for re-enabling them in the future.
> > >
> > > To address this issue, I propose the following measures:
> > >
> > > 1\ Foster a supportive environment for new contributors within the
> > > community, encouraging them to take on tickets associated with flaky
> > tests.
> > > This initiative would require individuals familiar with the relevant
> code
> > > to offer guidance to those undertaking these tasks. Committers should
> > > prioritize reviewing and addressing these tickets within their
> available
> > > bandwidth. To kickstart this effort, we can publish a list of such
> > tickets
> > > in the community and assign one or more committers the role of a
> > "shepherd"
> > > for each ticket.
> > >
> > > 2\ Implement a policy to block minor version releases until the Release
> > > Manager (RM) is satisfied that the disabled tests do not result in gaps
> > in
> > > our testing coverage. The RM may rely on Subject Matter Experts (SMEs)
> in
> > > the specific code areas to provide assurance before giving the green
> > light
> > > for a release.
> > >
> > > 3\ Set a community-wide goal for 2024 to achieve a stable Continuous
> > > Integration (CI) system. This goal should encompass projects such as
> > > refining our test suite to eliminate flakiness and addressing
> > > infrastructure issues if necessary. By publishing this goal, we create
> a
> > > shared vision for the community in 2024, fostering alignment on our
> > > objectives. This alignment will aid in prioritizing tasks for community
> > > members and guide reviewers in allocating their bandwidth effectively.
> > >
> > > --
> > > Divij Vaidya
> > >
> > >
> > >
> > > On Sun, Nov 12, 2023 at 2:58 AM Justine Olshan
> > <jols...@confluent.io.invalid>
> > > wrote:
> > >
> > >> I will say that I have also seen tests that seem to be flaky only
> > >> intermittently. A test may be OK for some time, and then suddenly the
> > >> CI is overloaded and we see issues.
> > >> I have also seen the CI struggling with running out of space recently,
> > so I
> > >> wonder if we can also try to improve things on that front.
> > >>
> > >> FWIW, I noticed, filed, or commented on several flaky test JIRAs last
> > week.
> > >> I'm happy to try to get to green builds, but everyone needs to be on
> > board.
> > >>
> > >> https://issues.apache.org/jira/browse/KAFKA-15529
> > >> https://issues.apache.org/jira/browse/KAFKA-14806
> > >> https://issues.apache.org/jira/browse/KAFKA-14249
> > >> https://issues.apache.org/jira/browse/KAFKA-15798
> > >> https://issues.apache.org/jira/browse/KAFKA-15797
> > >> https://issues.apache.org/jira/browse/KAFKA-15690
> > >> https://issues.apache.org/jira/browse/KAFKA-15699
> > >> https://issues.apache.org/jira/browse/KAFKA-15772
> > >> https://issues.apache.org/jira/browse/KAFKA-15759
> > >> https://issues.apache.org/jira/browse/KAFKA-15760
> > >> https://issues.apache.org/jira/browse/KAFKA-15700
> > >>
> > >> I've also seen that the KRaft transaction tests are often flaky: the
> > >> producer ID is not allocated and the test times out.
> > >> I can file a JIRA for that too.
> > >>
> > >> Hopefully this is a place we can start from.
> > >>
> > >> Justine
> > >>
> > >> On Sat, Nov 11, 2023 at 11:35 AM Ismael Juma <m...@ismaeljuma.com>
> wrote:
> > >>
> > >>> On Sat, Nov 11, 2023 at 10:32 AM John Roesler <vvcep...@apache.org>
> > >> wrote:
> > >>>
> > >>>> In other words, I’m biased to think that new flakiness indicates
> > >>>> non-deterministic bugs more often than it indicates a bad test.
> > >>>>
> > >>>
> > >>> My experience is exactly the opposite. As someone who has tracked
> many
> > of
> > >>> the flaky fixes, the vast majority of the time they are an issue with
> > the
> > >>> test.
> > >>>
> > >>> Ismael
> > >>>
> > >>
> >
> >
>
