@Ufuk my understanding, though never written down, was to mark test
stability issues as critical and to add the test-stability label. Maybe
we should state this more explicitly somewhere.

On Thu, Feb 28, 2019 at 1:59 PM Ufuk Celebi <u...@ververica.com> wrote:

> I fully agree with Aljoscha and Chesnay (although my recent PR
> experience was still close to what Stanislav describes).
>
> @Robert: Do we have standard labels that we apply to tickets that
> report a flaky test? I think this would be helpful to make sure that
> we have a good overview of the state of flaky tests.
>
> Best,
>
> Ufuk
>
> On Wed, Feb 27, 2019 at 3:04 PM Aljoscha Krettek <aljos...@apache.org>
> wrote:
> >
> > I agree with Chesnay, and I would like to add that the most
> > important step towards fixing flakiness is awareness and
> > willingness. As soon as you accept flakiness and start working
> > around it (as you mentioned), more flakiness will creep in, making
> > it harder to get rid of in the future.
> >
> > Aljoscha
> >
> > > On 27. Feb 2019, at 12:04, Chesnay Schepler <ches...@apache.org>
> > > wrote:
> > >
> > > We were in the same position a while back, with the same effects.
> > > We solved it by creating JIRAs for every failing test and cracking
> > > down hard on them; I don't think there's any other way to address
> > > this.
> > > However, to truly solve this, one must look at the root cause to
> > > prevent new flaky tests from being added.
> > > From what I remember, many of our tests were flaky because they
> > > relied on timings (e.g., let's Thread.sleep for X and assume Y has
> > > happened) or had similar race conditions, and committers nowadays
> > > are rather observant of these issues.
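> > >
> > > To make that concrete, here is a minimal sketch of the anti-pattern
> > > and a deadline-based alternative (hypothetical names, not actual
> > > Flink test code; assumes JUnit's assertTrue/fail and a test method
> > > declared to throw Exception):
> > >
> > >     // Flaky: sleep a fixed time and hope Y has happened by then.
> > >     Thread.sleep(100);
> > >     assertTrue(yHasHappened()); // yHasHappened() is hypothetical
> > >
> > >     // More robust: poll the same condition up to a generous deadline.
> > >     long deadline = System.currentTimeMillis() + 30_000;
> > >     while (!yHasHappened()) {
> > >         if (System.currentTimeMillis() >= deadline) {
> > >             fail("Y did not happen within 30 seconds");
> > >         }
> > >         Thread.sleep(50);
> > >     }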
> > >
> > > By now the majority of our builds succeed.
> > > We don't do anything like running the builds multiple times before
> > > a merge. I know some committers always run a PR at least once
> > > against master, but this certainly doesn't apply to everyone.
> > > There are still tests that fail from time to time, but my
> > > impression is that people still check which tests are failing to
> > > ensure they are unrelated, and track them regardless.
> > >
> > > On 26.02.2019 17:28, Stanislav Kozlovski wrote:
> > >> Hey there Flink community,
> > >>
> > >> I work on a fellow open-source project, Apache Kafka, and there
> > >> we have been fighting flaky tests a lot. We run Java 8 and Java 11
> > >> builds on every pull request, and due to test flakiness, almost
> > >> all of them turn out red with one or two tests (completely
> > >> unrelated to the change in the PR) failing. This has resulted in
> > >> committers either ignoring the failures and merging the changes
> > >> or, in the worst case, rerunning the hour-long build until it
> > >> becomes green.
> > >> This test flakiness has also slowed down our releases significantly.
> > >>
> > >> In general, I was just curious whether this is a problem that
> > >> your project faces as well. Does your project have a lot of
> > >> intermittently failing tests? Do you have any active process for
> > >> addressing such tests (during the initial review, after realizing
> > >> a test is flaky, etc.)? Any pointers will be greatly appreciated!
> > >>
> > >> Thanks,
> > >> Stanislav
> > >>
> > >>
> > >
> >
>
