>
> 250 iterations isn't enough; I use 500 as a low water mark.

I agree that 500 iterations would be a reasonable minimum. We have seen
flaky unit tests that need far more iterations to fail, but that's not very
common. We could use 500 iterations as the default and, at our discretion,
use a higher limit for tests that are quick and might be prone to concurrency
issues. I can change the defaults in the CircleCI config file if we agree on
a new limit; the current default of 100 iterations is quite arbitrary.
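
To put rough numbers on the iteration count, here is a minimal sketch. It is
not part of the CircleCI config, and it assumes each run fails independently
with a fixed probability, which real flaky tests may not do. The 0.5% failure
rate and both function names are illustrative assumptions.

```python
import math


def detection_probability(failure_rate: float, iterations: int) -> float:
    """Chance that at least one of `iterations` independent runs fails."""
    return 1.0 - (1.0 - failure_rate) ** iterations


def iterations_needed(failure_rate: float, confidence: float) -> int:
    """Smallest run count that exposes the flake with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # Assumed failure rate of 0.5% per run, chosen only for illustration.
    for n in (100, 250, 500):
        print(n, round(detection_probability(0.005, n), 3))
```

Under that assumed 0.5% failure rate, 100 runs would expose the flake roughly
39% of the time, 250 runs roughly 71%, and 500 runs roughly 92%.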

The test multiplexer can run either individual test methods or entire
classes. It is quite common to see test methods that pass individually but
fail when they are run together with the other tests in the same class.
Because of this, I think that we should always run entire classes when
repeating new or modified tests. The only exception would be Python dtests,
which are usually more resource intensive and less prone to that type of
issue.

> For CI on a patch, run the pre-commit suite and also run multiplexer with
> 250 runs on new, changed, or related tests to ensure not flaky


The multiplexer only allows running a single test class per push. This is OK
for fixing existing flakies (its original purpose) and for most minor
changes, but it can be quite inconvenient for testing large patches that
add or modify many tests. For example, the patch for CEP-19 directly
modifies 31 test classes, which means 31 CircleCI config pushes. This
number can be reduced somewhat with wildcards on the class names, but
the process is still quite inconvenient. I expect other large patches
will run into the same problem. I plan to modify the multiplexer to
allow specifying a list of classes per test target, so we don't have to
suffer needlessly with this.
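
As a rough illustration of how wildcards cut down the number of pushes, here
is a small sketch. It assumes shell-style wildcards, which may not match the
multiplexer's actual matching rules, and the class names and the
`pushes_needed` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical class names, for illustration only.
modified_classes = [
    "org.example.cql.SelectTest",
    "org.example.cql.InsertTest",
    "org.example.cql.UpdateTest",
    "org.example.db.ReadCommandTest",
    "org.example.service.StartupTest",
]


def pushes_needed(classes, patterns):
    """One push per wildcard pattern, plus one per class no pattern covers."""
    covered = {c for c in classes if any(fnmatchcase(c, p) for p in patterns)}
    return len(patterns) + (len(classes) - len(covered))


if __name__ == "__main__":
    print(pushes_needed(modified_classes, []))
    print(pushes_needed(modified_classes, ["org.example.cql.*Test"]))
```

Wildcards only help when the modified classes share a package or naming
pattern, which is why an explicit list of classes per test target would be
more convenient.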

On Mon, 26 Sept 2022 at 22:44, Brandon Williams <dri...@gmail.com> wrote:

> On Mon, Sep 26, 2022 at 1:31 PM Josh McKenzie <jmcken...@apache.org>
> wrote:
> >
> > 250 iterations isn't enough; I use 500 as a low water mark.
> >
> > Say more here. I originally had it at 500 but neither Mick nor I knew
> > why and figured we could suss this out on this thread.
>
> I've seen flakies that passed with less later exhibit at that point.
>
> > This is also assuming that circle and ASF CI run the same tests, which
> > is not entirely true.
> >
> > +1: we need to fix this. My intuition is the path to getting circle-ci
> > in parity on coverage is a shorter path than getting ASF CI to 3 green
> > runs for GA. That consistent w/your perception as well or do you disagree?
>
> I agree that bringing parity to the coverage will be the shorter path.
>
