Keeping trunk green at all times is a great goal to strive for, and I'd
love to continue working towards it, but in my experience it's not easy.
Flaky tests, for the reasons folks mentioned, are a real challenge. A
standard I've seen work well, which we could use while we work towards the
more ambitious one (and which, as Josh mentioned, we are already pretty
close to), is multiple successive green runs (ideally on different
platforms as well) before a release, plus better visibility/documentation
of test runs & flakiness.

We can make incremental improvements towards this! Some I've heard on this
thread or am personally interested in are below. I think even making one or
two of these changes would be an improvement.

- A regularly run / on-commit trunk build, visible to the public, would
give us more visibility into test health than today's status quo of having
to search the CI history of different branches.

- A process for documenting known flaky tests: a JIRA, and maybe an
annotation (or just a comment) that references that JIRA (not one that
runs the test multiple times to mask flakiness); a rough sketch of what
such an annotation could look like is below this list. Those JIRAs can be
assigned to specific releases in the current cycle like we have been doing
for 4.0. This could be paired w/ making it explicit when in the release
cycle it's ok to merge w/ flaky tests (if they are documented).

- Surfacing CI results on JIRA when CI is triggered (manually or
automatically) would make it easier for reviewers and for anyone checking
the history at a later date (rough sketch below).

- Running CI automatically for contributions the ASF says it's ok to run
it for -- as David said, other projects seem to make this work, and it
doesn't seem to be an insurmountable problem since the list of signed ICLA
users is known & the GitHub API is powerful (rough sketch below).

- Automatically transitioning JIRAs to Patch Available when the PR method
is used to open a ticket (I don't know if this is possible; currently it
adds the pull-request-available label). A rough sketch of the JIRA call
this would need is below.
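
To make the annotation idea a bit more concrete, here is a rough sketch of
what a purely documentary marker could look like (Java; the annotation name,
its fields, and the JIRA number are all made up for illustration -- this is
not something that exists in the codebase today, and it deliberately has no
runtime behaviour like retries):

    // Hypothetical marker annotation; name and fields are made up for illustration.
    // It intentionally does nothing at runtime (no retries) -- it only documents the
    // flakiness and points at the JIRA tracking it, so it stays greppable.
    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    @Retention(RetentionPolicy.RUNTIME)
    @Target({ ElementType.METHOD, ElementType.TYPE })
    public @interface KnownFlaky
    {
        /** The JIRA tracking the flakiness, e.g. "CASSANDRA-XXXXX" (placeholder). */
        String jira();

        /** Optional short description of the observed failure mode. */
        String reason() default "";
    }

    // Example usage on a test method:
    //     @Test
    //     @KnownFlaky(jira = "CASSANDRA-XXXXX", reason = "occasionally times out on slow CI hosts")
    //     public void testSomethingOccasionallyFlaky() { ... }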
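
For surfacing CI results on JIRA, the glue could be as small as something
like the sketch below, assuming the standard JIRA REST comment endpoint on
issues.apache.org and basic auth; the class name, env vars, and arguments
are placeholders, and a real tool would want a JSON library and proper
error handling:

    // Rough sketch: post a CI summary as a comment on a JIRA ticket via
    // POST /rest/api/2/issue/{key}/comment. Not production code.
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class PostCiResultToJira
    {
        public static void main(String[] args) throws Exception
        {
            String issueKey = args[0];  // e.g. "CASSANDRA-12345" (placeholder)
            String summary  = args[1];  // e.g. "CI run: 2 failures, link to job"

            // Placeholder credentials; whatever auth we settle on goes here.
            String credentials = System.getenv("JIRA_USER") + ":" + System.getenv("JIRA_PASS");
            String auth = Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8));

            // Hand-rolled JSON with naive escaping; fine for a sketch only.
            String body = "{\"body\":\"" + summary.replace("\\", "\\\\").replace("\"", "\\\"") + "\"}";

            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://issues.apache.org/jira/rest/api/2/issue/" + issueKey + "/comment"))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

            HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("JIRA responded with HTTP " + response.statusCode());
        }
    }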
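
For auto-running CI on trusted contributions, the check itself is cheap
once we have a list of GitHub logins we trust (deriving that list from ICLA
records is the real work, which I'm hand-waving here). A sketch using only
the public GitHub REST endpoint for fetching a pull request; the class name
and the allow-list contents are placeholders:

    // Rough sketch: look up the author of a PR against apache/cassandra and check
    // whether they're on an allow list (GET /repos/{owner}/{repo}/pulls/{number}).
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.Set;

    public class ShouldAutoRunCi
    {
        // Placeholder logins -- in practice this would be generated from ASF records.
        private static final Set<String> TRUSTED_LOGINS = Set.of("example-committer", "example-contributor");

        public static void main(String[] args) throws Exception
        {
            String prNumber = args[0];

            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.github.com/repos/apache/cassandra/pulls/" + prNumber))
                .header("Accept", "application/vnd.github.v3+json")
                .GET()
                .build();

            String json = HttpClient.newHttpClient()
                                    .send(request, HttpResponse.BodyHandlers.ofString())
                                    .body();

            // Crude extraction of the PR author's login (the "user" object is the first one
            // carrying a "login" field in the PR payload); a real tool would parse JSON properly.
            String login = json.split("\"login\"\\s*:\\s*\"", 2)[1].split("\"", 2)[0];

            System.out.println(TRUSTED_LOGINS.contains(login)
                               ? login + " is on the allow list -> trigger CI"
                               : login + " is not on the allow list -> require manual approval");
        }
    }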
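
And for the Patch Available idea, if the ASF JIRA workflow exposes that
transition over the REST API, the automation (wherever it ends up living)
could do something like the sketch below. The transition id is
workflow-specific and would have to be discovered first with a GET against
the same /transitions endpoint; everything named here is a placeholder:

    // Rough sketch: move a ticket to "Patch Available" using the standard JIRA
    // workflow-transition endpoint (POST /rest/api/2/issue/{key}/transitions).
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class MarkPatchAvailable
    {
        public static void main(String[] args) throws Exception
        {
            String issueKey = args[0];      // e.g. "CASSANDRA-12345" (placeholder)
            String transitionId = args[1];  // placeholder; look it up via GET .../transitions

            // Placeholder credentials; whatever auth we settle on goes here.
            String credentials = System.getenv("JIRA_USER") + ":" + System.getenv("JIRA_PASS");
            String auth = Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8));

            String body = "{\"transition\":{\"id\":\"" + transitionId + "\"}}";

            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://issues.apache.org/jira/rest/api/2/issue/" + issueKey + "/transitions"))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

            HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("JIRA responded with HTTP " + response.statusCode());
        }
    }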

Jordan


On Fri, Jan 24, 2020 at 9:30 AM Joshua McKenzie <jmcken...@apache.org>
wrote:

> >
> > I also don't think it leads to the right behaviour or incentives.
>
> The gap between when a test is authored and the point at which it's
> determined to be flaky, as well as the difficulty with responsibility
> assignment (an "unrelated" change can in some cases make a previously stable
> test become flaky), makes this a real devil of a problem to fix. Hence its
> long and rich legacy. ;)
>
> While I agree with the general sentiment of "if we email the dev list with
> a failure, or we git blame a test and poke the author to fix it they'll do
> the right thing", we still end up in cases where people have rotated off
> the project and nobody feels a sense of ownership over a test failure for
> something someone else wrote, or a circumstance in which another change
> broke something, etc. At least from where I sit, I can't see a solution to
> this problem that doesn't involve some collective action for things not
> directly under one's purview.
>
> Also, fwiw in my experience, "soft" gatekeeping for things like this will
> just lead to the problem persisting into perpetuity. The problem strikes me
> as too complex and temporally / unpredictably distributed to be solvable by
> incentivizing the "right" behavior (proactive prevention of introduction of
> things like this, hygiene and rigor on authorship, etc), but I'm sure
> there's ways of approaching this that I'm not thinking of.
>
> But maybe I'm making a mountain out of a molehill. @bes - if you think that
> emailing the dev list when a failure is encountered on rotation would be
> sufficient to keep this problem under control with an obviously much
> lighter touch, I'm +1 for giving it a shot.
>
> On Fri, Jan 24, 2020 at 10:12 AM Benedict Elliott Smith <
> bened...@apache.org>
> wrote:
>
> > > due to oversight on a commit or a delta breaking some test the author
> > > thinks is unrelated to their diff but turns out to be a second-order
> > > consequence of their change that they didn't expect
> >
> > In my opinion/experience, this is all a direct consequence of lack of
> > trust in CI caused by flakiness.  We have finite time to dedicate to our
> > jobs, and figuring out whether or not a run is really clean for this patch
> > is genuinely costly when you cannot trust the result.  Those costs
> > multiply rapidly across the contributor base.
> >
> > That does not conflict with what you are saying.  I don't, however, think
> > it is reasonable to place the burden on the person trying to commit at
> > that moment, whether by positive sentiment or "computer says no".  I also
> > don't think it leads to the right behaviour or incentives.
> >
> > I further think there's been a degradation of community behaviour to some
> > extent caused by the bifurcation of CI infrastructure and approach.
> > Ideally we would all use a common platform, and there would be regular
> > trunk runs to compare against, like-for-like.
> >
> > IMO, we should email dev@ if there are failing runs for trunk, and there
> > should be a rotating role amongst the contributors to figure out who broke
> > it, and poke them to fix it (or to just fix it, if easy).
> >
> >
> > On 24/01/2020, 14:57, "Joshua McKenzie" <jmcken...@apache.org> wrote:
> >
> >     >
> >     > gating PRs on clean runs won’t achieve anything other than dealing with
> >     > folks who straight up ignore the spirit of the policy and knowingly
> >     > commit code with test breakage
> >
> >     I think there's some nuance here. We have a lot of suites (novnode, cdc,
> >     etc etc) where failures show up because people didn't run those tests or
> >     didn't think to check them when they did. Likewise, I'd posit we have a
> >     non-trivial number of failures (>= 15%? making up a number here) that are
> >     due to oversight on a commit or a delta breaking some test the author
> >     thinks is unrelated to their diff but turns out to be a second-order
> >     consequence of their change that they didn't expect. I'm certainly not
> >     claiming we have bad actors here merging known test failures because they
> >     don't care.
> >
> >     This seems to me like an issue of collective ownership, and whether or not
> >     we're willing to take it as a project. If I have a patch, I run CI, and a
> >     test fails that's unrelated to my diff (or I think is, or conclude it's
> >     unrelated after inspection, whatever), that's the crucial moment where we
> >     can either say "Welp, not my problem. Merge time.", or say "hey, we all
> >     live in this neighborhood together and while this trash on the ground
> >     isn't actually mine, it's my neighborhood so if I clean this up it'll
> >     benefit me and all the rest of us."
> >
> >     Depending on how much time and energy fixing a flake like that may take,
> >     this may prove to be economically unsustainable for some/many participants
> >     on the project. A lot of us are paid to work on C* by organizations with
> >     specific priorities for the project that are not directly related to "has
> >     green test board". But I do feel comfortable making the case that there's
> >     a world in which "don't merge if any tests fail, clean up whatever
> >     failures you run into" *could* be a sustainable model assuming everyone in
> >     the ecosystem was willing and able to engage in that collectively
> >     benefiting behavior.
> >
> >     Does the above make sense?
> >
> >     On Fri, Jan 24, 2020 at 7:39 AM Aleksey Yeshchenko
> >     <alek...@apple.com.invalid> wrote:
> >
> >     > As for GH for code review, I find that it works very well for nits. It’s
> >     > also great for doc changes, given how GH allows you to suggest changes to
> >     > files in-place and automatically create PRs for those changes. That
> >     > lowers the barrier for those tiny contributions.
> >     >
> >     > For anything relatively substantial, I vastly prefer to summarise my
> >     > feedback (and see others’ feedback summarised) in JIRA comments - an
> >     > opinion I and other contributors have shared in one or two similar
> >     > threads over the years.
> >     >
> >     >
> >     > > On 24 Jan 2020, at 12:21, Aleksey Yeshchenko
> >     > > <alek...@apple.com.INVALID> wrote:
> >     > >
> >     > > The person introducing flakiness to a test will almost always have run
> >     > > it locally and on CI first with success. It’s usually later when they
> >     > > first start failing, and it’s often tricky to attribute to a particular
> >     > > commit/person.
> >     > >
> >     > > So long as we have these - and we’ve had flaky tests for as long as C*
> >     > > has existed - the problem will persist, and gating PRs on clean runs
> >     > > won’t achieve anything other than dealing with folks who straight up
> >     > > ignore the spirit of the policy and knowingly commit code with test
> >     > > breakage that can be attributed to their change. I’m not aware of such
> >     > > committers in this community, however.
> >     > >
> >     > >> On 24 Jan 2020, at 09:01, Benedict Elliott Smith <bened...@apache.org>
> >     > >> wrote:
> >     > >>
> >     > >>> I find it only useful for nits, or for coaching-level comments that I
> >     > >>> would never want propagated to Jira.
> >     > >>
> >     > >> Actually, I'll go one step further. GitHub encourages comments that are
> >     > >> too trivial, poisoning the well for third parties trying to find useful
> >     > >> information.  If the comment wouldn't be made in Jira, it probably
> >     > >> shouldn't be made.
> >     > >>
> >     > >>
> >     > >>
> >     > >> On 24/01/2020, 08:56, "Benedict Elliott Smith" <bened...@apache.org>
> >     > >> wrote:
> >     > >>
> >     > >>   The common factor is flaky tests, not people.  You get a clean run,
> >     > >>   you commit.  Turns out, a test was flaky.  This reduces trust in CI,
> >     > >>   so people commit without looking as closely at results.  Gating on
> >     > >>   clean tests doesn't help, as you run until you're clean.  Rinse and
> >     > >>   repeat.  Breakages accumulate.
> >     > >>
> >     > >>   This is what happens leading up to every release - nobody commits
> >     > >>   knowing there's a breakage.  We have a problem with bad tests, not
> >     > >>   bad people or process.
> >     > >>
> >     > >>   FWIW, I no longer like the GitHub workflow.  I find it only useful
> >     > >>   for nits, or for coaching-level comments that I would never want
> >     > >>   propagated to Jira.  I find a strong patch submission of any size is
> >     > >>   better managed with human-curated Jira comments, and provides a
> >     > >>   better permanent record.  When skimming a discussion, Jira is more
> >     > >>   informative than GitHub.  Even with the GitHub UX, the context
> >     > >>   hinders rather than helps.
> >     > >>
> >     > >>   As to propagating to Jira: has anyone here ever read them?  I haven't
> >     > >>   as they're impenetrable; ugly and almost entirely noise.  If
> >     > >>   anything, I would prefer that we discourage GitHub for review as a
> >     > >>   project, not move towards it.
> >     > >>
> >     > >>   This is without getting into the problem of multiple branch PRs.
> >     > >>   Until this is _provably_ painless, we cannot introduce a workflow
> >     > >>   that requires it and blocks commit on it.  Working with multiple
> >     > >>   branches is difficult enough already, surely?
> >     > >>
> >     > >>
> >     > >>
> >     > >>   On 24/01/2020, 03:16, "Jeff Jirsa" <jji...@gmail.com> wrote:
> >     > >>
> >     > >>       100% agree
> >     > >>
> >     > >>       François and team wrote a doc on testing and gating commits
> >     > >>       Blake wrote a doc on testing and gating commits
> >     > >>       Every release there’s a thread on testing and gating commits
> >     > >>
> >     > >>       People are the common factor every time. Nobody wants to avoid
> >     > >>       merging their patch because someone broke a test elsewhere.
> >     > >>
> >     > >>       We can’t block merging technically with the repo as it exists
> >     > >>       now so it’s always going to come down to people and peer
> >     > >>       pressure, until or unless someone starts reverting commits that
> >     > >>       break tests
> >     > >>
> >     > >>       (Of course, someone could write a tool that automatically
> >     > >>       reverts new commits as long as tests fail....)
> >     > >>
> >     > >>       On Jan 23, 2020, at 5:54 PM, Joshua McKenzie <jmcken...@apache.org> wrote:
> >     > >>>
> >     > >>>
> >     > >>>> I am reacting to what I currently see happening in the project;
> >     > >>>> tests fail as the norm and this is kinda seen as expected, even
> >     > >>>> though it goes against the policies as I understand it.
> >     > >>>
> >     > >>> After over half a decade seeing us all continue to struggle with this
> >     > >>> problem, I've come around to the school of "apply pain" (I mean that
> >     > >>> as light-hearted as you can take it) when there's a failure to incent
> >     > >>> fixing; specifically in this case, the only idea I can think of is
> >     > >>> preventing merge w/any failing tests on a PR. We go through this
> >     > >>> cycle as we approach each major release: we have the gatekeeper of
> >     > >>> "we're not going to cut a release with failing tests obviously", and
> >     > >>> we clean them up. After the release, the pressure is off, we exhale,
> >     > >>> relax, and flaky test failures (and others) start to creep back in.
> >     > >>>
> >     > >>> If the status quo is the world we want to live in, that's totally
> >     > >>> fine and no judgement intended - we can build tooling around test
> >     > >>> failure history and known flaky tests etc to optimize engineer
> >     > >>> workflows around that expectation. But what I keep seeing on threads
> >     > >>> like this (and have always heard brought up in conversation) is that
> >     > >>> our collective *moral* stance is that we should have green test
> >     > >>> boards and not merge code if it introduces failing tests.
> >     > >>>
> >     > >>> Not looking to prescribe or recommend anything, just hoping that
> >     > >>> observation above might be of interest or value to the conversation.
> >     > >>>
> >     > >>>> On Thu, Jan 23, 2020 at 4:17 PM Michael Shuler
> >     > >>>> <mich...@pbandjelly.org> wrote:
> >     > >>>>
> >     > >>>>> On 1/23/20 3:53 PM, David Capwell wrote:
> >     > >>>>>
> >     > >>>>> 2) Nightly build email to dev@?
> >     > >>>>
> >     > >>>> Nope. builds@c.a.o is where these go.
> >     > >>>>
> >     > >>>> https://lists.apache.org/list.html?bui...@cassandra.apache.org
> >     > >>>>
> >     > >>>> Michael
> >     > >>>>