Hi,

I'm ecstatic others are now running the tests and, more importantly, that
we're having the conversation.

I've become convinced we cannot always have 100% green tests. I am reminded
of this [1] blog post from Google when thinking about flaky tests.
The TL;DR is "flakiness happens", to the tune of about 1.5% of all tests
across Google.

I am in no way advocating that we simply turn a blind eye to broken or
flaky tests, or shrug our shoulders
and rubber stamp a vote, but instead to accept it when it reasonably
applies.
To achieve this, we might need to have discussion at vote/release time (if
not sooner) to triage flaky tests, but I see that as a good thing.

Thanks,

-Jason

[1]
https://testing.googleblog.com/2016/05/flaky-tests-at-google-and-how-we.html



On Fri, Feb 16, 2018 at 12:47 AM, Dinesh Joshi <
dinesh.jo...@yahoo.com.invalid> wrote:

> I'm new to this project and here are my two cents.
> If there are tests that are constantly failing or flaky and you have had
> releases despite their failures, then they're not useful and can be
> disabled. They can always be reenabled if they are in fact valuable. Having
> 100% blue dashboard is not idealistic IMHO. Hardware failures are harder
> but they can be addressed too.
> I could pitch in to fix the noisy tests or just help in other ways to get
> the dashboard to blue.
> Dinesh
> On Thursday, February 15, 2018, 1:14:33 PM PST, Josh McKenzie <
> jmcken...@apache.org> wrote:
>  >
> > We’ve said in the past that we don’t release without green tests. The PMC
> > gets to vote and enforce it. If you don’t vote yes without seeing the
> test
> > results, that enforces it.
>
> I think this is noble and ideal in theory. In practice, the tests take long
> enough, hardware infra has proven flaky enough, and the tests *themselves*
> flaky enough, that there's been a consistent low-level of test failure
> noise that makes separating signal from noise in this context very time
> consuming. Reference 3.11-test-all for example re:noise:
> https://builds.apache.org/view/A-D/view/Cassandra/job/
> Cassandra-3.11-test-all/test/?width=1024&height=768
>
> Having spearheaded burning test failures to 0 multiple times and have them
> regress over time, my gut intuition is we should have one person as our
> Source of Truth with a less-flaky source for release-vetting CI (dedicated
> hardware, circle account, etc) we can use as a reference to vote on release
> SHA's.
>
> We’ve declared this a requirement multiple times
>
> Declaring things != changed behavior, and thus != changed culture. The
> culture on this project is one of having a constant low level of test
> failure noise in our CI as a product of our working processes. Unless we
> change those (actually block release w/out green board, actually
> aggressively block merge w/any failing tests, aggressively retroactively
> track down test failures on a daily basis and RCA), the situation won't
> improve. Given that this is a volunteer organization / project, that kind
> of daily time investment is a big ask.
>
> On Thu, Feb 15, 2018 at 1:10 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>
> > Moving this to it’s own thread:
> >
> > We’ve declared this a requirement multiple times and then we occasionally
> > get a critical issue and have to decide whether it’s worth the delay. I
> > assume Jason’s earlier -1 on attempt 1 was an enforcement of that earlier
> > stated goal.
> >
> > It’s up to the PMC. We’ve said in the past that we don’t release without
> > green tests. The PMC gets to vote and enforce it. If you don’t vote yes
> > without seeing the test results, that enforces it.
> >
> > --
> > Jeff Jirsa
> >
> >
> > > On Feb 15, 2018, at 9:49 AM, Josh McKenzie <jmcken...@apache.org>
> wrote:
> > >
> > > What would it take for us to get green utest/dtests as a blocking part
> of
> > > the release process? i.e. "for any given SHA, here's a link to the
> tests
> > > that passed" in the release vote email?
> > >
> > > That being said, +1.
> > >
> > >> On Wed, Feb 14, 2018 at 4:33 PM, Nate McCall <zznat...@gmail.com>
> > wrote:
> > >>
> > >> +1
> > >>
> > >> On Thu, Feb 15, 2018 at 9:40 AM, Michael Shuler <
> mich...@pbandjelly.org
> > >
> > >> wrote:
> > >>> I propose the following artifacts for release as 3.0.16.
> > >>>
> > >>> sha1: 890f319142ddd3cf2692ff45ff28e71001365e96
> > >>> Git:
> > >>> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> > >> shortlog;h=refs/tags/3.0.16-tentative
> > >>> Artifacts:
> > >>> https://repository.apache.org/content/repositories/
> > >> orgapachecassandra-1157/org/apache/cassandra/apache-cassandra/3.0.16/
> > >>> Staging repository:
> > >>> https://repository.apache.org/content/repositories/
> > >> orgapachecassandra-1157/
> > >>>
> > >>> Debian and RPM packages are available here:
> > >>> http://people.apache.org/~mshuler
> > >>>
> > >>> *** This release addresses an important fix for CASSANDRA-14092 ***
> > >>>    "Max ttl of 20 years will overflow localDeletionTime"
> > >>>    https://issues.apache.org/jira/browse/CASSANDRA-14092
> > >>>
> > >>> The vote will be open for 72 hours (longer if needed).
> > >>>
> > >>> [1]: (CHANGES.txt) https://goo.gl/rLj59Z
> > >>> [2]: (NEWS.txt) https://goo.gl/EkrT4G
> > >>>
> > >>> ------------------------------------------------------------
> ---------
> > >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >>>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >>
> > >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>

Reply via email to