Hi, I'm ecstatic others are now running the tests and, more importantly, that we're having the conversation.
I've become convinced we cannot always have 100% green tests. I am reminded of this [1] blog post from Google when thinking about flaky tests. The TL;DR is "flakiness happens", to the tune of about 1.5% of all tests across Google.

I am in no way advocating that we simply turn a blind eye to broken or flaky tests, or shrug our shoulders and rubber stamp a vote, but rather that we accept flakiness when it reasonably applies. To achieve this, we might need to have a discussion at vote/release time (if not sooner) to triage flaky tests, but I see that as a good thing.

Thanks,
-Jason

[1] https://testing.googleblog.com/2016/05/flaky-tests-at-google-and-how-we.html

On Fri, Feb 16, 2018 at 12:47 AM, Dinesh Joshi <dinesh.jo...@yahoo.com.invalid> wrote:

> I'm new to this project and here are my two cents.
> If there are tests that are constantly failing or flaky and you have had
> releases despite their failures, then they're not useful and can be
> disabled. They can always be reenabled if they are in fact valuable. Having
> 100% blue dashboard is not idealistic IMHO. Hardware failures are harder
> but they can be addressed too.
> I could pitch in to fix the noisy tests or just help in other ways to get
> the dashboard to blue.
> Dinesh
> On Thursday, February 15, 2018, 1:14:33 PM PST, Josh McKenzie <jmcken...@apache.org> wrote:
>
> > We’ve said in the past that we don’t release without green tests. The PMC
> > gets to vote and enforce it. If you don’t vote yes without seeing the test
> > results, that enforces it.
>
> I think this is noble and ideal in theory. In practice, the tests take long
> enough, hardware infra has proven flaky enough, and the tests *themselves*
> flaky enough, that there's been a consistent low-level of test failure
> noise that makes separating signal from noise in this context very time
> consuming.
> Reference 3.11-test-all for example re:noise:
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-3.11-test-all/test/?width=1024&height=768
>
> Having spearheaded burning test failures to 0 multiple times and had them
> regress over time, my gut intuition is we should have one person as our
> Source of Truth with a less-flaky source for release-vetting CI (dedicated
> hardware, circle account, etc) we can use as a reference to vote on release
> SHA's.
>
> > We’ve declared this a requirement multiple times
>
> Declaring things != changed behavior, and thus != changed culture. The
> culture on this project is one of having a constant low level of test
> failure noise in our CI as a product of our working processes. Unless we
> change those (actually block release w/out green board, actually
> aggressively block merge w/any failing tests, aggressively retroactively
> track down test failures on a daily basis and RCA), the situation won't
> improve. Given that this is a volunteer organization / project, that kind
> of daily time investment is a big ask.
>
> On Thu, Feb 15, 2018 at 1:10 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>
> > Moving this to its own thread:
> >
> > We’ve declared this a requirement multiple times and then we occasionally
> > get a critical issue and have to decide whether it’s worth the delay. I
> > assume Jason’s earlier -1 on attempt 1 was an enforcement of that earlier
> > stated goal.
> >
> > It’s up to the PMC. We’ve said in the past that we don’t release without
> > green tests. The PMC gets to vote and enforce it. If you don’t vote yes
> > without seeing the test results, that enforces it.
> >
> > --
> > Jeff Jirsa
> >
> >
> > > On Feb 15, 2018, at 9:49 AM, Josh McKenzie <jmcken...@apache.org> wrote:
> > >
> > > What would it take for us to get green utest/dtests as a blocking part of
> > > the release process? i.e. "for any given SHA, here's a link to the tests
> > > that passed" in the release vote email?
> > >
> > > That being said, +1.
> > >
> > >> On Wed, Feb 14, 2018 at 4:33 PM, Nate McCall <zznat...@gmail.com> wrote:
> > >>
> > >> +1
> > >>
> > >> On Thu, Feb 15, 2018 at 9:40 AM, Michael Shuler <mich...@pbandjelly.org> wrote:
> > >>> I propose the following artifacts for release as 3.0.16.
> > >>>
> > >>> sha1: 890f319142ddd3cf2692ff45ff28e71001365e96
> > >>> Git:
> > >>> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.0.16-tentative
> > >>> Artifacts:
> > >>> https://repository.apache.org/content/repositories/orgapachecassandra-1157/org/apache/cassandra/apache-cassandra/3.0.16/
> > >>> Staging repository:
> > >>> https://repository.apache.org/content/repositories/orgapachecassandra-1157/
> > >>>
> > >>> Debian and RPM packages are available here:
> > >>> http://people.apache.org/~mshuler
> > >>>
> > >>> *** This release addresses an important fix for CASSANDRA-14092 ***
> > >>> "Max ttl of 20 years will overflow localDeletionTime"
> > >>> https://issues.apache.org/jira/browse/CASSANDRA-14092
> > >>>
> > >>> The vote will be open for 72 hours (longer if needed).
> > >>>
> > >>> [1]: (CHANGES.txt) https://goo.gl/rLj59Z
> > >>> [2]: (NEWS.txt) https://goo.gl/EkrT4G
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >>> For additional commands, e-mail: dev-h...@cassandra.apache.org