Hi,

The dev list murdered my rich text formatted email. Here it is reformatted as plain text.

The unit tests are looking pretty reliable right now. There is a long tail of infrequently failing tests, but it's not bad, and almost all builds succeed in the current build environment. In CircleCI it seems like unit tests might be a little less reliable, but still usable.

The dtests, on the other hand, aren't producing clean builds yet. There is also a pretty diverse set of failing tests.

I did a bit of triaging of the flaky dtests. I started by cataloging everything, but what I found is that the long tail of flaky dtests is very long indeed, so I narrowed focus to just the most frequently failing tests for now. See https://goo.gl/b96CdO

I created a spreadsheet with some of the failing tests. It includes links to JIRA, the last time the test was seen failing, and how many failures I found in Apache Jenkins across the 3 dtest builds. There are a lot of failures not listed. There would be 50+ entries if I cataloged each one.

There are two hard failing tests, but both are already moving along:

CASSANDRA-13229 (Ready to commit, assigned Alex Petrov, Paulo Motta reviewing, last updated April 2017)
dtest failure in topology_test.TestTopology.size_estimates_multidc_test

CASSANDRA-13113 (Ready to commit, assigned Alex Petrov, Sam T reviewing, last updated March 2017)
test failure in auth_test.TestAuth.system_auth_ks_is_alterable_test

I think the tests we should tackle first are on this sheet, in priority order: https://goo.gl/S3khv1

Suite: bootstrap_test
Test: TestBootstrap.simultaneous_bootstrap_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13506
Last failure: 5/5/2017
Counted failures: 45

Suite: repair_test
Test: incremental_repair_test.TestIncRepair.compaction_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13194
Last failure: 5/4/2017
Counted failures: 44

Suite: sstableutil_test
Test: SSTableUtilTest.compaction_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13182
Last failure: 5/4/2017
Counted failures: 35

Suite: paging_test
Test: TestPagingWithDeletions.test_ttl_deletions
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13507
Last failure: 4/25/2017
Counted failures: 31

Suite: repair_test
Test: incremental_repair_test.TestIncRepair.multiple_repair_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13515
Last failure: 5/4/2017
Counted failures: 18

Suite: cqlsh_tests
Test: cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_*
JIRA: https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22%2C%20%22Ready%20to%20Commit%22%2C%20%22Awaiting%20Feedback%22)%20AND%20text%20~%20%22CqlshCopyTest%22
Last failure: 5/8/2017
Counted failures: 23

Suite: paxos_tests
Test: TestPaxos.contention_test_many_threads
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13517
Last failure: 5/8/2017
Counted failures: 15

Suite: repair_test
Test: TestRepair
JIRA: https://issues.apache.org/jira/issues/?jql=status%20%3D%20Open%20AND%20text%20~%20%22dtest%20failure%20repair_test%22
Last failure: 5/4/2017
Comment: No single test fails a lot, but the number of failing tests is substantial.

Suite: cqlsh_tests
Test: cqlsh_tests.CqlshSmokeTest.[test_insert | test_truncate | test_use_keyspace | test_create_keyspace]
JIRA: No JIRA yet
Last failure: 4/22/2017
Counted failures: 6

If you have spare cycles, you can make a huge difference in test stability by picking off one of these.

Regards,
Ariel