Hi all,

The unit tests are looking pretty reliable right now. There is a long
tail of infrequently failing tests but it's not bad and almost all
builds succeed in the current build environment. In CircleCI it seems
like unit tests might be a little less reliable, but still usable.

The dtests, on the other hand, aren't producing clean builds yet. There
is also a pretty diverse set of failing tests.

I did a bit of triaging of the flakey dtests. I started by cataloging
everything, but what I found is that the long tail of flakey dtests is
very long indeed, so I narrowed focus to just the most frequently failing
tests for now. See https://goo.gl/b96CdO

I created a spreadsheet with some of the failing tests. It includes links
to JIRA, the last time the test was seen failing, and how many failures I
found in Apache Jenkins across the 3 dtest builds. There are a lot of
failures not listed. There would be 50+ entries if I cataloged each one.

There are two hard failing tests, but both are already moving along:

CASSANDRA-13229 (Ready to commit, assigned Alex Petrov, Paulo Motta
reviewing, last updated April 2017): dtest failure in
topology_test.TestTopology.size_estimates_multidc_test

CASSANDRA-13113 (Ready to commit, assigned Alex Petrov, Sam T
reviewing, last updated March 2017): test failure in
auth_test.TestAuth.system_auth_ks_is_alterable_test

I think the tests we should tackle first are on this sheet in priority
order: https://goo.gl/S3khv1

Suite:            bootstrap_test
Test:             TestBootstrap.simultaneous_bootstrap_test
JIRA:             https://issues.apache.org/jira/browse/CASSANDRA-13506
Last failure:     5/5/2017
Counted failures: 45
Status:           Open

Suite:            repair_test
Test:             incremental_repair_test.TestIncRepair.compaction_test
JIRA:             https://issues.apache.org/jira/browse/CASSANDRA-13194
Last failure:     5/4/2017
Counted failures: 44
Status:           Open

Suite:            sstableutil_test
Test:             SSTableUtilTest.compaction_test
JIRA:             https://issues.apache.org/jira/browse/CASSANDRA-13182
Last failure:     5/4/2017
Counted failures: 35
Status:           Open

Suite:            paging_test
Test:             TestPagingWithDeletions.test_ttl_deletions
JIRA:             https://issues.apache.org/jira/browse/CASSANDRA-13507
Last failure:     4/25/2017
Counted failures: 31
Status:           Open

Suite:            repair_test
Test:             incremental_repair_test.TestIncRepair.multiple_repair_test
JIRA:             https://issues.apache.org/jira/browse/CASSANDRA-13515
Last failure:     5/4/2017
Counted failures: 18
Status:           Open

Suite:            cqlsh_tests
Test:             cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_*
JIRA:             https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22%2C%20%22Ready%20to%20Commit%22%2C%20%22Awaiting%20Feedback%22)%20AND%20text%20~%20%22CqlshCopyTest%22
Last failure:     5/8/2017
Counted failures: 23

Suite:            paxos_tests
Test:             TestPaxos.contention_test_many_threads
JIRA:             https://issues.apache.org/jira/browse/CASSANDRA-13517
Last failure:     5/8/2017
Counted failures: 15
Status:           Open

Suite:            repair_test
Test:             TestRepair
JIRA:             https://issues.apache.org/jira/issues/?jql=status%20%3D%20Open%20AND%20text%20~%20%22dtest%20failure%20repair_test%22
Last failure:     5/4/2017
Comments:         No one test fails a lot but the number of failing
                  tests is substantial

Suite:            cqlsh_tests
Test:             cqlsh_tests.CqlshSmokeTest.[test_insert | test_truncate |
                  test_use_keyspace | test_create_keyspace]
Last failure:     4/22/2017
Counted failures: 6

If you have spare cycles you can make a huge difference in test
stability by picking off one of these.

Regards,
Ariel

