Hi,

The dev list murdered my rich text formatted email. Here it is
reformatted as plain text.

The unit tests are looking pretty reliable right now. There is a long
tail of infrequently failing tests but it's not bad and almost all
builds succeed in the current build environment. In CircleCI it seems
like unit tests might be a little less reliable, but still usable.

The dtests, on the other hand, aren't producing clean builds yet. There
is also a pretty diverse set of failing tests.

I did a bit of triaging of the flaky dtests. I started by cataloging
everything, but what I found is that the long tail of flaky dtests is
very long indeed, so I narrowed focus to just the most frequently
failing tests for now. See https://goo.gl/b96CdO

I created a spreadsheet with some of the failing tests. It includes
links to JIRA, the last time each test was seen failing, and how many
failures I found in Apache Jenkins across the 3 dtest builds. There are
a lot of failures not listed. There would be 50+ entries if I cataloged
each one.

There are two hard failing tests, but both are already moving along:

CASSANDRA-13229 (Ready to commit, assigned Alex Petrov, Paulo Motta
reviewing, last updated April 2017): dtest failure in
topology_test.TestTopology.size_estimates_multidc_test

CASSANDRA-13113 (Ready to commit, assigned Alex Petrov, Sam T
reviewing, last updated March 2017): test failure in
auth_test.TestAuth.system_auth_ks_is_alterable_test

I think the tests we should tackle first are on this sheet, in priority
order: https://goo.gl/S3khv1

Suite: bootstrap_test
Test: TestBootstrap.simultaneous_bootstrap_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13506
Last failure: 5/5/2017
Counted failures: 45

Suite: repair_test
Test: incremental_repair_test.TestIncRepair.compaction_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13194
Last failure: 5/4/2017
Counted failures: 44

Suite: sstableutil_test
Test: SSTableUtilTest.compaction_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13182
Last failure: 5/4/2017
Counted failures: 35

Suite: paging_test
Test: TestPagingWithDeletions.test_ttl_deletions
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13507
Last failure: 4/25/2017
Counted failures: 31

Suite: repair_test
Test: incremental_repair_test.TestIncRepair.multiple_repair_test
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13515
Last failure: 5/4/2017
Counted failures: 18

Suite: cqlsh_tests
Test: cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_*
JIRA:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22%2C%20%22Ready%20to%20Commit%22%2C%20%22Awaiting%20Feedback%22)%20AND%20text%20~%20%22CqlshCopyTest%22
Last failure: 5/8/2017
Counted failures: 23

Suite: paxos_tests
Test: TestPaxos.contention_test_many_threads
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13517
Last failure: 5/8/2017
Counted failures: 15

Suite: repair_test
Test: TestRepair
JIRA:
https://issues.apache.org/jira/issues/?jql=status%20%3D%20Open%20AND%20text%20~%20%22dtest%20failure%20repair_test%22
Last failure: 5/4/2017
Comment: No single test fails a lot, but the number of failing tests is
substantial

Suite: cqlsh_tests
Test: cqlsh_tests.CqlshSmokeTest.[test_insert | test_truncate |
test_use_keyspace | test_create_keyspace]
JIRA: No JIRA yet
Last failure: 4/22/2017
Counted failures: 6

If you have spare cycles you can make a huge difference in test
stability by picking off one of these.

Regards,
Ariel

