[
https://issues.apache.org/jira/browse/CASSANDRA-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sam Lightfoot updated CASSANDRA-21185:
--------------------------------------
Description:
Tests often fail because no seed node is up while non-seeds are trying to join
the ring. A likely fix is to start the seed node first, ensure CMS
initialization is complete, and then allow the other nodes in CCM to start in
parallel.
Affects 5.1+ because <=5.0 uses [sequential
startup|https://github.com/apache/cassandra-dtest/blob/trunk/bootstrap_test.py#L254C30-L254C32].
On further analysis it appears two separate clusters form due to the seed node
not accepting messages during CMS initialization. Attached logs show the
independent clusters resulting from
_bootstrap_test.py::TestBootstrap::test_read_from_bootstrapped_node._
was:
Tests often fail because no seed node is up while non-seeds are trying to join
the ring. A likely fix is to start the seed node first, ensure CMS
initialization is complete, and then allow the other nodes in CCM to start in
parallel.
Affects 5.1+ because <=5.0 uses [sequential
startup|https://github.com/apache/cassandra-dtest/blob/trunk/bootstrap_test.py#L254C30-L254C32].
On further analysis it appears split-brain is occurring leading to two separate
clusters being formed due to the seed node not accepting messages during CMS
initialization. Attached logs show the independent clusters resulting from
_bootstrap_test.py::TestBootstrap::test_read_from_bootstrapped_node_
> Fix flaky DTest: bootstrap_test_*
> ---------------------------------
>
> Key: CASSANDRA-21185
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21185
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: Sam Lightfoot
> Assignee: Sam Lightfoot
> Priority: Normal
> Fix For: 5.1
>
> Attachments: split_brain_logs.txt
>
>
> Tests often fail because no seed node is up while non-seeds are trying to
> join the ring. A likely fix is to start the seed node first, ensure CMS
> initialization is complete, and then allow the other nodes in CCM to start
> in parallel.
> Affects 5.1+ because <=5.0 uses [sequential
> startup|https://github.com/apache/cassandra-dtest/blob/trunk/bootstrap_test.py#L254C30-L254C32].
> On further analysis it appears two separate clusters form due to the seed
> node not accepting messages during CMS initialization. Attached logs show the
> independent clusters resulting from
> _bootstrap_test.py::TestBootstrap::test_read_from_bootstrapped_node._
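
The startup ordering proposed in the description can be sketched as follows.
This is only an illustration of the idea, not the actual patch: FakeNode,
start_cluster and the startup_delay parameter are hypothetical stand-ins and
are not part of ccmlib or cassandra-dtest. How "CMS initialization is
complete" would be detected on a real node is left open here; the sketch
simply treats a seed as ready once its start call returns.

```python
import threading
import time


class FakeNode:
    """Hypothetical stand-in for a CCM node (not the ccmlib API)."""

    def __init__(self, name, is_seed=False, startup_delay=0.01):
        self.name = name
        self.is_seed = is_seed
        self.startup_delay = startup_delay
        self.ready = threading.Event()

    def start(self, log, lock):
        # Simulate startup work; a real seed would initialize the CMS here.
        time.sleep(self.startup_delay)
        with lock:
            log.append(self.name)
        self.ready.set()


def start_cluster(nodes, timeout=5.0):
    """Start seeds first and wait for them, then start the rest in parallel.

    Returns the order in which nodes finished starting.
    """
    log = []
    lock = threading.Lock()
    seeds = [n for n in nodes if n.is_seed]
    others = [n for n in nodes if not n.is_seed]

    # Phase 1: bring up each seed and block until it reports ready.
    for seed in seeds:
        seed.start(log, lock)
        if not seed.ready.wait(timeout):
            raise TimeoutError(f"seed {seed.name} did not become ready")

    # Phase 2: only now let the non-seed nodes start concurrently.
    threads = [
        threading.Thread(target=n.start, args=(log, lock)) for n in others
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    return log


if __name__ == "__main__":
    cluster = [
        FakeNode("node1", is_seed=True, startup_delay=0.05),
        FakeNode("node2"),
        FakeNode("node3"),
    ]
    print(start_cluster(cluster))
```

Under this ordering the seed always finishes starting before any non-seed
begins, even when the seed is the slowest node, so non-seeds never try to
join while no seed is accepting messages. The relative order of the
non-seeds is not fixed.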