[ https://issues.apache.org/jira/browse/CASSANDRA-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18060493#comment-18060493 ]

Sam Lightfoot edited comment on CASSANDRA-21185 at 2/23/26 11:14 PM:
---------------------------------------------------------------------

{code:java}
# node1 (127.0.0.1): FIRST_CMS
Startup.java:109 - Initializing as first CMS node in a new cluster
PipelineConfigurator.java:165 - Starting listening for CQL clients on /127.0.0.1:9042  (20:30:22)
GossipDigestSynVerbHandler.java:70 - Cluster metadata identifier mismatch from /127.0.0.2:7000 2130713434!=2130713433

# node2 (127.0.0.2): VOTE, becomes accidental CMS
Startup.java:120 - Initializing for discovery
Startup.java:221 - Got candidates: DiscoveredNodes{nodes=[/127.0.0.2, /127.0.0.3], kind=KNOWN_PEERS}
Election.java:111 - No previous migration detected, initiating
AbstractLocalProcessor.java:104 - Committed Initialize. New epoch is Epoch{epoch=2}

# node3 (127.0.0.3): VOTE, follows node2
Startup.java:120 - Initializing for discovery
LocalLog.java:526 - Enacted PreInitialize. New tail is Epoch{epoch=1}
{code}
cc [~samt].

Does it make sense to run initMessaging before initializeAsFirstCMSNode to 
prevent split-brain when nodes start simultaneously? As it stands, the 
FIRST_CMS path does not start messaging until CMS initialization has finished, 
so the VOTE nodes cannot discover the seed during that window and elect their 
own CMS instead (see the [commit 
sample|https://github.com/apache/cassandra/compare/trunk...samueldlightfoot:cassandra:fix-tcm-split-brain]).
{code:java}
case FIRST_CMS:
    logger.info("Initializing as first CMS node in a new cluster");
    initializeAsNonCmsNode(wrapProcessor);
    initMessaging.run(); // moved above initializeAsFirstCMSNode()
    initializeAsFirstCMSNode();
    break;
{code}
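
To make the ordering argument concrete, here is a minimal, deterministic toy model of the race. It is only a sketch, not Cassandra code: the class StartupOrderSketch, the Seed and Outcome types, and the discover/simulate methods are all hypothetical names, and the model assumes a VOTE node can see the seed exactly when the seed's messaging is up at the moment discovery runs.

```java
public class StartupOrderSketch {
    /** Hypothetical stand-in for the seed (FIRST_CMS) node's startup state. */
    static final class Seed {
        boolean messagingUp = false;
        boolean cmsInitialized = false;
    }

    enum Outcome { JOINED_SEED_CMS, ELECTED_OWN_CMS }

    /** Assumption: a VOTE node finds the seed only if the seed's messaging is already up. */
    static Outcome discover(Seed seed) {
        return seed.messagingUp ? Outcome.JOINED_SEED_CMS : Outcome.ELECTED_OWN_CMS;
    }

    /**
     * Runs the seed's two startup steps in the requested order. The VOTE node's
     * discovery fires between the two steps, which models nodes starting simultaneously.
     */
    static Outcome simulate(boolean messagingFirst) {
        Seed seed = new Seed();
        Runnable initMessaging = () -> seed.messagingUp = true;
        Runnable initCms = () -> seed.cmsInitialized = true;

        Runnable first = messagingFirst ? initMessaging : initCms;
        Runnable second = messagingFirst ? initCms : initMessaging;

        first.run();
        Outcome outcome = discover(seed); // VOTE node looks for peers mid-startup
        second.run();
        return outcome;
    }

    public static void main(String[] args) {
        System.out.println("CMS init first:  " + simulate(false)); // ELECTED_OWN_CMS
        System.out.println("messaging first: " + simulate(true));  // JOINED_SEED_CMS
    }
}
```

The model deliberately ignores what the seed would answer while messaging is up but CMS initialization is still in progress; that part would need checking against the real discovery and election code.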


> Fix flaky DTest: bootstrap_test_* 
> ----------------------------------
>
>                 Key: CASSANDRA-21185
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21185
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Sam Lightfoot
>            Assignee: Sam Lightfoot
>            Priority: Normal
>             Fix For: 5.1
>
>
> Tests often fail because the seed node is not yet up while non-seed nodes are 
> trying to join the ring. Likely fix: start the seed node first so that CMS 
> initialization is complete, then allow the other nodes in CCM to start in parallel.
> Affects 5.1+, because <=5.0 uses sequential startup.
>  
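
The startup order proposed in the description (seed first, then the remaining nodes in parallel) could be sketched roughly as follows. This is written in Java purely for illustration and is not the CCM API: SeedFirstStartSketch, Node, startCluster and demo are hypothetical names, and start() is assumed to block until the node is fully up (for the seed, until CMS initialization is complete).

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

public class SeedFirstStartSketch {
    /** Hypothetical node handle; start() is assumed to block until the node is ready. */
    interface Node {
        void start();
    }

    /** Starts the seed on its own, then starts every other node concurrently and waits for all of them. */
    static void startCluster(Node seed, List<Node> others) {
        seed.start(); // returns only once the seed is ready

        CompletableFuture<?>[] pending = others.stream()
                .map(node -> CompletableFuture.runAsync(node::start))
                .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(pending).join();
    }

    /** Records the order in which three fake nodes are started. */
    static List<String> demo() {
        List<String> order = new CopyOnWriteArrayList<>();
        Node seed = () -> order.add("node1");
        Node node2 = () -> order.add("node2");
        Node node3 = () -> order.add("node3");
        startCluster(seed, Arrays.asList(node2, node3));
        return order;
    }

    public static void main(String[] args) {
        // node1 is always first; node2 and node3 may appear in either order.
        System.out.println(demo());
    }
}
```

Only the seed start is serialized here, so most of the speed-up from parallel startup is kept.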



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
