I am trying to bring up a 6-node cluster in AWS: 3 seed nodes and 3
non-seed nodes, with one of each in each availability zone. With 1.2.15,
my non-seed nodes never join the cluster. If I run 1.2.14, everything works
fine. We are not using vnodes, and all of the initial_token values are
assigned based on the Murmur3 calculations.
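
For reference, here is a minimal sketch of the kind of calculation I mean,
assuming tokens are spaced evenly over the Murmur3Partitioner range of
-2**63 to 2**63 - 1. The function name is only illustrative, and how each
token is then mapped to a node in a given availability zone is a separate
choice that this sketch does not cover.

```python
def murmur3_initial_tokens(node_count):
    """Return evenly spaced tokens over the Murmur3Partitioner range.

    Assumption: the ring is split into node_count equal slices,
    starting at the minimum token -2**63.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = 2**64 // node_count  # width of each node's slice of the ring
    return [i * step - 2**63 for i in range(node_count)]


if __name__ == "__main__":
    # One line per node, in the form used in cassandra.yaml.
    for index, token in enumerate(murmur3_initial_tokens(6)):
        print(f"node {index}: initial_token: {token}")
```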

This isn't a data migration from a previous version. It is a completely
clean cluster which I am starting from scratch.

The seed nodes come up and join the cluster just fine. But none of my
non-seed nodes are joining the cluster. In the logs I am seeing the
following from one of my non-seed nodes. Note the repeats of the last lines
that never go away.

 INFO 15:58:54,729 Handshaking version with /10.0.12.13
 INFO 15:58:55,724 Handshaking version with /10.0.32.126
 INFO 15:58:56,726 Handshaking version with /10.0.22.230
 INFO 15:58:56,929 Node /10.0.32.126 is now part of the cluster
 INFO 15:58:56,930 InetAddress /10.0.32.126 is now UP
 INFO 15:58:56,957 Node /10.0.12.103 is now part of the cluster
 INFO 15:58:56,960 InetAddress /10.0.12.103 is now UP
 INFO 15:58:56,967 Node /10.0.22.206 is now part of the cluster
 INFO 15:58:56,968 InetAddress /10.0.22.206 is now UP
 INFO 15:58:56,975 Node /10.0.12.13 is now part of the cluster
 INFO 15:58:56,976 InetAddress /10.0.12.13 is now UP
 INFO 15:58:56,984 Node /10.0.22.230 is now part of the cluster
 INFO 15:58:56,984 InetAddress /10.0.22.230 is now UP
 INFO 15:58:57,010 CFS(Keyspace='system', ColumnFamily='peers') liveRatio is 12.87932647333957 (just-counted was 12.87932647333957).  calculation took 19ms for 38 columns
 INFO 15:58:57,679 Handshaking version with /10.0.22.206
 INFO 15:58:57,726 Handshaking version with /10.0.22.230
 INFO 15:58:58,728 Handshaking version with /10.0.12.13
 INFO 15:58:59,730 Handshaking version with /10.0.12.103
 INFO 15:59:06,090 Handshaking version with /10.0.32.126

 INFO 15:59:23,932 JOINING: waiting for schema information to complete
 INFO 15:59:24,932 JOINING: waiting for schema information to complete
 INFO 15:59:25,933 JOINING: waiting for schema information to complete
 INFO 15:59:26,933 JOINING: waiting for schema information to complete
 INFO 15:59:27,934 JOINING: waiting for schema information to complete
 INFO 15:59:28,934 JOINING: waiting for schema information to complete
 INFO 15:59:29,935 JOINING: waiting for schema information to complete
 INFO 15:59:30,935 JOINING: waiting for schema information to complete
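
For anyone comparing against their own logs, here is a rough sketch for
counting how often the repeated message shows up. It assumes the message
text exactly as it appears in the excerpt above. The function name is
hypothetical, and the lines would come from wherever your node writes its
system log.

```python
import re

# Matches the repeated log entry shown in the excerpt above.
JOINING_WAIT = re.compile(
    r"INFO \d{2}:\d{2}:\d{2},\d{3} "
    r"JOINING: waiting for schema information to complete"
)


def count_schema_waits(lines):
    """Count log lines that report the node is still waiting for schema."""
    return sum(1 for line in lines if JOINING_WAIT.search(line))


if __name__ == "__main__":
    sample = [
        " INFO 15:59:06,090 Handshaking version with /10.0.32.126",
        " INFO 15:59:23,932 JOINING: waiting for schema information to complete",
        " INFO 15:59:24,932 JOINING: waiting for schema information to complete",
    ]
    print(count_schema_waits(sample))
```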

So I suspect it is some sort of bootstrapping issue. I checked the
CHANGES.txt and noticed this for 1.2.15.
*Move handling of migration event source to solve bootstrap race
(CASSANDRA-6648)*
I looked at CASSANDRA-6648 and, based on some of the comments, there seems
to be a lack of confidence around this problem.

Has anyone else seen this problem?
-- 
John Pyeatt
Singlewire Software, LLC
www.singlewire.com
------------------
608.661.1184
john.pye...@singlewire.com
