I am trying to bring up a 6-node cluster in AWS on 1.2.15: 3 seed nodes and 3 non-seed nodes, with one of each in each availability zone. My non-seed nodes never join the cluster. If I run 1.2.14, everything works fine. We are not using vnodes, and all of the initial_token values are assigned based on the Murmur3 calculations.
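For reference, the usual way to compute evenly spaced initial_token values for Murmur3Partitioner without vnodes can be sketched as follows. This is only an illustration, not necessarily the exact procedure used for this cluster: the function name is hypothetical, and it assumes the tokens are spread evenly across all six nodes over the partitioner's range of -2**63 to 2**63 - 1.

```python
def murmur3_initial_tokens(node_count):
    """Return evenly spaced tokens over the Murmur3Partitioner range.

    Hypothetical helper for illustration; assumes one token per node
    (no vnodes) and an even spread across the whole ring.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # The ring covers 2**64 values, starting at -2**63.
    step = 2**64 // node_count
    return [i * step - 2**63 for i in range(node_count)]


if __name__ == "__main__":
    # One line per node, in the form used for cassandra.yaml.
    for index, token in enumerate(murmur3_initial_tokens(6)):
        print(f"node {index}: initial_token: {token}")
```

How the six tokens are distributed across seed and non-seed nodes and availability zones is a separate choice that this sketch does not cover.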
This isn't a data migration from a previous version. It is a completely clean cluster which I am starting from scratch. The seed nodes come up and join the cluster just fine, but none of my non-seed nodes are joining the cluster.

In the logs I am seeing the following from one of my non-seed nodes. Note the repeats of the last lines, which never go away.

INFO 15:58:54,729 Handshaking version with /10.0.12.13
INFO 15:58:55,724 Handshaking version with /10.0.32.126
INFO 15:58:56,726 Handshaking version with /10.0.22.230
INFO 15:58:56,929 Node /10.0.32.126 is now part of the cluster
INFO 15:58:56,930 InetAddress /10.0.32.126 is now UP
INFO 15:58:56,957 Node /10.0.12.103 is now part of the cluster
INFO 15:58:56,960 InetAddress /10.0.12.103 is now UP
INFO 15:58:56,967 Node /10.0.22.206 is now part of the cluster
INFO 15:58:56,968 InetAddress /10.0.22.206 is now UP
INFO 15:58:56,975 Node /10.0.12.13 is now part of the cluster
INFO 15:58:56,976 InetAddress /10.0.12.13 is now UP
INFO 15:58:56,984 Node /10.0.22.230 is now part of the cluster
INFO 15:58:56,984 InetAddress /10.0.22.230 is now UP
INFO 15:58:57,010 CFS(Keyspace='system', ColumnFamily='peers') liveRatio is 12.87932647333957 (just-counted was 12.87932647333957). calculation took 19ms for 38 columns
INFO 15:58:57,679 Handshaking version with /10.0.22.206
INFO 15:58:57,726 Handshaking version with /10.0.22.230
INFO 15:58:58,728 Handshaking version with /10.0.12.13
INFO 15:58:59,730 Handshaking version with /10.0.12.103
INFO 15:59:06,090 Handshaking version with /10.0.32.126
INFO 15:59:23,932 JOINING: waiting for schema information to complete
INFO 15:59:24,932 JOINING: waiting for schema information to complete
INFO 15:59:25,933 JOINING: waiting for schema information to complete
INFO 15:59:26,933 JOINING: waiting for schema information to complete
INFO 15:59:27,934 JOINING: waiting for schema information to complete
INFO 15:59:28,934 JOINING: waiting for schema information to complete
INFO 15:59:29,935 JOINING: waiting for schema information to complete
INFO 15:59:30,935 JOINING: waiting for schema information to complete

So I suspect it is some sort of bootstrapping issue. I checked CHANGES.txt and noticed this entry for 1.2.15:

Move handling of migration event source to solve bootstrap race (CASSANDRA-6648)

I looked at CASSANDRA-6648 and, based on some of the comments, there seems to be a lack of confidence around this problem.

Has anyone else seen this problem?

--
John Pyeatt
Singlewire Software, LLC
www.singlewire.com
------------------
608.661.1184
john.pye...@singlewire.com