Hi All, I'm having some major issues bootstrapping a new node to my cluster. We are running 1.2.16, with vnodes enabled.
When a new node starts up (with auto_bootstrap), it selects a host ID and finds the ring successfully:

INFO 18:42:29,559 JOINING: waiting for ring information

It successfully selects a set of tokens. Then the weird stuff begins. I get this error once, while the node is reading the system keyspace:

ERROR 18:42:32,921 Exception in thread Thread[InternalResponseStage:1,5,main]
java.lang.NullPointerException
        at org.apache.cassandra.utils.ByteBufferUtil.toLong(ByteBufferUtil.java:421)
        at org.apache.cassandra.cql.jdbc.JdbcLong.compose(JdbcLong.java:94)
        at org.apache.cassandra.db.marshal.LongType.compose(LongType.java:34)
        at org.apache.cassandra.cql3.UntypedResultSet$Row.getLong(UntypedResultSet.java:138)
        at org.apache.cassandra.db.SystemTable.migrateKeyAlias(SystemTable.java:199)
        at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:346)
        at org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:66)
        at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:47)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

But it doesn't stop the bootstrap process.
The node successfully handshakes versions, and pauses before bootstrapping:

INFO 18:42:59,564 JOINING: schema complete, ready to bootstrap
INFO 18:42:59,565 JOINING: waiting for pending range calculation
INFO 18:42:59,565 JOINING: calculation complete, ready to bootstrap
INFO 18:42:59,565 JOINING: getting bootstrap token
INFO 18:42:59,705 JOINING: sleeping 30000 ms for pending range setup

After 30 seconds, I get a flood of endless org.apache.cassandra.db.UnknownColumnFamilyException errors, and all other nodes in the cluster log the following endlessly:

INFO [HANDSHAKE-/x.x.x.x] 2014-05-09 18:44:36,289 OutboundTcpConnection.java (line 418) Handshaking version with /x.x.x.x

I suspect there may be something wrong with my schemas. Sometimes while restarting an existing node, the node will fail to restart, with the following error, again while reading the system keyspace:

ERROR [InternalResponseStage:5] 2014-05-05 23:56:03,786 CassandraDaemon.java (line 191) Exception in thread Thread[InternalResponseStage:5,5,main]
org.apache.cassandra.db.marshal.MarshalException: cannot parse 'column1' as hex bytes
        at org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:69)
        at org.apache.cassandra.config.ColumnDefinition.fromSchema(ColumnDefinition.java:231)
        at org.apache.cassandra.config.CFMetaData.addColumnDefinitionSchema(CFMetaData.java:1524)
        at org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1456)
        at org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:306)
        at org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:444)
        at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:356)
        at org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:66)
        at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:47)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: An hex string representing bytes must have an even length
        at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:52)
        at org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:65)
        ... 12 more

I am able to fix this error by clearing out the schema_columns system table on disk. After that, a node can boot successfully.

Does anyone have a clue what's going on here?

Thanks!