Hi Rob,

Are the schemas held somewhere else? Going through the process that you sent, when I restart the nodes, the original schemas show up. (By the way, you were correct in your assessment: even though the gossipinfo command shows the schemas as the same, they are not the same when I look at them with cassandra-cli, not even close on 2 of the nodes.)

So, I went through the process of clearing out the system CFs in steps 4 and 5. When the Cassandra nodes restarted, two of them (the ones with the incorrect schemas) complained about the schema and loaded what looks like a generic one. But all of them have schemas, and 2 are correct and one is not.
This means I cannot execute step 7, since the schema now exists under that name on all the nodes. For example, the incorrect schema is called MySchema. After the restart and the messages complaining about CFs not existing, there is a schema called MySchema: on 2 nodes they are correct, on 2 nodes they are not.

I have also tried to force the node with the incorrect schema to come up on its own by shutting down the cluster except for a node with a correct schema. I went through the same steps and brought that node down and back up, with the same results.

Thoughts? Ideas?

Jim

From: Robert Coli <rc...@eventbrite.com>
Reply-To: <user@cassandra.apache.org>
Date: Tue, 9 Jul 2013 17:10:53 -0700
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: alter column family ?

On Tue, Jul 9, 2013 at 11:52 AM, Langston, Jim <jim.langs...@compuware.com> wrote:
> On the command (4 node cluster):
>
> nodetool gossipinfo -h localhost |grep SCHEMA |sort | uniq -c | sort -n
> 4 SCHEMA:60edeaa8-70a4-3825-90a5-d7746ffa8e4d

If your schemas actually agree (and given that you're in 1.1.2) you probably are encountering:

https://issues.apache.org/jira/browse/CASSANDRA-4432

Which is one of the 1.1.2 era "schema stuck" issues I was referring to earlier.

> On the second part, I have the same Cassandra version in staging and
> production, with staging being a smaller cluster. Not sure what you mean
> by nuking schema's (ie. delete directories ?)

I like when googling things returns related threads in which I have previously advised people to do a detailed list of things, heh:

http://mail-archives.apache.org/mod_mbox/cassandra-user/201208.mbox/%3CCAN1VBD-01aD7wT2w1eyY2KpHwcj+CoMjvE4=j5zaswybmw_...@mail.gmail.com%3E

Here's a slightly clarified version of these steps...

0. Dump your existing schema to schema_definition_file
1. Take all nodes out of service;
2. Run nodetool drain on each and verify that they have drained (grep -i DRAINED system.log)
3. Stop cassandra on each node;
4. Move /var/lib/cassandra/data/system out of the way
5. Move /var/lib/cassandra/saved_caches/system-* out of the way
6. Start all nodes;
7. cassandra-cli < schema_definition_file on one node only (includes create keyspace and create column family entries).
   Note: you should not literally do this; you should break your schema_definition_file into individual statements and wait for schema agreement between each DDL statement.
8. Put the nodes back in service.
9. Done.

=Rob
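
The note on step 7 (apply one statement at a time, waiting for schema agreement in between) could be sketched roughly as below. This is only an illustration under stated assumptions, not a tested procedure for a 1.1.2 cluster: the function names are made up for this example, statements are assumed to end with ";" with no ";" inside quoted strings, and `run_statement` and `get_gossipinfo` are hypothetical callables you would supply yourself (for example, wrappers that feed one statement to cassandra-cli and that return the text printed by nodetool gossipinfo).

```python
import re
import time


def split_statements(schema_text):
    """Split a schema dump into individual ';'-terminated statements.

    Assumption: no ';' appears inside a quoted string in the dump.
    """
    statements = []
    for chunk in schema_text.split(";"):
        stmt = chunk.strip()
        if stmt:
            statements.append(stmt + ";")
    return statements


def schema_versions(gossipinfo_output):
    """Return the distinct SCHEMA:<uuid> values found in gossipinfo text."""
    return set(re.findall(r"SCHEMA:(\S+)", gossipinfo_output))


def schemas_agree(gossipinfo_output):
    """True when gossip reports exactly one schema version."""
    return len(schema_versions(gossipinfo_output)) == 1


def apply_with_agreement(statements, run_statement, get_gossipinfo,
                         poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Run each statement, then poll until gossip shows a single version.

    run_statement and get_gossipinfo are hypothetical callables supplied
    by the caller; this sketch does not invoke cassandra-cli or nodetool.
    Keyspace context (a 'use' statement) is also left to the caller.
    """
    for stmt in statements:
        run_statement(stmt)
        for _ in range(max_polls):
            if schemas_agree(get_gossipinfo()):
                break
            sleep(poll_seconds)
        else:
            # Polling budget exhausted without agreement: stop here
            # instead of piling more DDL onto a disagreeing cluster.
            raise RuntimeError("no schema agreement after: " + stmt)
```

As reported earlier in this thread, gossip can show a single schema version while cassandra-cli shows different definitions on different nodes, so a check like `schemas_agree` is best treated as a necessary condition and followed by a manual comparison in cassandra-cli on each node.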