Vytenis, I ran the `ALTER KEYSPACE` command on one of the original `dc1` nodes.
Should it make any difference? My understanding was that it could be run from any node in either datacenter. But, if there's a reason to prefer running it on a new datacenter node, I'm happy to do it that way.

--Tom

On Wed, Oct 13, 2021 at 10:22 AM vytenis silgalis <vsilga...@gmail.com> wrote:

> You ran the `alter keyspace` command on the original dc1 nodes or the new dc2 nodes?
>
> On Wed, Oct 13, 2021 at 8:15 AM Stefan Miklosovic <stefan.mikloso...@instaclustr.com> wrote:
>
>> Hi Tom,
>>
>> While I am not completely sure what might cause your issue, I just want to highlight that schema agreement was overhauled substantially in 4.0 (1), so your issue may be related to what that ticket was trying to fix.
>>
>> Regards
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-15158
>>
>> On Fri, 1 Oct 2021 at 18:43, Tom Offermann <tofferm...@newrelic.com> wrote:
>> >
>> > When adding a datacenter to a keyspace (following the Last Pickle [Data Center Switch][lp] playbook), I ran into a "Configuration exception merging remote schema" error. The nodes in one datacenter didn't converge to the new schema version, and after restarting them, I saw the symptoms described in this Datastax article on [Fixing a table schema collision][ds], where there were two data directories for each table in the keyspace on the nodes that didn't converge. I followed the recovery steps in the Datastax article to move the data from the older directories to the new directories, ran `nodetool refresh`, and that fixed the problem.
>> >
>> > [lp]: https://thelastpickle.com/blog/2019/02/26/data-center-switch.html
>> > [ds]: https://docs.datastax.com/en/dse/6.0/cql/cql/cql_using/useCreateTableCollisionFix.html
>> >
>> > While the Datastax article was super helpful in helping me recover, I'm left wondering *why* this happened. If anyone can shed some light on that, or offer advice on how I can avoid getting into this situation in the future, I would be most appreciative. I'll describe the steps I took in more detail in the thread.
>> >
>> > ## Steps
>> >
>> > 1. The day before, I had added the second datacenter ('dc2') to the system_traces, system_distributed, and system_auth keyspaces and ran `nodetool rebuild` for each of the 3 keyspaces. All of that went smoothly with no issues.
>> >
>> > 2. For a large keyspace, I added the second datacenter ('dc2') with an `ALTER KEYSPACE foo WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '2', 'dc2': '3'};` statement. Immediately, I saw this error in the log:
>> > ```
>> > "ERROR 16:45:47 Exception in thread Thread[MigrationStage:1,5,main]"
>> > "org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found 8ad72660-f629-11eb-a217-e1a09d8bc60c; expected 20739eb0-d92e-11e6-b42f-e7eb6f21c481)"
>> > "\tat org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:949) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:903) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.config.Schema.updateTable(Schema.java:687) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1482) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1438) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1407) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1384) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:594) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_232]"
>> > "\tat java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_232]"
>> > "\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_232]"
>> > "\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_232]"
>> > "\tat org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_232]"
>> > ```
>> >
>> > I also saw this:
>> > ```
>> > "ERROR 16:46:48 Configuration exception merging remote schema"
>> > "org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found 8ad72660-f629-11eb-a217-e1a09d8bc60c; expected 20739eb0-d92e-11e6-b42f-e7eb6f21c481)"
>> > "\tat org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:949) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:903) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.config.Schema.updateTable(Schema.java:687) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1482) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1438) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1407) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1384) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:91) ~[apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53) [apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) [apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_232]"
>> > "\tat java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_232]"
>> > "\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_232]"
>> > "\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_232]"
>> > "\tat org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.5.jar:3.11.5]"
>> > "\tat java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_232]"
>> > ```
>> > This error repeated several times over the next 2 minutes.
>> >
>> > 3. While running `nodetool describecluster` repeatedly, I saw that the nodes in the 'dc2' datacenter converged to the new schema version quickly, but the nodes in the original datacenter ('dc1') remained at the previous schema version.
>> >
>> > 4. I waited to see if all of the nodes would converge to the new schema version, but they still hadn't converged after roughly 10 minutes. Given the errors I saw, I wasn't optimistic it would work out all by itself, so I decided to restart the nodes in the 'dc1' datacenter one at a time so they would restart with the latest schema version.
>> >
>> > 5. After each node restarted, `nodetool describecluster` showed it as being on the latest schema version. So, after getting through all the 'dc1' nodes, it looked like everything in the cluster was healthy again.
>> >
>> > 6. However, that's when I noticed that there were two data directories on disk for each table in the keyspace. New writes for a table were being saved in the newer directory, but queries for data saved in the old data directory were returning no results.
>> >
>> > 7. That's when I followed the recovery steps in the Datastax article with great success.
>> >
>> > ## Questions
>> >
>> > * My understanding is that running concurrent schema updates should always be avoided, since that can result in schema collisions. But, in this case, I wasn't performing multiple schema updates. I was just running a single `ALTER KEYSPACE` statement. Any idea why a single schema update would result in a schema collision and two data directories per table?
>> >
>> > * Should I have waited longer before restarting nodes? Perhaps, given enough time, the Cassandra nodes would have all converged on the correct schema version, and this would have resolved on its own?
>> >
>> > * Any suggestions for how I can avoid this problem in the future?
>> >
>> > --
>> > Tom Offermann
>> > Lead Software Engineer
>> > http://newrelic.com
>> >

--
Tom Offermann
Lead Software Engineer
http://newrelic.com
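
For anyone repeating the agreement check from step 3 of the quoted message, here is a minimal sketch of a helper that reads saved `nodetool describecluster` output and reports whether all nodes are on one schema version. It assumes the `Schema versions:` section lists each version as `<uuid>: [addr, addr, ...]` and may include an `UNREACHABLE` entry. The function names, UUIDs, and addresses are hypothetical and chosen only for illustration; it does not run `nodetool` itself and does nothing to prevent or repair a schema collision.

```python
import re

# Matches lines such as "<schema-uuid>: [10.0.0.1, 10.0.0.2]" or "UNREACHABLE: [10.0.0.3]".
_VERSION_LINE = re.compile(r"^\s*([0-9a-fA-F-]{36}|UNREACHABLE):\s*\[(.*)\]\s*$")


def parse_schema_versions(describecluster_output):
    """Return {schema_version: [node addresses]} from `nodetool describecluster` text."""
    versions = {}
    in_section = False
    for line in describecluster_output.splitlines():
        if line.strip().startswith("Schema versions:"):
            in_section = True
            continue
        if not in_section:
            continue
        match = _VERSION_LINE.match(line)
        if match:
            version, hosts = match.groups()
            versions[version] = [h.strip() for h in hosts.split(",") if h.strip()]
    return versions


def schema_in_agreement(versions):
    """True only if every node reports the same version and none is unreachable."""
    return len(versions) == 1 and "UNREACHABLE" not in versions


# Hypothetical sample output: the UUIDs and addresses are made up for illustration.
SAMPLE = """\
Cluster Information:
\tName: Example Cluster
\tSnitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
\tPartitioner: org.apache.cassandra.dht.Murmur3Partitioner
\tSchema versions:
\t\t11111111-2222-3333-4444-555555555555: [10.0.1.1, 10.0.1.2]
\t\taaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee: [10.0.2.1, 10.0.2.2, 10.0.2.3]
"""

if __name__ == "__main__":
    found = parse_schema_versions(SAMPLE)
    for version, hosts in found.items():
        print(version, len(hosts), "node(s)")
    print("agreement:", schema_in_agreement(found))
```

Running a check like this before and after a schema change makes a split such as the one in step 3 (the 'dc2' nodes on the new version, the 'dc1' nodes on the old one) easy to spot.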