Why don’t you simply let the node join the cluster? It will pull new tables and 
the data automatically.
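
For example, a rough sketch with the DataStax Java driver (the contact point is a placeholder, and checkSchemaAgreement() assumes a driver version that exposes it) that polls until the newly joined node reports the same schema version as the rest of the cluster:

import com.datastax.driver.core.Cluster;

public class WaitForSchemaAgreement {
    public static void main(String[] args) throws InterruptedException {
        // Contact point is a placeholder; any node already in the cluster will do.
        try (Cluster cluster = Cluster.builder().addContactPoint("10.50.16.214").build()) {
            cluster.init();
            // checkSchemaAgreement() is true once every reachable node reports the
            // same schema version, i.e. the joining node has pulled the definitions.
            while (!cluster.getMetadata().checkSchemaAgreement()) {
                System.out.println("Schema not yet in agreement, waiting...");
                Thread.sleep(1000);
            }
            System.out.println("Schema agreed across reachable nodes.");
        }
    }
}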

--
Jacques-Henri Berthemet

From: Stephen Baynes [mailto:stephen.bay...@smoothwall.net]
Sent: Friday 2 October 2015 18:08
To: user@cassandra.apache.org
Subject: Re: Changing schema on multiple nodes while they are isolated

Hi Jacques-Henri

You are right - serious trouble. I managed to do some more testing and it does not repair or share any data. In the logs I see lots of:

WARN  [MessagingService-Incoming-/10.50.16.214] 2015-10-02 16:52:36,810 IncomingTcpConnection.java:100 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=e6828dd0-691a-11e5-8a27-b1780df21c7c
     at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:163) ~[apache-cassandra-2.2.1.jar:2.2.1]
     at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:96) ~[apache-cassandra-2.2.1.jar:2.2.1]

and some:

ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,546 RepairMessageVerbHandler.java:164 - Got error, removing parent repair session
ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,548 CassandraDaemon.java:183 - Exception in thread Thread[AntiEntropyStage:1,5,main]
java.lang.RuntimeException: java.lang.NullPointerException
     at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:167) ~[apache-cassandra-2.2.1.jar:2.2.1]
     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-2.2.1.jar:2.2.1]


Will need to do some thinking about this. I wonder about shipping a backup of a good system keyspace and restoring it on each node before it starts for the first time - but would that end up with each node having the same internal id?
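
If it helps while weighing that up, a small sketch (node addresses are placeholders, DataStax Java driver assumed) that reads each node's host_id from system.local, which is the internal identity that would need to stay unique per node:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CompareHostIds {
    public static void main(String[] args) {
        // Node addresses are placeholders; point this at each node in turn.
        for (String node : new String[] {"10.50.16.213", "10.50.16.214"}) {
            try (Cluster cluster = Cluster.builder().addContactPoint(node).build();
                 Session session = cluster.connect()) {
                // system.local describes the node itself, including its host_id.
                Row local = session.execute("SELECT host_id FROM system.local").one();
                System.out.println(node + " host_id=" + local.getUUID("host_id"));
            }
        }
    }
}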



On 2 October 2015 at 16:27, Jacques-Henri Berthemet <jacques-henri.berthe...@genesys.com> wrote:
Hi Stephen,

If you manage to create tables on each node while nodes A and B are separated, you’ll get into trouble when they reconnect. I hit this case previously and Cassandra complained that tables with the same names but different ids were present in the keyspace. I don’t know if there is a way to fix that with nodetool, but I don’t think it is a good practice.
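
For anyone wanting to confirm that divergence, a rough sketch (node addresses are placeholders, DataStax Java driver assumed) that lists the table ids each side of the split has recorded; on Cassandra 2.2 these live in system.schema_columnfamilies.cf_id:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CompareTableIds {
    public static void main(String[] args) {
        // Node addresses are placeholders; run the same query against each side of the split.
        for (String node : new String[] {"10.50.16.213", "10.50.16.214"}) {
            try (Cluster cluster = Cluster.builder().addContactPoint(node).build();
                 Session session = cluster.connect()) {
                System.out.println("Tables as seen by " + node + ":");
                for (Row row : session.execute(
                        "SELECT keyspace_name, columnfamily_name, cf_id FROM system.schema_columnfamilies")) {
                    System.out.println("  " + row.getString("keyspace_name") + "."
                            + row.getString("columnfamily_name") + " -> " + row.getUUID("cf_id"));
                }
            }
        }
    }
}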

To solve this, we have a “schema creator” application node that is responsible for changing the schema. If this node is down, schema updates are not possible. We can make any node the ‘creator’, but only one can be enabled at any given time.
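
As a rough illustration of that pattern (the keyspace, table and the app.schemaCreator property are invented names, DataStax Java driver assumed), only the node with the creator flag issues DDL and then waits for schema agreement:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SchemaCreator {
    // Hypothetical switch: enabled on exactly one application node at a time.
    private static final boolean SCHEMA_CREATOR_ENABLED = Boolean.getBoolean("app.schemaCreator");

    public static void main(String[] args) throws InterruptedException {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            if (SCHEMA_CREATOR_ENABLED) {
                // Only the designated creator issues DDL, so two nodes can never
                // create the same table independently with different ids.
                session.execute("CREATE KEYSPACE IF NOT EXISTS app WITH replication = "
                        + "{'class': 'SimpleStrategy', 'replication_factor': 3}");
                session.execute("CREATE TABLE IF NOT EXISTS app.settings (name text PRIMARY KEY, value text)");
                // Let the change propagate before the rest of the application uses it.
                while (!cluster.getMetadata().checkSchemaAgreement()) {
                    Thread.sleep(200);
                }
            }
        }
    }
}
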
--
Jacques-Henri Berthemet

From: Stephen Baynes [mailto:stephen.bay...@smoothwall.net]
Sent: Friday 2 October 2015 16:46
To: user@cassandra.apache.org
Subject: Changing schema on multiple nodes while they are isolated

Is it safe to make schema changes (e.g. create keyspaces and tables) on multiple separate nodes of a cluster while they are out of communication with the other nodes in the cluster? For example, create on node A while node B is down, create on node B while A is down, then bring both up together.

We are looking to embed Cassandra invisibly in another product, and we have no control over the order in which users may start/stop the nodes or add/remove them from clusters. Cassandra must come up and be working with at least local access regardless. So this means always creating the keyspaces and tables so they are always present. But it also means nodes joining clusters which already have the same keyspaces and tables defined. Will that cause any issues? I have done some testing and saw some issues when I tried nodetool repair to bring things into sync. However, at the time I was fighting with what I later discovered was CASSANDRA-9689 (keyspace does not show in describe list if create query times out, https://issues.apache.org/jira/browse/CASSANDRA-9689) and did not know what was what. I will give it another try sometime, but would appreciate knowing if this is going to run into trouble before we find it.
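
As an illustration of the idempotent-startup idea (keyspace name and contact point are placeholders, DataStax Java driver assumed), each node could run IF NOT EXISTS DDL and, if the statement times out as in CASSANDRA-9689, re-check the driver's schema metadata before treating it as a failure:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.exceptions.DriverException;

public class StartupSchema {
    public static void main(String[] args) {
        // Contact point and keyspace name are placeholders.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            try {
                // Idempotent DDL: safe to run on every startup once the schema exists.
                session.execute("CREATE KEYSPACE IF NOT EXISTS app WITH replication = "
                        + "{'class': 'SimpleStrategy', 'replication_factor': 3}");
            } catch (DriverException e) {
                // A timed-out CREATE may still have been applied (CASSANDRA-9689),
                // so consult the driver's schema metadata before treating it as fatal.
                if (cluster.getMetadata().getKeyspace("app") == null) {
                    throw e;
                }
            }
        }
    }
}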

We are basically using Cassandra to share fairly transient information. We can cope with data loss during environment changes and occasional losses at other times. But if the environment is stable then it should all just work, whatever that environment is. We use a very high replication factor so all nodes have a copy of all the data and will keep working even if they are the only node up.
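
A minimal sketch of that setup (contact point is a placeholder, DataStax Java driver assumed): defaulting requests to consistency level ONE so that, with a replication factor at or above the node count, a lone surviving node can still serve reads and writes:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.Session;

public class LoneNodeAccess {
    public static void main(String[] args) {
        // Contact point is a placeholder. Defaulting requests to CL ONE means that,
        // with a replication factor at or above the node count, a single reachable
        // replica is enough for reads and writes to keep succeeding.
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.ONE))
                .build();
             Session session = cluster.connect()) {
            session.execute("SELECT release_version FROM system.local");
        }
    }
}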

Thanks

--

Stephen Baynes


