Hello Jeff,

Yes, 2.1.16 is an old version, and we are planning to upgrade in a few months.

Only the Gossiper INFO lines are logged, stating that it marked several nodes
down; nothing else is logged.
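
A rough sketch of how such a log check could be scripted, assuming each
Gossiper message sits on a single line in the format quoted below. The
function name and the plain "generation" keyword match are illustrative
assumptions, not anything taken from Cassandra itself.

```python
import re

# Matches Gossiper lines of the form quoted in this thread, e.g.
# "INFO  [GossipTasks:1] 2017-06-27 20:34:39,707 Gossiper.java:1008 - InetAddress /10.10.10.20 is now DOWN"
DOWN_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) Gossiper\.java:\d+ - "
    r"InetAddress /(?P<addr>\S+) is now DOWN"
)


def summarize_gossip_lines(lines):
    """Return (down_events, generation_lines) for an iterable of log lines.

    down_events is a list of (timestamp, address) tuples for "is now DOWN"
    messages; generation_lines collects any other line mentioning
    "generation" (a hypothetical keyword filter, case-insensitive).
    """
    down_events = []
    generation_lines = []
    for line in lines:
        match = DOWN_RE.search(line)
        if match:
            down_events.append((match.group("ts"), match.group("addr")))
        elif "generation" in line.lower():
            generation_lines.append(line.rstrip("\n"))
    return down_events, generation_lines
```

It could be run over a node's log with something like
`summarize_gossip_lines(open(path))`, where `path` is wherever the node's
system log lives on that host. An empty second list would match what I see:
DOWN messages only and nothing about generations.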


On Wed, Jun 28, 2017 at 8:15 PM, Jeff Jirsa <jji...@apache.org> wrote:

>
>
> On 2017-06-28 18:51 (-0700), Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
> > Hello,
> >
> > We are using C* version 2.1.6, and lately we are seeing an issue where
> > nodetool removenode causes the schema to go out of sync and causes clients
> > to fail for 2-3 minutes.
> >
> > C* cluster is in 8 Datacenters with RF=3 and has 50 nodes.
> > We have 130 Keyspaces and 500 CF in the cluster.
> >
> > Here is the sequence of actions that were performed:
> >
> > 1. One node failed abruptly in the cluster due to a hardware issue.
> > 2. Removed the node from the cluster using nodetool removenode from a live
> > node.
> > 3. Immediately, I see the schema on all the nodes go out of sync, and in the
> > logs of all the C* nodes I see them mark a few other (random) nodes as down.
> > They eventually recover after 2 minutes.
> >
> > Logs in the nodes:
> >
> > INFO  [GossipTasks:1] 2017-06-27 20:34:39,707 Gossiper.java:1008 -
> > InetAddress /10.10.10.20 is now DOWN
> > INFO  [GossipTasks:1] 2017-06-27 20:34:39,714 Gossiper.java:1008 -
> > InetAddress /10.10.11.14 is now DOWN
> >
> > Does anyone have an idea why removenode causes the cluster to go out of
> > sync?
> >
>
> That's not really expected - I've never seen behavior like that. However,
> 2.1.6 is pretty old (just about 2 years, give or take), there have been
> hundreds or (more likely) thousands of fixes since then.
>
> Is the gossiper line the only thing logged? Anything about invalid
> generations?
>
