Hi Michael,

I ran into critical gossip issues on 1.2 (1.2.11, I believe), though that
was about 2 years ago.

Are you running the last 1.2 minor version, C* 1.2.19? If not, you should
probably upgrade to it as soon as possible.

Many issues like this one
https://issues.apache.org/jira/browse/CASSANDRA-6297 have been fixed since
then in C* 1.2, 2.0, 2.1, 2.2, 3.0.X and 3.X. You have to go through the
upgrade steps. Going to the last 1.2 minor should be safe, and it should be
enough to solve this issue.
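
To check quickly whether a node is behind the last 1.2 minor, something like
the following rough sketch could help. It assumes you paste in the output of
`nodetool version`, which prints a line such as "ReleaseVersion: 1.2.11". The
helper names (parse_release_version, needs_minor_upgrade) are made up for
illustration and are not part of any Cassandra tooling.

```python
import re

# Last 1.2 minor release mentioned in this email.
LAST_1_2_MINOR = (1, 2, 19)


def parse_release_version(nodetool_output):
    """Extract (major, minor, patch) from text like 'ReleaseVersion: 1.2.11'."""
    match = re.search(r"ReleaseVersion:\s*(\d+)\.(\d+)\.(\d+)", nodetool_output)
    if match is None:
        raise ValueError("no ReleaseVersion line found")
    return tuple(int(part) for part in match.groups())


def needs_minor_upgrade(version, target=LAST_1_2_MINOR):
    """True if the node is on the same major.minor line but an older patch."""
    return version[:2] == target[:2] and version < target


if __name__ == "__main__":
    # Hypothetical sample output; replace with what your node reports.
    version = parse_release_version("ReleaseVersion: 1.2.11")
    print(version, needs_minor_upgrade(version))
```

Run it against every node, since a cluster can end up with mixed versions.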

For your information, even C* 2.0 is no longer supported. The minimum
version you should use now is the latest 2.1 release.

This technical debt might end up costing you more in time, money and
Quality of Service than taking care of the upgrades would. Most probably
your bug is already fixed in newer versions. It is also not very
interesting for us to help, as we would have to go through old code to find
issues that are most likely already fixed. If you want support (from the
community or a commercial provider), you really should upgrade this
cluster. Make sure your clients are compatible too.

I did not know that some people were still using C* < 2.0 :-).

Cheers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-04-13 10:58 GMT+02:00 Michael Fong <michael.f...@ruckuswireless.com>:

> Hi, all
>
> We have been running a 4-node Cassandra cluster (C* 1.2.x) where a node
> marked all the other 3 nodes DOWN, and they came back UP a few seconds
> later. A compaction of roughly 10 MB had kicked in about a minute before
> the node marked all the other nodes DOWN. In other words, in the
> system.log we see
>
> 00:00:00 Compacting ….
>
> 00:00:03 Compacted 8 sstables … 10~ megabytes
>
> 00:01:06 InetAddress /x.x.x.4 is now DOWN
>
> 00:01:06 InetAddress /x.x.x.3 is now DOWN
>
> 00:01:06 InetAddress /x.x.x.1 is now DOWN
>
> There was no significant GC activity in gc.log. We have heard that busy
> compaction activity can cause this behavior, but we cannot work out why
> this would happen. Why would a compaction operation stop the Gossip
> thread from performing its heartbeat check? Has anyone experienced this
> kind of behavior before?
>
> Thanks in advance!
>
> Sincerely,
>
> Michael Fong