Re: Live upgrade 2.0 to 2.1 temporarily increases GC time causing timeouts and unavailability

2016-02-19 Thread Sotirios Delimanolis
We're not all the way there yet with native. But the increased GC time is temporary, only during the deployment. After all nodes are on 2.1, everything is smooth.

On Friday, February 19, 2016 1:47 PM, daemeon reiydelle wrote:
> FYI, my observations were with native, not thrift. ..

Re: Live upgrade 2.0 to 2.1 temporarily increases GC time causing timeouts and unavailability

2016-02-19 Thread daemeon reiydelle
FYI, my observations were with native, not thrift.

*...*
*Daemeon C.M. Reiydelle USA (+1) 415.501.0198 London (+44) (0) 20 8144 9872*

On Fri, Feb 19, 2016 at 10:12 AM, Sotirios Delimanolis wrote:
> Does your cluster contain 24+ nodes or fewer?
>
> We did the same upgrade on a smaller clus

Re: Live upgrade 2.0 to 2.1 temporarily increases GC time causing timeouts and unavailability

2016-02-19 Thread Sotirios Delimanolis
Does your cluster contain 24+ nodes or fewer? We did the same upgrade on a smaller cluster of 5 nodes and we didn't see this behavior. On the 24-node cluster, the timeouts only started once roughly 5 to 7 or more nodes had been upgraded. We're doing some more upgrades next week, trying different deployment

Re: Live upgrade 2.0 to 2.1 temporarily increases GC time causing timeouts and unavailability

2016-02-19 Thread daemeon reiydelle
May be unrelated, but I found highly variable latency (latency max) on the 2.1 code tree when loading new data (and reading). Others found that G1 vs. CMS does not make a difference. There is some evidence that 8/12/16 GB of memory makes no difference either. These were latencies in the 10-30 SECOND range. It did cause ti

Re: Live upgrade 2.0 to 2.1 temporarily increases GC time causing timeouts and unavailability

2016-02-19 Thread Alain RODRIGUEZ
I performed this exact upgrade a few days ago, except that clients were using the native protocol, and it went smoothly. So I think this might be thrift related. No idea what is producing this though, just wanted to give the info fwiw. As a side note, unrelated to the issue, performances using native are a

Live upgrade 2.0 to 2.1 temporarily increases GC time causing timeouts and unavailability

2016-02-18 Thread Sotirios Delimanolis
We have a Cassandra cluster with 24 nodes. These nodes were running 2.0.16. While the nodes are in the ring and handling queries, we perform the upgrade to 2.1.12 as follows (more or less) one node at a time:
- Stop the Cassandra process
- Deploy jars, scripts, binaries, etc.
- Start
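
For readers who want to script the one-node-at-a-time procedure described above, here is a minimal sketch in Python. It is not the poster's actual tooling. The function name `rolling_upgrade` and the `stop`, `deploy`, `start`, `is_healthy` and `pause` callables are all hypothetical placeholders you would supply yourself. The last step in the message is cut off, so treating it as "start the process again" is an assumption, and the health check before moving to the next node is an addition that the original message does not describe.

```python
import time
from typing import Callable, Iterable, List


def rolling_upgrade(
    nodes: Iterable[str],
    stop: Callable[[str], None],
    deploy: Callable[[str], None],
    start: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_checks: int = 30,
    check_interval_s: float = 10.0,
    pause: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Upgrade nodes strictly one at a time and return those completed.

    All callables are hypothetical hooks supplied by the caller; this
    sketch does not know how your processes are stopped or deployed.
    """
    upgraded: List[str] = []
    for node in nodes:
        stop(node)    # stop the Cassandra process on this node
        deploy(node)  # deploy jars, scripts, binaries, etc.
        start(node)   # assumption: the truncated third step restarts the node

        # Assumption (not in the original message): wait until the node
        # reports healthy before touching the next one.
        for _ in range(max_checks):
            if is_healthy(node):
                break
            pause(check_interval_s)
        else:
            raise RuntimeError(
                f"{node} not healthy after {max_checks} checks; "
                f"upgraded so far: {upgraded}"
            )
        upgraded.append(node)
    return upgraded
```

Passing the steps in as callables keeps the ordering logic separate from whatever you use to reach the hosts, and aborting on the first unhealthy node leaves the rest of the ring on the old version, so you can stop and look at GC behaviour before continuing.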