Re: 2.0.10 to 2.0.11 upgrade and immediate ParNew and CMS GC storm

Alain RODRIGUEZ Mon, 29 Dec 2014 02:31:55 -0800

Hi,

Sorry about the gravedigging, but what would be a good starting value when
tuning "rpc_max_threads"?


I mean, the default is unlimited and the commented-out value is 2048. The
native protocol seems to allow only 128 simultaneous threads. Should I stick
with 2048, try something closer to 128, or use something else entirely?
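
For reference, a minimal sketch of the cassandra.yaml settings in question,
assuming the hsha server type and taking the commented-out sample value as a
starting point. The value shown is illustrative, not a tested recommendation:

```yaml
# Thrift RPC server type: "sync" (one thread per connection) or "hsha".
rpc_server_type: hsha

# Shipped commented out, which leaves the thread pool unbounded.
# With hsha on 2.0.11 an explicit bound avoids allocating an unbounded
# number of threads at startup (see CASSANDRA-8116).
rpc_max_threads: 2048
```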

About HSHA, I have tried this mode from time to time since C* 0.8 and have
always hit the error "ERROR 12:02:18,971 Read an invalid frame size of 0. Are
you using TFramedTransport on the client side?". I haven't tried it for a
while (a year, maybe). Has this been fixed, or is it somehow due to my
configuration?

C*heers

Alain

2014-10-29 16:07 GMT+01:00 Peter Haggerty <peter.hagge...@librato.com>:

> That definitely appears to be the issue. Thanks for pointing that out!
>
> https://issues.apache.org/jira/browse/CASSANDRA-8116
> It looks like 2.0.12 will check for the default and throw an exception
> (thanks Mike Adamson) and also includes a bit more text in the config
> file but I'm thinking that 2.0.12 should be pushed out sooner rather
> than later as anyone using hsha and the default settings will simply
> have their cluster stop working a few minutes after the upgrade and
> without any indication of the actual problem.
>
>
> Peter
>
>
> On Wed, Oct 29, 2014 at 5:23 AM, Duncan Sands <duncan.sa...@gmail.com>
> wrote:
> > Hi Peter, are you using the hsha RPC server type on this node?  If you
> > are, then it looks like rpc_max_threads threads will be allocated on
> > startup in 2.0.11 while this wasn't the case before.  This can exhaust
> > your heap if the value of rpc_max_threads is too large (e.g. if you use
> > the default).
> >
> > Ciao, Duncan.
> >
> >
> > On 29/10/14 01:08, Peter Haggerty wrote:
> >>
> >> On a 3 node test cluster we recently upgraded one node from 2.0.10 to
> >> 2.0.11. This is a cluster that had been happily running 2.0.10 for
> >> weeks and that has very little load and very capable hardware. The
> >> upgrade was just your typical package upgrade:
> >>
> >> $ dpkg -s cassandra | egrep '^Ver|^Main'
> >> Maintainer: Eric Evans <eev...@apache.org>
> >> Version: 2.0.11
> >>
> >> Immediately after starting, it ran a couple of ParNews and then started
> >> executing CMS runs. In 10 minutes the node had become unreachable and
> >> was marked as down by the two other nodes in the ring, which are still
> >> 2.0.10.
> >>
> >> We have jstack output and the server logs but nothing seems to be
> >> jumping out. Has anyone else run into this? What should we be looking
> >> for?
> >>
> >>
> >> Peter
> >>
> >
>
