Hi,

Sorry about the gravedigging, but what would be a good starting value for tuning "rpc_max_threads"?
I mean, the default is unlimited, and the commented-out value is 2048. The native protocol seems to allow only 128 simultaneous threads. Should I stick with 2048, try something closer to 128, or use something else entirely?

About HSHA: I have tried this mode from time to time since C* 0.8 and always hit this error:

"ERROR 12:02:18,971 Read an invalid frame size of 0. Are you using TFramedTransport on the client side?"

I haven't tried it for a while (a year, maybe). Has this been fixed, or is it somehow due to my configuration?

C*heers

Alain

2014-10-29 16:07 GMT+01:00 Peter Haggerty <peter.hagge...@librato.com>:

> That definitely appears to be the issue. Thanks for pointing that out!
>
> https://issues.apache.org/jira/browse/CASSANDRA-8116
> It looks like 2.0.12 will check for the default and throw an exception
> (thanks Mike Adamson) and also includes a bit more text in the config
> file but I'm thinking that 2.0.12 should be pushed out sooner rather
> than later as anyone using hsha and the default settings will simply
> have their cluster stop working a few minutes after the upgrade and
> without any indication of the actual problem.
>
>
> Peter
>
>
> On Wed, Oct 29, 2014 at 5:23 AM, Duncan Sands <duncan.sa...@gmail.com>
> wrote:
> > Hi Peter, are you using the hsha RPC server type on this node? If you are,
> > then it looks like rpc_max_threads threads will be allocated on startup in
> > 2.0.11 while this wasn't the case before. This can exhaust your heap if the
> > value of rpc_max_threads is too large (eg if you use the default).
> >
> > Ciao, Duncan.
> >
> >
> > On 29/10/14 01:08, Peter Haggerty wrote:
> >>
> >> On a 3 node test cluster we recently upgraded one node from 2.0.10 to
> >> 2.0.11. This is a cluster that had been happily running 2.0.10 for
> >> weeks and that has very little load and very capable hardware. The
> >> upgrade was just your typical package upgrade:
> >>
> >> $ dpkg -s cassandra | egrep '^Ver|^Main'
> >> Maintainer: Eric Evans <eev...@apache.org>
> >> Version: 2.0.11
> >>
> >> Immediately after started it ran a couple of ParNews and then started
> >> executing CMS runs. In 10 minutes the node had become unreachable and
> >> was marked as down by the two other nodes in the ring, which are still
> >> 2.0.10.
> >>
> >> We have jstack output and the server logs but nothing seems to be
> >> jumping out. Has anyone else run into this? What should we be looking
> >> for?
> >>
> >>
> >> Peter