>> I really don't think I have more than 500 million rows ... any smart way
>> to count the number of rows inside the KS?

Use the output from nodetool cfstats; it has a row count and bloom filter
size for each CF.
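As a rough cross-check, the bloom filter size that cfstats reports can itself be turned into a per-node key estimate: a filter built for false-positive chance p uses about -ln(p)/(ln 2)^2 bits per key. A minimal Python sketch (the 900 MB figure is only an illustrative input, not a number from this thread):

```python
import math

def rows_from_bloom_filter(filter_bytes, fp_chance):
    # A bloom filter targeting false-positive chance p uses roughly
    # -ln(p) / (ln 2)^2 bits per key, so size / bits-per-key ~ key count.
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    return int(filter_bytes * 8 / bits_per_key)

# ~900 MB of bloom filter at the old default fp chance of 0.000744
# works out to roughly half a billion keys on that node.
print(rows_from_bloom_filter(900 * 1024 ** 2, 0.000744))
```

This is only an estimate (it ignores rounding in the actual filter implementation), but it is enough to tell 50 million rows from 500 million.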
You may also want to upgrade to 1.1 to get global cache management, which
can make things easier to manage.

Cheers

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/07/2013, at 6:26 AM, Nate McCall <zznat...@gmail.com> wrote:

> Do you have a copy of the specific stack trace? Given the version and
> CL behavior, one thing you may be experiencing is:
> https://issues.apache.org/jira/browse/CASSANDRA-4578
>
> On Mon, Jul 22, 2013 at 7:15 AM, cbert...@libero.it <cbert...@libero.it> wrote:
>
>> Hi Aaron, thanks for your help.
>>
>>> If you have more than 500 million rows you may want to check the
>>> bloom_filter_fp_chance; the old default was 0.000744 and the new
>>> (post 1.) number is 0.01 for size-tiered.
>>
>> I really don't think I have more than 500 million rows ... any smart way
>> to count the number of rows inside the KS?
>>
>>>> Now a question -- why, with 2 nodes offline, did all my applications
>>>> stop providing the service, even when a Consistency Level ONE read was
>>>> invoked?
>>
>>> What error did the client get, and what client are you using?
>>> It also depends on if/how the node fails. The later versions try to
>>> shut down when there is an OOM; not sure what 1.0 does.
>>
>> The exception was a TTransportException -- I am using the Pelops client.
>>
>>> If the node went into a zombie state the clients may have been timing
>>> out. They should then move on to another node.
>>> If it had started shutting down, the client should have gotten some
>>> immediate errors.
>>
>> It didn't shut down, it was more like in a zombie state.
>> One more question: I'm experiencing some wrong counters (which are very
>> important in my platform, since they are used to keep user points and
>> generate the TopX users) -- could it be related to this problem? The
>> problem is that for some users (not all) the counter column increased
>> its value.
>>
>> After such a crash in 1.0, is there any best practice to follow?
>> (nodetool or something?)
>>
>> Cheers,
>> Carlo
>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Cassandra Consultant
>>> New Zealand
>>>
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 19/07/2013, at 5:02 PM, cbert...@libero.it wrote:
>>>
>>>> Hi all,
>>>> I'm experiencing some problems after 3 years of Cassandra in production
>>>> (from 0.6 to 1.0.6) -- twice in 3 weeks, 2 nodes crashed with an
>>>> OutOfMemory exception.
>>>> In the log I can read the warning about the low heap available ... now
>>>> I'm increasing my RAM a little, my Java heap (1/4 of the RAM), and
>>>> reducing the size of rows and the memtable thresholds. Other tips?
>>>>
>>>> Now a question -- why, with 2 nodes offline, did all my applications
>>>> stop providing the service, even when a Consistency Level ONE read was
>>>> invoked?
>>>> I'd expected this behaviour:
>>>>
>>>> - CL1 operations keep working
>>>> - more than 80% of CLQ operations working (the nodes offline were 2
>>>>   and 5; in a clockwise key distribution, only writes to the fifth
>>>>   node should impact node 2)
>>>> - most of all CLALL operations (which I don't use) failing
>>>>
>>>> The situation instead was that ALL my services stopped responding,
>>>> throwing a TTransportException ...
>>>>
>>>> Thanks in advance
>>>>
>>>> Carlo
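Carlo's expectation in that last message is right in principle: availability per consistency level depends only on how many of a row's replicas are alive. A toy check (a sketch of the replica arithmetic, not Cassandra's actual code; all names are made up):

```python
def can_serve(level, replication_factor, live_replicas):
    """How many of a row's replicas must respond at each consistency level."""
    required = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,  # majority of replicas
        "ALL": replication_factor,
    }[level]
    return live_replicas >= required

# With RF=3 and two of a row's three replicas down, ONE should still
# succeed while QUORUM and ALL fail. So a TTransportException on every
# request points at the client/connection layer (e.g. a zombie coordinator
# that accepts connections but never answers), not at the replica count.
print(can_serve("ONE", 3, 1), can_serve("QUORUM", 3, 1), can_serve("ALL", 3, 1))
# -> True False False
```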
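On the wrong counters Carlo mentions: one well-known way counter columns drift upward is that counter increments are not idempotent, so a client that blindly retries a timed-out add can apply it twice when the first write actually landed but the ack was lost. A toy model of that failure mode (illustrative only; this is not Pelops or Cassandra code):

```python
class FlakyCounterStore:
    """Applies every increment, but can 'lose' the acknowledgement."""

    def __init__(self):
        self.counts = {}
        self.drop_next_ack = False

    def add(self, key, delta):
        self.counts[key] = self.counts.get(key, 0) + delta
        if self.drop_next_ack:
            self.drop_next_ack = False
            return False  # looks like a timeout to the client
        return True

def client_add(store, key, delta):
    if not store.add(key, delta):  # timeout -> blind retry...
        store.add(key, delta)      # ...but the first write already landed

store = FlakyCounterStore()
store.drop_next_ack = True
client_add(store, "user:42:points", 10)
print(store.counts["user:42:points"])  # 20 -- the counter over-counted
```

This matches the symptom of counters being too high for only some users (the ones whose increments hit a timeout during the node failures), though it is only one possible explanation.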