Thanks Aaron for the insight. One quick question:

>The buffers are not pre-allocated, but once they are allocated they are
>not returned. So it's only an issue if you have lots of clients connecting
>and reading a lot of data.

So, to understand you correctly: the buffer is allocated per client connection, stays allocated for the life of the JVM, and is reused for each request? If that is the case, I presume there is not much to gain from tuning this setting as far as GC optimization is concerned.

>reduce bloom filters, index intervals ...

Well, we have tried all the configs advised below (and others, such as key cache sizes) and hit a dead end, which is the reason for the move to 1.2.4.

Thanks for all your thoughts and advice on this.

Regards,
Ananth

On 6/18/13 5:56 PM, "aaron morton" <aa...@thelastpickle.com> wrote:

>> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
>This controls the max size of a buffer allocated by thrift when processing
>requests / responses. The buffers are not pre-allocated, but once they
>are allocated they are not returned. So it's only an issue if you have lots
>of clients connecting and reading a lot of data.
>
>> Our system is a very short column (both in number of columns and data sizes)
>> tables but having millions/billions of rows in each column family.
>If you have over 500 million rows per node you may be running into issues
>with the bloom filters and index samples.
>
>This typically looks like the heap usage does not reduce after CMS
>compaction has completed.
>
>Ensure the bloom_filter_fp_chance on the CFs is set to 0.01 for size
>tiered compaction and 0.1 for levelled compaction. If you need to change
>it, run nodetool upgradesstables.
>
>Then consider increasing the index_interval in the yaml file, see the
>comments.
>
>Note that v 1.2 moves the bloom filters off heap, so if you upgrade to
>1.2 it will probably resolve your issues.
>
>Cheers
>
>-----------------
>Aaron Morton
>Freelance Cassandra Consultant
>New Zealand
>
>@aaronmorton
>http://www.thelastpickle.com
>
>On 18/06/2013, at 7:30 PM, Ananth Gundabattula
><agundabatt...@threatmetrix.com> wrote:
>
>> We are currently running on 1.1.10 and planning to migrate to a higher
>> version 1.2.4.
>>
>> The question pertains to tweaking all the knobs to reduce GC related issues
>> ( we have been fighting a lot of really bad GC issues on 1.1.10 and met with
>> little success all the way using 1.1.10)
>>
>> Taking into consideration GC tuning is a black art, I was wondering if we
>> can have some good effect on the GC by tweaking the following settings:
>>
>> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
>>
>> Our system is a very short column (both in number of columns and data sizes)
>> tables but having millions/billions of rows in each column family. The
>> typical number of columns in each column family is 4. The typical lookup
>> involves specifying the row key and fetching one column most of the times.
>> The writes are also similar except for one keyspace where the number of
>> columns are 50 but very small data sizes per column.
>>
>> Assuming we can tweak the config values :
>>
>> *thrift_framed_transport_size_in_mb* &
>> *thrift_max_message_length_in_mb*
>>
>> to lower values in the above context, I was wondering if it helps in the GC
>> being invoked less if the thrift settings reflect our data model reads and
>> writes ?
>>
>> For example: What is the impact by reducing the above config values on the
>> GC to say 1 mb rather than say 15 or 16 ?
>>
>> Thanks a lot for your inputs and thoughts.
>>
>>
>> Regards,
>> Ananth
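
As a rough illustration of the bloom filter point in the quoted reply, here is a back-of-the-envelope sketch of how much memory a bloom filter might need at 500 million rows per node. It uses the textbook optimal bloom filter formula, not Cassandra's actual sizing code, and the function names are made up for this example. It also assumes each row key is counted once, whereas in practice a key can appear in several SSTables, so treat the result as an order-of-magnitude estimate only.

```python
import math


def bits_per_key(fp_chance):
    """Bits per key for an ideal bloom filter at the given false-positive rate.

    Textbook formula: m/n = -ln(p) / (ln 2)^2.
    This is an approximation, not Cassandra's own sizing logic.
    """
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_filter_bytes(rows, fp_chance):
    """Approximate filter size in bytes for `rows` distinct keys."""
    return rows * bits_per_key(fp_chance) / 8


if __name__ == "__main__":
    rows = 500_000_000  # row count per node mentioned in the thread
    for fp_chance in (0.01, 0.1):
        mib = bloom_filter_bytes(rows, fp_chance) / (1024 ** 2)
        print(f"fp_chance={fp_chance}: about {mib:.0f} MiB")
```

Under these assumptions the estimate comes to roughly 570 MiB at 0.01 and about half that at 0.1, which is why keeping the filters on the heap (as in 1.1.x) can put visible pressure on the GC at this row count.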