Re: What is the effect of reducing the thrift message sizes on GC

aaron morton Tue, 18 Jun 2013 00:58:11 -0700

> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
These control the max size of a buffer allocated by thrift when processing 
requests / responses. The buffers are not pre-allocated, but once they are 
allocated they are not returned. So it's only an issue if you have lots of 
clients connecting and reading a lot of data.
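
To put a rough number on that, here is a minimal sketch of the worst case. It 
assumes one buffer per client connection and that every buffer has grown to the 
full frame size; both are simplifying assumptions, and the function name and 
the 200 connection figure are made up for illustration.

```python
def worst_case_thrift_buffer_mb(connections: int, frame_size_mb: int) -> int:
    """Upper bound on heap held by thrift buffers, in MB.

    Assumes one buffer per connection, each grown to the full frame size
    and never returned. Real usage is normally far lower, because buffers
    only grow as large as the requests / responses they have handled.
    """
    return connections * frame_size_mb


if __name__ == "__main__":
    connections = 200  # hypothetical number of client connections
    for frame_mb in (15, 1):
        bound = worst_case_thrift_buffer_mb(connections, frame_mb)
        print(f"frame size {frame_mb} MB -> at most {bound} MB retained")
```

With single column reads of small values the buffers should never get near the 
limit, so lowering the limit changes the bound rather than the actual usage.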


> Our system is a very short column (both in number of columns and data sizes
> ) tables but having millions/billions of rows in each column family.
If you have over 500 million rows per node you may be running into issues with 
the bloom filters and index samples. 

This typically shows up as heap usage that does not drop after a CMS 
collection has completed. 

Ensure bloom_filter_fp_chance on the CFs is set to 0.01 for size tiered 
compaction and 0.1 for levelled compaction. If you need to change it, run 
nodetool upgradesstables so existing SSTables are rewritten with the new value.
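
To see why the false positive chance matters at this row count, here is a rough 
sketch using the textbook sizing formula for an optimally configured bloom 
filter, bits per key = -ln(p) / (ln 2)^2. This is an approximation: it is not 
taken from the Cassandra source, and the real filters may be sized somewhat 
differently. The function names are made up for illustration.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an ideal bloom filter with the given false positive chance."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_filter_mb(rows: int, fp_chance: float) -> float:
    """Approximate bloom filter size in MB for `rows` keys (textbook estimate)."""
    return rows * bloom_bits_per_key(fp_chance) / 8 / 1024 ** 2


if __name__ == "__main__":
    rows = 500_000_000  # rows per node, the threshold mentioned above
    for fp_chance in (0.01, 0.1):
        size = bloom_filter_mb(rows, fp_chance)
        print(f"fp_chance={fp_chance}: roughly {size:.0f} MB of bloom filter")
```

Under this estimate, going from 0.01 to 0.1 halves the filter size, at the cost 
of more unnecessary SSTable reads.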

Then consider increasing index_interval in the yaml file; see the comments there. 
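
As a rough illustration of what a larger interval buys, the sketch below 
assumes one index sample is kept on heap per index_interval rows. The 128 
starting value is what I believe ships in the 1.1 yaml, so check your own file. 
The 64 bytes per sample is a made up placeholder, since the real cost depends 
on key size and object overhead.

```python
def index_sample_count(rows: int, index_interval: int) -> int:
    """Number of index samples held in memory, assuming one per `index_interval` rows."""
    return rows // index_interval


def index_sample_mb(rows: int, index_interval: int, bytes_per_sample: int) -> float:
    """Approximate heap used by index samples in MB; `bytes_per_sample` is a guess."""
    return index_sample_count(rows, index_interval) * bytes_per_sample / 1024 ** 2


if __name__ == "__main__":
    rows = 500_000_000     # rows per node, as in the bloom filter discussion
    bytes_per_sample = 64  # hypothetical: key bytes plus object overhead
    for interval in (128, 256, 512):
        samples = index_sample_count(rows, interval)
        size = index_sample_mb(rows, interval, bytes_per_sample)
        print(f"index_interval={interval}: {samples} samples, roughly {size:.0f} MB")
```

Doubling the interval halves the number of samples, and reads then scan more 
of the index file to find a row, so it is a memory versus read latency trade.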

Note that v 1.2 moves the bloom filters off heap, so if you upgrade to 1.2 it 
will probably resolve your issues. 

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/06/2013, at 7:30 PM, Ananth Gundabattula <agundabatt...@threatmetrix.com> 
wrote:

> We are currently running on 1.1.10 and planning to migrate to a higher
> version 1.2.4.
> 
> The question pertains to tweaking all the knobs to reduce GC related issues
> ( we have been fighting a lot of really bad GC issues on 1.1.10 and met with 
> little
> success all the way using 1.1.10)
> 
> Taking into consideration GC tuning is a black art, I was wondering if we
> can have some good effect on the GC by tweaking the following settings:
> 
> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
> 
> Our system is a very short column (both in number of columns and data sizes
> ) tables but having millions/billions of rows in each column family. The 
> typical
> number of columns in each column family is 4. The typical lookup involves
> specifying the row key and fetching one column most of the times. The
> writes are also similar except for one keyspace where the number of columns
> are 50 but very small data sizes per column.
> 
> Assuming we can tweak the config values:
> 
> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
> 
> to lower values in the above context, I was wondering if it helps in the GC
> being invoked less if the thrift settings reflect our data model reads and 
> writes ?
> 
> For example: What is the impact by reducing the above config values on the
> GC to say 1 mb rather than say 15 or 16 ?
> 
> Thanks a lot for your inputs and thoughts.
> 
> 
> Regards,
> Ananth
