Morning Rusty,

Thanks very much for your time and trouble.  Great info, very helpful, and very 
timely!

Regards,

--gordon


On May 24, 2011, at 16:32, Rusty Klophaus wrote:

Hi Gordon,

I have limited knowledge of configuring Innostore but can help answer some of 
your merge_index questions.

The most important merge_index setting in terms of memory usage is 
'buffer_rollover_size'. This affects how large the buffer is allowed to grow, 
in bytes, before getting converted to an on-disk segment. Each partition 
maintains a separate buffer, so any increases to this number will be multiplied 
by the number of partitions in your system. The higher this number, the less 
frequently merge_index will need to perform compactions.
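
As a rough illustration of that multiplication, here is a minimal Python sketch. 
The function name and the example values (64 partitions, a 1 MiB rollover size) 
are hypothetical and not taken from any real configuration; it only shows how 
the per-partition buffers add up.

```python
def total_buffer_bytes(partitions: int, buffer_rollover_size: int) -> int:
    """Approximate upper bound on memory held by merge_index buffers.

    Each partition keeps its own buffer, and each buffer can grow to
    roughly buffer_rollover_size bytes before it is written to disk.
    """
    return partitions * buffer_rollover_size


# Hypothetical example: 64 partitions, 1 MiB rollover size.
MIB = 1024 * 1024
print(total_buffer_bytes(64, 1 * MIB))  # 67108864 bytes, i.e. 64 MiB
```

Plug in the number of partitions for whatever scope you are budgeting memory 
for.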

The second most important settings for memory usage are a combination of 
'segment_full_read_size' and 'max_compact_segments'. During compaction, the 
system will completely page any segments smaller than the 
'segment_full_read_size' value into memory. This should generally be as large 
as or larger than the 'buffer_rollover_size'. The higher this number, the quicker 
each compaction will be. 'max_compact_segments' is the maximum number of 
segments to compact at one time. The higher this number, the more segments 
merge_index can involve in each compaction. In the worst case, a compaction 
could take ('segment_full_read_size' * 'max_compact_segments') bytes of RAM.
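
The worst-case figure above is simple arithmetic, sketched here in Python. The 
function name and the example values (5 MiB and 20 segments) are made up for 
illustration and are not recommended settings.

```python
def worst_case_compaction_bytes(segment_full_read_size: int,
                                max_compact_segments: int) -> int:
    """Worst-case RAM for a single compaction.

    Assumes every segment in the compaction is just under
    segment_full_read_size and is therefore paged fully into memory.
    """
    return segment_full_read_size * max_compact_segments


# Hypothetical example: 5 MiB full-read size, up to 20 segments at once.
MIB = 1024 * 1024
print(worst_case_compaction_bytes(5 * MIB, 20))  # 104857600 bytes, i.e. 100 MiB
```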

The rest of the settings have a much smaller impact on performance and memory 
usage, and exist mainly for tweaking and special cases.

This is a completely unscientific estimate based on observing other Riak Search 
applications, but I'd set buffer_rollover_size so that (# Partitions * 
buffer_rollover_size) is about one-half the memory you wish for merge_index to 
consume, hopefully somewhere between 1M and 10M. The rest of the memory will be 
used by in-memory offset tables, compaction processes, and during query 
operations.
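
To make that rule of thumb concrete, here is a small Python sketch that works 
backwards from a memory budget. The function name and the example numbers (a 
512 MiB budget, 64 partitions) are hypothetical, and the range check assumes 
the 1M to 10M figure refers to buffer_rollover_size itself. Treat it as the 
same unscientific estimate, not a formula.

```python
MIB = 1024 * 1024


def suggested_rollover_size(merge_index_budget_bytes: int, partitions: int) -> int:
    """Pick buffer_rollover_size so the buffers use about half the budget.

    The other half is left for in-memory offset tables, compaction
    processes, and query operations.
    """
    return (merge_index_budget_bytes // 2) // partitions


# Hypothetical example: 512 MiB set aside for merge_index, 64 partitions.
size = suggested_rollover_size(512 * MIB, 64)
print(size)                          # 4194304 bytes, i.e. 4 MiB
print(1 * MIB <= size <= 10 * MIB)   # True: within the 1M to 10M range
```

If the result falls well outside that range, it is probably a sign that the 
budget or the partition count deserves a second look.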

Hope that helps.

Best,
Rusty


On Mon, May 23, 2011 at 2:05 PM, Gordon Tillman 
<gtill...@mezeo.com> wrote:
Greetings!

We are working with a riaksearch cluster that uses innostore as the primary 
backend in tandem with merge_index, which is required by search.  From reading 
the Basho wiki it looks like the following are the most important factors 
affecting memory and performance:

       • innostore
               • put data_home_dir and log_group_home_dir on different spindles
               • noatime
               • buffer_pool_size
               • flush_method
       • merge_index
               • data_root
               • buffer_rollover_size
               • max_compact_segments
               • segment_file_buffer_size
               • segment_full_read_size
               • segment_block_size

Ideally, data_home_dir, log_group_home_dir, and data_root would all be on 
different spindles, but if you had just 2 disks available what would you 
recommend?  Would it be best to have data_home_dir and data_root on one and 
then log_group_home_dir on the other?

In calculating the proper setting for buffer_pool_size you are directed to 
allocate 60-80 percent of available RAM.  So let's assume you want to take the 
remaining 20-40% of available RAM and split it up between innostore and 
merge_index?

Would it be best to give each of them half of that value?

Determining the approximate memory requirements for merge_index isn't (to me) 
real obvious.  It looks like the following all have an effect:

 * buffer_rollover_size
 * buffer_delayed_write_size
 * max_compact_segments
 * segment_query_read_ahead_size
 * segment_compaction_read_ahead_size
 * segment_full_read_size
 * segment_block_size
 * segment_values_staging_size

Is there a formula for determining the (approximate) proper values to use given 
a certain amount of available RAM?

Thanks in advance for any advice.  Sorry for all the questions!

--gordon



_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
