Thanks Brian. This is very helpful and informative!

My only question: we do indeed have data files larger than 2GB that haven't
been merged. As of today, I have a data file at 2.1G. The last merge happened
Jan 20, which was two days ago.

Anything we are missing here?


On Sat, Jan 19, 2013 at 2:01 PM, Brian Sparrow <bspar...@basho.com> wrote:

>  Hi Ian,
>
> Q: Why does this happen on a restart only and not at other times, even
> though we have set our merge_window config setting to 'always'?
>
> A: The merge_window setting defaults to always and that simply means that
> bitcask will merge anytime the merge_triggers and thresholds set in the
> app.config are satisfied on a closed bitcask file. A bitcask data file is
> closed when it reaches the max_file_size threshold set in
> app.config (default of 2GB) or when the node is stopped. Merging is happening on
> restart because this file is now closed and eligible for merge. You are not
> seeing merging during normal operation because the files are not reaching
> the 2GB max_file_size.
>
> Q: I'm assuming our config values are correct as the restart considers the
> thresholds met to go ahead with a merge.
>
> A: I do not see anything wrong with your configuration, except that no
> max_file_size is set. If you want to merge more often, reduce
> max_file_size to 1GB or lower (the lower the file size, the more merging). A
> good way to estimate the appropriate size is to look at how large your
> .data files are getting in your bitcask data directory and decide from
> there what size threshold you would like to set based on your disk usage
> needs.
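
One way to do the "look at how large your .data files are getting" step is a small script. This is only a minimal sketch: the function name data_file_sizes is made up for illustration, and it assumes the data files live somewhere under the data_root directory and have names ending in ".data".

```python
import os


def data_file_sizes(data_root):
    """Return (size_in_bytes, path) for every .data file under data_root.

    Hypothetical helper, not part of Riak or bitcask. Results are sorted
    largest first so the biggest data files show up at the top.
    """
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(data_root):
        for name in filenames:
            if name.endswith(".data"):
                path = os.path.join(dirpath, name)
                sizes.append((os.path.getsize(path), path))
    return sorted(sizes, reverse=True)
```

Calling data_file_sizes("/var/lib/riak/bitcask") with the data_root from the config quoted further down, and printing the first few entries, should give a rough idea of where a lower max_file_size would start closing files.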
>
> Q: Also: Are there any tools out there that I can run on my data directory
> that will tell me the size of the dead byte ratios? Would be nice to see
> what the dead byte 'state' of my data dir is in so I can tell whether
> indeed merge conditions are met.
>
> A: The command you are looking for is riak-admin vnode-status. This will
> list all partitions on the local node as well as the number of keys in each
> partition and a list of closed bitcask data files in the format:
>
> {CLOSE_DATA_FILE_NAME, FRAG_PERCENTAGE, DEAD_BYTES, TOTAL_FILE_SIZE}
>
> Again, a data file must be closed (by reaching max_file_size or by a node
> stop) to be reported in this command.
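
To check by hand whether the reported files satisfy the merge settings, the four values per file can be compared against the config. The sketch below is an illustration under assumptions, not bitcask's actual code: the names FileStatus, triggers_merge, included_in_merge and merge_plan are invented here, the ">=" and "<" comparisons are my reading of how the triggers and thresholds behave, and expiry_secs is ignored. The constants are the values from the app.config quoted further down.

```python
from typing import NamedTuple


class FileStatus(NamedTuple):
    """One closed data file, mirroring the tuple format shown above."""
    name: str
    frag_percent: int
    dead_bytes: int
    total_bytes: int


# Values copied from the bitcask section of the app.config in this thread.
FRAG_MERGE_TRIGGER = 40
DEAD_BYTES_MERGE_TRIGGER = 268435456
FRAG_THRESHOLD = 20
DEAD_BYTES_THRESHOLD = 134217728
SMALL_FILE_THRESHOLD = 10485760


def triggers_merge(f):
    # Assumption: a single file at or above either trigger starts a merge.
    return (f.frag_percent >= FRAG_MERGE_TRIGGER
            or f.dead_bytes >= DEAD_BYTES_MERGE_TRIGGER)


def included_in_merge(f):
    # Assumption: once a merge runs, a file is included if it meets any
    # threshold or is smaller than small_file_threshold.
    return (f.frag_percent >= FRAG_THRESHOLD
            or f.dead_bytes >= DEAD_BYTES_THRESHOLD
            or f.total_bytes < SMALL_FILE_THRESHOLD)


def merge_plan(files):
    """Return the names of files a merge would include, or [] if none triggers."""
    if not any(triggers_merge(f) for f in files):
        return []
    return [f.name for f in files if included_in_merge(f)]
```

If merge_plan returns an empty list for the closed files on a node, no trigger is met under these assumptions, and the file still open for writes would not appear in the input at all.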
>
> Hope this answers all your questions.
>
> Thanks!
>
>
> --
> Brian Sparrow
> Customer Service Engineer
> Basho Technologies
>
> On Friday, January 18, 2013 at 5:14 PM, Ian Ha wrote:
>
> Hi,
>
> In our production system, we notice that merges are not taking place. We
> have noticed, however, that when we restart riak (via a 'riak stop' then a
> 'riak start'), then the merges are triggered and our disk usage goes way
> down (which is what we want). We use bitcask.
>
> Why does this happen on a restart only and not at other times, even though
> we have set our merge_window config setting to 'always'?
>
> I'm assuming our config values are correct as the restart considers the
> thresholds met to go ahead with a merge.
>
> Also: Are there any tools out there that I can run on my data directory
> that will tell me the size of the dead byte ratios? Would be nice to see
> what the dead byte 'state' of my data dir is in so I can tell whether
> indeed merge conditions are met.
>
> Our bitcask configs are as follows in app.config:
>
> {bitcask, [
>                 {data_root, "/var/lib/riak/bitcask"},
>                 {dead_bytes_merge_trigger, 268435456},
>                 {dead_bytes_threshold, 134217728},
>                 {expiry_secs, 3888000},
>                 {frag_merge_trigger, 40},
>                 {frag_threshold, 20},
>                 {merge_window, always},
>                 {small_file_threshold, 10485760},
>                 {sync_strategy, none}
>         ]},
>
>
> Thanks!
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
>