How about this?

#!/bin/sh
...
for i in data/expiring_bitcask/* ; do
$ERTS_PATH/to_erl $PIPE_DIR<<EOF
bitcask:merge("$i/",[{dead_bytes_merge_trigger, 0},{dead_bytes_threshold, 0},{small_file_threshold, 16#80000000},{expiry_secs, 86400}]).
EOF
done

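A dry-run variant of the loop above may be useful before pointing it at a live node. This is only a sketch under the same directory layout: it prints the bitcask:merge/2 expressions instead of sending them to to_erl, and it skips anything under the data root that is not a directory. The function names merge_expr and list_merge_exprs are made up for illustration; they are not part of Riak or Bitcask.

```shell
#!/bin/sh
# Sketch only: merge_expr and list_merge_exprs are hypothetical helpers.
# The merge options are copied from the script in the post.

# Print the bitcask:merge/2 expression for one partition directory.
merge_expr() {
    printf 'bitcask:merge("%s/",[{dead_bytes_merge_trigger, 0},{dead_bytes_threshold, 0},{small_file_threshold, 16#80000000},{expiry_secs, 86400}]).\n' "$1"
}

# Print one expression per subdirectory of the data root given as $1.
list_merge_exprs() {
    for dir in "$1"/*; do
        # Skip plain files, and the unexpanded glob when the root is empty.
        [ -d "$dir" ] || continue
        merge_expr "$dir"
    done
}
```

Running list_merge_exprs data/expiring_bitcask shows exactly what would be typed into the Erlang console, one line per partition, so the list can be checked before it is fed to to_erl the way the original loop does.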
I have a backend (part of a multi_backend) used exclusively for storing data with a 9-hour TTL (and so my config expires the data after 24 hours). K/Vs are seldom deleted, so merging during my nightly merge window only slightly compacts the data files into nice, huge, zero-dead-byte files that will never trigger a merge again on their own. They are still included in merges triggered by new files, since they fall under the small_file_threshold (which is huge for this reason).

From app.config:

    {<<"expiring_bitcask">>, riak_kv_bitcask_backend, [
        {expiry_secs, 86400},
        {max_file_size, 134217728},
        {dead_bytes_merge_trigger, 10485760},
        {dead_bytes_threshold, 10485760},
        {small_file_threshold, 16#80000000},
        {merge_window, {1, 5}},
        {data_root, "data/expiring_bitcask"}

I use dead-byte trigger and threshold values of 0 for manual merges in order to force a merge, and 10 MB values in the config to avoid continual merging during my overnight window.

Does this make sense? I'd have thought my app.config would keep the bitcasks small, but it seems they just grow bigger and never get cleared of much expired data. The manual config I use above immediately shrinks the data files dramatically.

Any suggestions for a cleaner approach?

-Nate

________________________________
From: riak-users-boun...@lists.basho.com [riak-users-boun...@lists.basho.com] On Behalf Of Anthony Molinaro [antho...@alumni.caltech.edu]
Sent: Monday, August 22, 2011 10:18 AM
To: Dan Reverri
Cc: raghwani sohil; riak-users@lists.basho.com
Subject: Re: Riak Bitcask merging

While I didn't ask this time, I'll explain why I think manual merging as an option would be great.

As far as I know, specifying a merge window doesn't guarantee that merging happens, only that it might if other thresholds are met. With our Cassandra cluster we've ended up scheduling twice-weekly full compactions (the closest equivalent to merging, I believe) via cron.
The days and times are specifically chosen based on traffic patterns, and can be changed without restarting the servers.

We don't have this convenience with Riak. We can set it up to only merge during a window, but can't guarantee everything was merged or even that any merges will occur. If I wanted to change when I do the merging, I would have to restart the servers to pick up the new config (at least I think I would). I have no way to stagger the merging (have different nodes merge at different times), unless I have a slightly different config on each node.

If there were a riak-admin command called merge/compact/cleanup/expire or something which triggered a manual merge, I know we would use it.

And while I'm pining for additional command line tools, any idea if transfers will ever work without disrupting the actual transfers? It's sort of annoying that it's listed as one of the steps for a rolling upgrade/restart, but if you actually use it, it can cause the upgrade or startup to take longer. Also, it tends to time out on a highly trafficked cluster.

Anyway, sorry about hijacking someone else's question, but I figured more information from users is usually welcome?

-Anthony

On Mon, Aug 22, 2011 at 09:14:06AM -0700, Dan Reverri wrote:
> There is no way to manually trigger a Bitcask merge. What's the use case for
> needing to manually trigger the merge? Are you concerned about the size of
> the data files? Are you trying to avoid merging at a particular time?
>
> Not sure if this will help but you can restrict Bitcask merging to a
> specified window of time:
> http://wiki.basho.com/Bitcask-Configuration.html#Merge-Window
>
> Thanks,
> Dan
>
> Daniel Reverri
> Developer Advocate
> Basho Technologies, Inc.
> d...@basho.com
>
>
> On Sun, Aug 21, 2011 at 11:36 PM, raghwani sohil <sohil4...@gmail.com> wrote:
>
> > Hi All,
> >
> > Is there any way to run the bitcask merging process manually?
> >
> > thanks,
> > Sohil Raghwani.
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

--
Anthony Molinaro <antho...@alumni.caltech.edu>
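
On the staggering point raised in the thread (different nodes merging at different times without a different config on each node), one possible approach is to derive a start hour from the node name, so the same cron-driven script can be deployed everywhere. The sketch below is an assumption, not an existing Riak feature: stagger_hour is a hypothetical helper, and it relies only on the POSIX cksum and cut utilities.

```shell
#!/bin/sh
# Sketch only: stagger_hour is a hypothetical helper, not a Riak tool.
# $1 = node name, $2 = first hour of the window, $3 = window length in hours.
# Prints an hour in [$2, $2 + $3 - 1] that is stable for a given node name.
stagger_hour() {
    # cksum prints "<checksum> <byte count>"; keep only the checksum.
    sum=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo $(( $2 + $sum % $3 ))
}
```

For example, stagger_hour "$(hostname)" 1 4 picks one of the hours 1 through 4 for each node. A script run hourly from cron could compare that value with the current hour and only then run the manual merge loop from the start of the thread. Nodes whose names hash to the same value will still merge together, so this spreads load rather than guaranteeing distinct slots.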