Hi David,

That helps. I assumed it was non-blocking, but compaction is still a pretty disk-intensive operation while it's running. Even though it only runs on a single partition at a time, the compaction I/O load will affect the other partitions on the same disk.
-J

On Fri, Jun 11, 2010 at 12:27 PM, David Smith <diz...@basho.com> wrote:
> Well, compaction does not mean that a partition is unavailable -- that is
> to say, compaction happens in a non-blocking manner. So worst case, you'll
> have disk-related latency hits for a given partition, but requests should
> still be getting served. Also, a single node only compacts a single
> partition at a time.
>
> FWIW, in my own testing, with a 50/50 read/write mix, compaction (based on
> fragmentation %) typically doesn't happen that often, particularly when you
> have a cluster of machines.
>
> Hope that helps.
>
> D.
>
> On Fri, Jun 11, 2010 at 12:20 PM, Jason J. W. Williams
> <jasonjwwilli...@gmail.com> wrote:
>>
>> Is it smart enough to coordinate with the other partitions to ensure
>> that not more than 25% (just a plug number) of the partitions are
>> compacting at the same time? It would seem to me there's the possibility
>> of a performance drop if you had the perfect storm of too many shards
>> compacting at the same time.
>>
>> -J
>>
>> On Fri, Jun 11, 2010 at 4:54 AM, Justin Sheehy <jus...@basho.com> wrote:
>> > Hi, Germain.
>> >
>> > On Fri, Jun 11, 2010 at 11:07 AM, Germain Maurice
>> > <germain.maur...@linkfluence.net> wrote:
>> >
>> >> Because of its append-only nature, stale data is created, so how does
>> >> Bitcask remove stale data?
>> >
>> > An excellent question, and one that we haven't yet written enough about.
>> >
>> >> With CouchDB, the compaction process on our data never succeeded; there
>> >> was too much data. I really don't like having to launch this kind of
>> >> process manually.
>> >
>> > Bitcask's merging (compaction) process is automated and very tunable.
>> > These parameters are the most relevant ones in the bitcask section of
>> > your app.config:
>> >
>> > (see the whole thing at
>> > http://hg.basho.com/bitcask/src/tip/ebin/bitcask.app)
>> >
>> > %% Merge trigger variables. Files exceeding ANY of these
>> > %% values will cause bitcask:needs_merge/1 to return true.
>> > %%
>> > {frag_merge_trigger, 60},              % >= 60% fragmentation
>> > {dead_bytes_merge_trigger, 536870912}, % Dead bytes > 512 MB
>> >
>> > %% Merge thresholds. Files exceeding ANY of these values
>> > %% will be included in the list of files marked for merging
>> > %% by bitcask:needs_merge/1.
>> > %%
>> > {frag_threshold, 40},                  % >= 40% fragmentation
>> > {dead_bytes_threshold, 134217728},     % Dead bytes > 128 MB
>> > {small_file_threshold, 10485760},      % File is < 10 MB
>> >
>> > Every few minutes, the Riak storage backend for a given partition will
>> > send a message to bitcask, requesting that it queue up a possible merge
>> > job. (As a result of that queue, only one partition per node will be in
>> > the merge process at once.) The bitcask application will examine the
>> > partition when its request reaches the front of the queue. If any of
>> > the trigger values have been exceeded, then all of the files in that
>> > partition which exceed any of the threshold values will be run through
>> > compaction.
>> >
>> > This allows you a great deal of flexibility in tuning to your demands,
>> > and also provides reasonable amortization of the cost, since each
>> > partition is processed independently.
>> >
>> > -Justin
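
To make the trigger/threshold distinction in Justin's reply concrete, here is a rough, untested Erlang sketch of the decision he describes: a merge is triggered if any file in the partition exceeds any trigger value, and once triggered, every file exceeding any threshold value is included in the merge. The module name, function names, and the {Fragmentation, DeadBytes, TotalBytes} stats shape are invented for illustration; this is not the actual bitcask source.

%% Illustrative sketch only -- not bitcask internals. Assumes per-file
%% stats of the form {FragPercent, DeadBytes, TotalBytes}.
-module(merge_sketch).
-export([needs_merge/1, files_to_merge/1]).

%% Defaults quoted from bitcask.app above.
-define(FRAG_MERGE_TRIGGER,       60).         % >= 60% fragmentation
-define(DEAD_BYTES_MERGE_TRIGGER, 536870912).  % > 512 MB dead bytes
-define(FRAG_THRESHOLD,           40).         % >= 40% fragmentation
-define(DEAD_BYTES_THRESHOLD,     134217728).  % > 128 MB dead bytes
-define(SMALL_FILE_THRESHOLD,     10485760).   % file smaller than 10 MB

%% A merge is needed if ANY file exceeds ANY trigger value.
needs_merge(FileStats) ->
    lists:any(fun({Frag, Dead, _Total}) ->
                      Frag >= ?FRAG_MERGE_TRIGGER orelse
                      Dead > ?DEAD_BYTES_MERGE_TRIGGER
              end, FileStats).

%% Once a merge is triggered, every file exceeding ANY threshold
%% value is marked for merging.
files_to_merge(FileStats) ->
    [File || {Frag, Dead, Total} = File <- FileStats,
             Frag >= ?FRAG_THRESHOLD orelse
             Dead > ?DEAD_BYTES_THRESHOLD orelse
             Total < ?SMALL_FILE_THRESHOLD].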
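
And if you want merges to kick in earlier or later than those defaults, the settings go in the bitcask section of app.config as an ordinary Erlang proplist. A sketch of what that might look like (the numeric values here are arbitrary illustrations, and data_root should be whatever path your install already uses):

{bitcask, [
    {data_root, "/var/lib/riak/bitcask"},    %% use your existing data path
    %% Triggers: queue a merge once ANY file crosses one of these.
    {frag_merge_trigger, 50},                %% >= 50% fragmentation
    {dead_bytes_merge_trigger, 268435456},   %% > 256 MB dead bytes
    %% Thresholds: once triggered, include files crossing ANY of these.
    {frag_threshold, 30},                    %% >= 30% fragmentation
    {dead_bytes_threshold, 67108864},        %% > 64 MB dead bytes
    {small_file_threshold, 10485760}         %% files smaller than 10 MB
]},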