http://basho.com/introducing-riak-1-3/
Introduced Active Anti-Entropy. Riak now has active anti-entropy. In distributed systems, inconsistencies can arise between replicas due to failure modes, concurrent updates, and physical data loss or corruption. Pre-1.3 Riak already had several features for repairing this “entropy”, but they all required some form of user intervention. Riak 1.3 introduces automatic, self-healing properties that repair entropy on an ongoing basis.

On Wed, May 15, 2013 at 5:32 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Wed, May 15, 2013 at 1:27 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
> > Rob, I was wondering something. Are you a commiter working on improving the
> > repair or something similar ?
>
> I am not a committer [1], but I have an active interest in potential
> improvements to the best practices for repair. The specific change
> that I am considering is a modification to the default
> gc_grace_seconds value, which seems picked out of a hat at 10 days. My
> view is that the current implementation of repair has such negative
> performance consequences that I do not believe that holding onto
> tombstones for longer than 10 days could possibly be as bad as the
> fixed cost of running repair once every 10 days. I believe that this
> value is too low for a default (it also does not map cleanly to the
> work week!) and likely should be increased to 14, 21 or 28 days.
>
> > Anyway, if a commiter (or any other expert) could give us some feedback on
> > our comments (Are we doing well or not, whether things we observe are normal
> > or unexplained, what is going to be improved in the future about repair...)
>
> 1) you are doing things according to best practice
> 2) unfortunately your experience with significantly degraded
> performance, including a blocked go-live due to repair bloat is pretty
> typical
> 3) the things you are experiencing are part of the current
> implementation of repair and are also typical, however I do not
> believe they are fully "explained" [2]
> 4) as has been mentioned further down thread, there are discussions
> regarding (and some already committed) improvements to both the
> current repair paradigm and an evolution to a new paradigm
>
> Thanks to all for the responses so far, please keep them coming! :D
>
> =Rob
> [1] hence the (unofficial) tag for this thread. I do have minor
> patches accepted to the codebase, but always merged by an actual
> committer. :)
> [2] driftx@#cassandra feels that these things are explained/understood
> by core team, and points to
> https://issues.apache.org/jira/browse/CASSANDRA-5280 as a useful
> approach to minimize same.
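
For concreteness, gc_grace_seconds is expressed in seconds, so the 10-day default discussed above is 864000. Below is a minimal sketch that converts the candidate windows from the thread (10, 14, 21, 28 days) and prints the matching CQL. The helper names and the table name my_keyspace.my_table are hypothetical placeholders, not anything from Cassandra itself.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def gc_grace_seconds(days: int) -> int:
    """Convert a tombstone retention window in days to seconds."""
    if days <= 0:
        raise ValueError("days must be positive")
    return days * SECONDS_PER_DAY


def alter_statement(table: str, days: int) -> str:
    """Build a CQL statement setting gc_grace_seconds on a table.

    `table` is a hypothetical, caller-supplied name such as
    'my_keyspace.my_table'.
    """
    return f"ALTER TABLE {table} WITH gc_grace_seconds = {gc_grace_seconds(days)};"


if __name__ == "__main__":
    # Candidate windows mentioned in the thread: the current default
    # (10 days) and the proposed 14, 21 and 28 days.
    for days in (10, 14, 21, 28):
        print(alter_statement("my_keyspace.my_table", days))
```

Whichever window is chosen, repair would still need to complete on every node at least once within it, which is the trade-off the thread is weighing.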