On Tue, Nov 6, 2012 at 8:27 AM, horschi <hors...@gmail.com> wrote:

>
>
>> it is a big itch for my use case.  Repair ends up streaming tens of
>> gigabytes of data whose TTL has expired and which has been compacted away
>> on some nodes but not yet on others.  The wasted work is bad enough, and it
>> also drives up the memory usage (for bloom filters, indexes, etc.) on all
>> nodes, since there are many more rows to track than planned.  Disabling the
>> periodic repair lowered the per-node load by 100GB, which was all dead data
>> in my case.
>
>
> What is the issue with your setup? Do you use TTLs, or do you think it's due
> to DeletedColumns?  Was your intention to push the idea of removing
> localDeletionTime from DeletedColumn.updateDigest?
>
>
>
I don't know enough about the code-level implementation to comment on the
validity of the fix.  My main issue is that we use a lot of TTL columns, and
in many cases every column has a TTL that is shorter than gc_grace.  The
problem arises when gc-able columns have been compacted away on one node but
not yet on all replicas: the periodic repair process then copies all the
garbage columns & rows back to every other replica.  That consumes a lot of
repair resources and keeps rows around much longer than they should be,
which in turn consumes even more cluster resources.
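To make the shape of it concrete, here is roughly what such a table looks
like (table and column names are made up, and the numbers are only for
illustration):

    -- tombstones/expired data kept for 10 days before they can be purged
    CREATE TABLE samples (
        id    text,
        ts    timestamp,
        value double,
        PRIMARY KEY (id, ts)
    ) WITH gc_grace_seconds = 864000;

    -- but every write expires after one hour
    INSERT INTO samples (id, ts, value)
        VALUES ('host1', '2012-11-06 08:00:00', 0.42)
        USING TTL 3600;

With a one-hour TTL against a ten-day gc_grace, whichever replica compacts
first drops the expired columns while the others still carry them, and any
repair run inside that window sees the difference and streams the dead data
right back.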

-Bryan
