On Tue, Nov 6, 2012 at 8:27 AM, horschi <hors...@gmail.com> wrote:

>> it is a big itch for my use case. Repair ends up streaming tens of
>> gigabytes of data which has expired TTL and has been compacted away on
>> some nodes but not yet on others. The wasted work is not nice, plus it
>> drives up the memory usage (for bloom filters, indexes, etc.) of all
>> nodes, since there are many more rows to track than planned. Disabling
>> the periodic repair lowered the per-node load by 100GB, which was all
>> dead data in my case.
>
> What is the issue with your setup? Do you use TTLs, or do you think it's
> due to DeletedColumns? Was your intention to push the idea of removing
> localDeletionTime from DeletedColumn.updateDigest?

I don't know enough about the code-level implementation to comment on the
validity of the fix. My main issue is that we use a lot of TTL columns, and
in many cases every column has a TTL that is less than gc_grace. When those
columns become gc-able and are compacted away on one node but not on all
replicas, the periodic repair process ends up copying all of the garbage
columns and rows back to the other replicas. That consumes a lot of repair
resources and makes rows stick around for much longer than they should,
which consumes even more cluster resources.
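To make the mechanism concrete, here is a rough, self-contained Java sketch.
The class and field names are hypothetical (not the actual Cassandra
internals); it only illustrates why a replica that has already purged
gc-able, TTL-expired columns disagrees at the digest level with a replica
that has not, so repair streams the dead data back:

    import java.security.MessageDigest;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Illustrative only: hypothetical names, not real Cassandra classes.
    public class RepairDigestSketch {

        // A column is "dead" once its TTL has expired; it stays on disk
        // until a compaction that covers it actually runs on that replica.
        record Column(String name, String value, boolean expired) {}

        // Hash every column a replica still holds, stand-in for the
        // per-range digest that repair compares between replicas.
        static byte[] digest(Map<String, Column> columns) throws Exception {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (Column c : columns.values()) {
                md.update(c.name().getBytes());
                md.update(c.value().getBytes());
                // The expired flag stands in for the tombstone/expiration
                // metadata that also feeds into the real digest.
                md.update((byte) (c.expired() ? 1 : 0));
            }
            return md.digest();
        }

        public static void main(String[] args) throws Exception {
            Map<String, Column> replicaA = new LinkedHashMap<>();
            Map<String, Column> replicaB = new LinkedHashMap<>();

            // Both replicas hold the same live column.
            replicaA.put("live", new Column("live", "v", false));
            replicaB.put("live", new Column("live", "v", false));

            // Replica A still carries a TTL-expired, gc-able column;
            // replica B has already compacted it away.
            replicaA.put("dead", new Column("dead", "v", true));

            boolean match = MessageDigest.isEqual(digest(replicaA), digest(replicaB));
            System.out.println("digests match: " + match);
            if (!match) {
                // In a real repair this mismatch means the dead column is
                // streamed from A back to B, resurrecting the garbage there.
                System.out.println("repair would stream the expired column back");
            }
        }
    }

Whether the dead data sits in an expired column or a DeletedColumn, the
effect is the same for my case: the replicas only re-converge by copying
garbage around instead of letting it expire everywhere.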
-Bryan