Agreed, I’d rather discuss the details on JIRA. It might be nice to send another email describing whatever conclusion we come to, after we have everything hashed out.

> On Aug 24, 2016, at 4:09 PM, Paulo Motta <pauloricard...@gmail.com> wrote:
>
> Thanks for sharing this! I added some comments/suggestions on the ticket
> for those interested.
>
> On a side note, it's still not clear if we should do the discussion here on
> the dev-list or just call attention for a particular issue/ticket and then
> continue discussion on JIRA, but I find the latter more appropriate to
> avoid spamming those not interested, and only update here if there are new
> developments in the ticket direction.
>
> 2016-08-24 18:35 GMT-03:00 Blake Eggleston <beggles...@apple.com>:
>
>> Hi everyone,
>>
>> I just posted a proposed solution to some issues with incremental repair
>> in CASSANDRA-9143. The solution involves non-trivial changes to the way
>> incremental repair works, so I’m giving it a shout out on the dev list in
>> the spirit of increasing the flow of information here.
>>
>> Summary of problem:
>>
>> Anticompaction excludes sstables that have been, or are, compacting.
>> Anticompactions can also fail on a single machine due to any number of
>> reasons. In either of these scenarios, a potentially large amount of data
>> will be marked as unrepaired on one machine that’s marked as repaired on
>> the others. During the next incremental repair, this potentially large
>> amount of data will be unnecessarily streamed out to the other nodes,
>> because it won’t be in their unrepaired data.
>>
>> Proposed solution:
>>
>> Add a ‘pending repair’ bucket to the existing repaired and unrepaired
>> sstable buckets. We do the anticompaction up front, but put the
>> anticompacted data into the pending bucket. From here, the repair proceeds
>> normally against the pending sstables, with the streamed sstables also
>> going into the pending buckets. Once all nodes have completed streaming,
>> the pending sstables are moved into the repaired bucket, or back into
>> unrepaired if there’s a failure.
>>
>> - Blake
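
For anyone who wants a concrete picture of the bucket transitions described in the quoted proposal, here is a minimal Python sketch. It is a toy model only, not Cassandra code: the names `Bucket`, `Node`, `anticompact`, `receive_stream`, and `finish_repair` are all hypothetical, and the actual design is whatever gets settled on CASSANDRA-9143.

```python
from enum import Enum


class Bucket(Enum):
    UNREPAIRED = "unrepaired"
    PENDING = "pending repair"
    REPAIRED = "repaired"


class Node:
    """Toy model of one node's sstables (hypothetical, not Cassandra code)."""

    def __init__(self, name, sstables):
        self.name = name
        # Every sstable starts out in the unrepaired bucket.
        self.buckets = {s: Bucket.UNREPAIRED for s in sstables}

    def anticompact(self, sstables_in_range):
        # Anticompaction happens up front; its output goes to pending.
        for s in sstables_in_range:
            self.buckets[s] = Bucket.PENDING

    def receive_stream(self, sstable):
        # Sstables streamed in during the repair also land in pending.
        self.buckets[sstable] = Bucket.PENDING

    def finish(self, success):
        # Pending sstables are promoted on success, demoted on failure.
        target = Bucket.REPAIRED if success else Bucket.UNREPAIRED
        for s, bucket in list(self.buckets.items()):
            if bucket is Bucket.PENDING:
                self.buckets[s] = target


def finish_repair(nodes, all_streaming_succeeded):
    # All nodes get the same outcome, so they cannot end up disagreeing
    # about which data is repaired.
    for node in nodes:
        node.finish(all_streaming_succeeded)


a = Node("a", ["a1", "a2"])
b = Node("b", ["b1"])
a.anticompact(["a1"])
b.anticompact(["b1"])
a.receive_stream("b1-part")  # hypothetical sstable streamed from node b
finish_repair([a, b], all_streaming_succeeded=True)
print(a.buckets)
```

The point of the sketch is that data only leaves the pending bucket after every node has finished streaming, and it moves the same way on every node.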