Hi, I have a 6-node 0.7.4 cluster with replication_factor=3 where "nodetool repair keyspace" behaves really strangely.
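For reference, the repair is triggered with plain nodetool on one node at a time (hostname and keyspace name below are just placeholders):

  nodetool -h <host> repair <keyspace>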
The keyspace contains three column families and about 60GB of data in total (i.e. 30GB on each node). Even though no data has been added or deleted since the last repair, a repair takes hours, and the repairing node seems to receive 100+GB worth of sstable data from its neighbouring nodes, i.e. several times the actual data size.

The log says things like:

  "Performing streaming repair of 27 ranges"

and a bunch of:

  "Compacted to <filename> 22,208,983,964 to 4,816,514,033 (~21% of original)"

In the end the repair finishes without any error after a few hours, but even then the active sstables seem to contain lots of redundant data, since the disk usage can be cut in half by triggering a major compaction.

All this leads me to believe that something stops the AES from correctly figuring out which data is already on the repairing node and which data needs to be streamed from the neighbours. The only thing I can think of right now is that one of the column families contains a lot of rows that are larger than memtable_throughput, and that is perhaps what's confusing the merkle tree.

Anyway, is this a known problem or perhaps expected behaviour? Otherwise I'll try to create a more reproducible test case.

Regards,
Jonas
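P.S. The major compaction mentioned above is also triggered with plain nodetool (again, hostname and keyspace name are placeholders):

  nodetool -h <host> compact <keyspace>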