Hi, I have a 6-node 0.7.4 cluster with replication_factor=3 where "nodetool repair keyspace" behaves really strangely.
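For reference, the repair is triggered with plain nodetool on one node at a time (hostname and keyspace name below are just placeholders):

  nodetool -h <host> repair <keyspace>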
The keyspace contains three column families and about 60GB of data in total (i.e. 30GB on each node). Even though no data has been added or deleted since the last repair, a repair takes hours, and the repairing node seems to receive 100+GB worth of sstable data from its neighbouring nodes, i.e. several times the actual data size.

The log says things like:

  "Performing streaming repair of 27 ranges"

and a bunch of:

  "Compacted to <filename> 22,208,983,964 to 4,816,514,033 (~21% of original)"

In the end the repair finishes without any error after a few hours, but even then the active sstables seem to contain lots of redundant data, since the disk usage can be cut in half by triggering a major compaction.

All this leads me to believe that something stops the AES from correctly figuring out which data is already on the repairing node and which data needs to be streamed from the neighbours. The only thing I can think of right now is that one of the column families contains a lot of rows that are larger than memtable_throughput, and that is perhaps what's confusing the merkle tree.

Anyway, is this a known problem or perhaps expected behaviour? Otherwise I'll try to create a more reproducible test case.

Regards,
Jonas
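P.S. The major compaction mentioned above is also triggered with plain nodetool (again, hostname and keyspace name are placeholders):

  nodetool -h <host> compact <keyspace>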