Sounds like https://issues.apache.org/jira/browse/CASSANDRA-2324

On Mon, Apr 4, 2011 at 7:46 AM, aaron morton <aa...@thelastpickle.com> wrote:
> Jonas, AFAIK if the repair completed successfully there should be no streaming
> the next time round. This sounds odd; please look into it if you can.
>
> Can you run at DEBUG logging? There will be some messages about receiving
> streams from files and which ranges are being requested.
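>
> Something along these lines in conf/log4j-server.properties should do it (the
> exact package names may vary a little between versions):
>
>   log4j.logger.org.apache.cassandra.service.AntiEntropyService=DEBUG
>   log4j.logger.org.apache.cassandra.streaming=DEBUG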
>
> I would be interested to know if the repair is completing successfully. You
> should see messages such as "Repair session blah completed successfully" if
> it is. It is possible for a repair to hang if one of the neighbours goes away
> or fails to send the data. In this case the repair session will time out
> after 48 hours.
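>
> Something like the following should show whether the sessions are finishing
> (adjust the path to wherever your log4j config writes; by default it should
> be /var/log/cassandra/system.log):
>
>   grep -i "repair session" /var/log/cassandra/system.log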
>
> Aaron
>
> On 4 Apr 2011, at 20:39, Roland Gude wrote:
>
>> I am experiencing the same behavior but had it on previous versions of 0.7 
>> as well.
>>
>>
>> -----Original Message-----
>> From: Jonas Borgström [mailto:jonas.borgst...@trioptima.com]
>> Sent: Monday, 4 April 2011 12:26
>> To: user@cassandra.apache.org
>> Subject: Strange nodetool repair behaviour
>>
>> Hi,
>>
>> I have a 6 node 0.7.4 cluster with replication_factor=3 where "nodetool
>> repair keyspace" behaves really strangely.
>>
>> The keyspace contains three column families and about 60 GB of data in
>> total (i.e. 30 GB on each node).
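>>
>> For reference, I am running it on one node at a time along the lines of
>> (host and keyspace names are placeholders):
>>
>>   nodetool -h <node-address> repair <keyspace>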
>>
>> Even though no data has been added or deleted since the last repair, a
>> repair takes hours and the repairing node seems to receive 100+ GB worth
>> of sstable data from its neighbouring nodes, i.e. several times the
>> actual data size.
>>
>> The log says things like:
>>
>> "Performing streaming repair of 27 ranges"
>>
>> And a bunch of:
>>
>> "Compacted to <filename> 22,208,983,964 to 4,816,514,033 (~21% of original)"
>>
>> In the end the repair finishes without any error after a few hours, but
>> even then the active sstables seem to contain lots of redundant data,
>> since the disk usage can be cut in half by triggering a major compaction.
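>>
>> (I trigger that with something along the lines of
>> "nodetool -h <node-address> compact <keyspace>", host and keyspace again
>> being placeholders.)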
>>
>> All this leads me to believe that something stops the AES from correctly
>> figuring out what data is already on the repairing node and what needs
>> to be streamed from the neighbours.
>>
>> The only thing I can think of right now is that one of the column
>> families contains a lot of rows that are larger than memtable_throughput,
>> and perhaps that's what's confusing the merkle tree.
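>>
>> To illustrate what I mean about granularity, here is a toy sketch in Python
>> (nothing to do with Cassandra's actual implementation, just the general idea
>> of comparing per-range hashes and streaming every range that differs):
>>
>>   import hashlib
>>
>>   NUM_RANGES = 16
>>
>>   def range_hashes(rows):
>>       """Bucket rows into token ranges and hash each bucket (toy merkle leaves)."""
>>       buckets = [hashlib.md5() for _ in range(NUM_RANGES)]
>>       for key in sorted(rows):
>>           token = int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_RANGES
>>           buckets[token].update(key.encode())
>>           buckets[token].update(rows[key])
>>       return [b.hexdigest() for b in buckets]
>>
>>   def ranges_to_stream(local, remote):
>>       """Any range whose hash differs is streamed in full, matching rows included."""
>>       lh, rh = range_hashes(local), range_hashes(remote)
>>       return [i for i in range(NUM_RANGES) if lh[i] != rh[i]]
>>
>>   # Two replicas that agree on everything except a single row:
>>   replica_a = {"row%04d" % i: b"payload" for i in range(1000)}
>>   replica_b = dict(replica_a)
>>   replica_b["row0042"] = b"a large row that differs on one replica only"
>>
>>   diff = ranges_to_stream(replica_a, replica_b)
>>   dragged = sum(1 for k in replica_a
>>                 if int(hashlib.md5(k.encode()).hexdigest(), 16) % NUM_RANGES in diff
>>                 and replica_a[k] == replica_b[k])
>>   print("ranges to stream:", diff)                  # only one range differs...
>>   print("identical rows streamed anyway:", dragged) # ...yet ~1/16 of the rows go with it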
>>
>> Anyway, is this a known problem or perhaps expected behaviour?
>> Otherwise I'll try to create a more reproducible test case.
>>
>> Regards,
>> Jonas
>>
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
