Any news on this?
We also have issues during repairs when using many LCS tables. We end
up with 8k sstables, many pending tasks, and dropped mutations.
We are using Cassandra 2.0.10 on a 24-core server, with
multithreaded compaction enabled.
~$ nodetool getstreamthroughput
Current stream throughput: 200 MB/s
~$ nodetool getcompactionthroughput
Current compaction throughput: 16 MB/s
Most sstables are tiny (4K, 8K or 12K):
~$ ls -sh /var/lib/cassandra/data/xxxx/xxx/*-Data.db | grep -Ev 'M' | wc -l
7405
~$ ls -sh /var/lib/cassandra/data/xxxx/xxx/*-Data.db | wc -l
7440
~$ ls -sh /var/lib/cassandra/data/xxxx/xxx/*-Data.db | grep -Ev 'M' | cut -f1 -d" " | sort | uniq -c
36
7003 4.0K
396 8.0K
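
As a side note, the `grep -Ev 'M'` filter above depends on the human-readable size column and would also drop any path containing a capital M. A minimal sketch of a size-based count follows, assuming a `find` that supports `-maxdepth` and `-size` with the `c` (bytes) suffix, as GNU and BSD find do. The function names and the 1 MiB default are illustrative choices, not anything taken from Cassandra:

```shell
# Count *-Data.db files directly under a table directory that are smaller
# than a byte limit (default 1 MiB). Names and default are hypothetical.
count_small_sstables() {
    # $1 = table data directory, $2 = size limit in bytes (optional)
    find "$1" -maxdepth 1 -type f -name '*-Data.db' \
        -size "-${2:-1048576}c" | wc -l | tr -d ' '
}

# Count all *-Data.db files directly under the same directory.
count_sstables() {
    find "$1" -maxdepth 1 -type f -name '*-Data.db' | wc -l | tr -d ' '
}

# Usage (path elided as in the listing above):
#   count_small_sstables /var/lib/cassandra/data/xxxx/xxx
#   count_sstables /var/lib/cassandra/data/xxxx/xxx
```

Comparing the two numbers gives the same small-versus-total picture as the `ls` pipelines, without parsing `ls` output.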
Pool Name                   Active   Pending      Completed   Blocked  All time blocked
ReadStage                        0         0      258098148         0                 0
RequestResponseStage             0         0      613994884         0                 0
MutationStage                    0         0      332242206         0                 0
ReadRepairStage                  0         0        3360040         0                 0
ReplicateOnWriteStage            0         0              0         0                 0
GossipStage                      0         0        2471033         0                 0
CacheCleanupExecutor             0         0              0         0                 0
MigrationStage                   0         0              0         0                 0
MemoryMeter                      0         0          25160         0                 0
FlushWriter                      1         1         134083         0               521
ValidationExecutor               1         1          89514         0                 0
InternalResponseStage            0         0              0         0                 0
AntiEntropyStage                 0         0         636471         0                 0
MemtablePostFlusher              1         1         334667         0                 0
MiscStage                        0         0              0         0                 0
PendingRangeCalculator           0         0            181         0                 0
commitlog_archiver               0         0              0         0                 0
CompactionExecutor              24        24        5241768         0                 0
AntiEntropySessions              0         0          15184         0                 0
HintedHandoff                    0         0            278         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                267
PAGED_RANGE                  0
BINARY                       0
READ                         0
MUTATION                150970
_TRACE                       0
REQUEST_RESPONSE             0
COUNTER_MUTATION             0
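
To spot the busy pools in output like this without reading every row, a small filter can help. This is only a sketch: `busy_pools` is a hypothetical helper name, and it assumes the six-column pool layout shown above (name, Active, Pending, Completed, Blocked, All time blocked), which may differ in other Cassandra versions:

```shell
# Read `nodetool tpstats` output on stdin and print pools that have
# pending tasks or a non-zero "All time blocked" count.
# Assumes six whitespace-separated fields per pool row, as in the output above.
busy_pools() {
    awk 'NF == 6 &&
         $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $6 ~ /^[0-9]+$/ &&
         ($3 > 0 || $6 > 0) {
             print $1, "pending=" $3, "all_time_blocked=" $6
         }'
}

# Usage:
#   nodetool tpstats | busy_pools
```

The header and the dropped-message rows are skipped because they do not have exactly six fields.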
2016-02-12 20:08 GMT+01:00 Michał Łowicki:
> I had to decrease streaming throughput to 10 (from the default 200) in order to
> avoid the effect of a rising number of SSTables and compaction tasks
> while running repair. It's working very slowly, but it's stable and doesn't
> hurt the whole cluster. I will try to adjust the configuration gradually to see
> if I can make it any better. Thanks!
>
> On Thu, Feb 11, 2016 at 8:10 PM, Michał Łowicki wrote:
>>
>>
>>
>> On Thu, Feb 11, 2016 at 5:38 PM, Alain RODRIGUEZ wrote:
>>>
>>> Also, are you using incremental repairs (not sure about the available
>>> options in Spotify Reaper)? What command did you run?
>>>
>>
>> No.
>>
>>>
>>> 2016-02-11 17:33 GMT+01:00 Alain RODRIGUEZ:
>>>>>
>>>>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses
>>>>
>>>>
>>>>
>>>> What is your current compaction throughput ? The current value of
>>>> 'concurrent_compactors' (cassandra.yaml or through JMX) ?
>>
>>
>>
>> Throughput was initially set to 1024 and I've gradually increased it to
>> 2048, 4K and 16K but haven't seen any changes. Tried to change it both from
>> `nodetool` and also cassandra.yaml (with restart after changes).
>>
>>>>
>>>>
>>>> nodetool getcompactionthroughput
>>>>
>>>>> How to speed up compaction? Increased compaction throughput and
>>>>> concurrent compactors but no change. It seems there are plenty of idle
>>>>> resources, but I can't force C* to use them.
>>>>
>>>>
>>>> You might want to try un-throttling the compaction throughput with:
>>>>
>>>> nodetool setcompactionthroughput 0
>>>>
>>>> Choose a canary node. Monitor pending compactions and disk throughput
>>>> (make sure the server is OK too: CPU, etc.)
>>
>>
>>
>> Yes, I'll try it out but if increasing it 16 times didn't help I'm a bit
>> sceptical about it.
>>
>>>>
>>>>
>>>> Some other information could be useful:
>>>>
>>>> What is your number of cores per machine, and what are the compaction
>>>> strategies for the 'most compacting' tables? What are the write/update
>>>> patterns, any TTLs or tombstones? Do you use a high number of vnodes?
>>
>>
>> I'm using bare-metal boxes: 40 CPUs, 64GB, 2 SSDs each. num_tokens is set to
>> 256.
>>
>> Using LCS for all tables. Write / update heavy. No warnings about a large
>> number of tombstones, but we're removing items frequently.
>>
>>
>>>>
>>>>
>>>> Also, what is your repair routine and your value for gc_grace_seconds?
>>>> When was your last repair, and do you think your cluster is suffering from
>>>> high entropy?
>>
>>
>> We've been having problems with repair for months (CASSANDRA-9935).
>> gc_grace_seconds is set to 345600 now. Yes, as we haven't run it
>> successfully for a long time, I guess the cluster is suffering from high entropy.
>>
>>>>
>>>>
>>>> You can lower the stream throughput to make sure nodes can cope with
>>>> what repairs are feeding them.
>>>>
>>>> nodetool getstreamthroughput
>>>> nodetool setstreamthroughput X
>>
>>
>> Yes, this sounds interesting. As we've been having problems with repair for
>> months, it could be that lots of data is transferred between nodes.
>>
>> Thanks!
>>
>>>>
>>>>
>>>> C*heers,
>>>>
>>>> -----------------
>>>> Alain Rodriguez
>>>> France
>>>>
>>>> The Last Pickle
>>>> http://www.thelastpickle.com
>>>>
>>>> 2016-02-11 16:55 GMT+01:00 Michał Łowicki:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Using 2.1.12 across 3 DCs. Each DC has 8 nodes. Trying to run repair
>>>>> using Cassandra Reaper, but after a couple of hours the nodes are full of
>>>>> pending compaction tasks (regular ones, not the validation ones).
>>>>>
>>>>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses.
>>>>>
>>>>> How to speed up compaction? Increased compaction throughput and
>>>>> concurrent compactors but no change. It seems there are plenty of idle
>>>>> resources, but I can't force C* to use them.
>>>>>
>>>>> Any clue where there might be a bottleneck?
>>>>>
>>>>>
>>>>> --
>>>>> BR,
>>>>> Michał Łowicki
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> BR,
>> Michał Łowicki
>
>
>
>
> --
> BR,
> Michał Łowicki
--
Close the World, Open the Net
http://www.linux-wizard.net