Re: Is my cluster normal?

Yuan Fang Thu, 07 Jul 2016 18:13:06 -0700

Hi Riccardo,

Very low IO-wait. About 0.3%.
No stolen CPU. It is a Cassandra-only instance. I did not see any dropped
messages.
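
For anyone who wants to check the same two numbers, here is a minimal sketch of how IO-wait and steal percentages can be derived from two samples of the aggregate "cpu" line in Linux /proc/stat. The two sample lines are made-up values chosen for illustration, not readings from this cluster, and the helper names are hypothetical.

```python
# Field order of the aggregate "cpu" line in Linux /proc/stat.
# Guest time is already counted inside "user", so the guest fields are skipped.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a '/proc/stat' aggregate cpu line into a {field: ticks} dict."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(v) for v in parts[1:1 + len(FIELDS)])))


def cpu_share(before, after):
    """Percentage of elapsed CPU time spent in each field between two samples."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    deltas = {name: second[name] - first[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * ticks / total for name, ticks in deltas.items()}


# Made-up samples for illustration only (not measured on this cluster).
before = "cpu  1000 0 500 8000 20 0 10 0 0 0"
after = "cpu  1700 0 750 8030 23 0 27 0 0 0"

shares = cpu_share(before, after)
print("iowait %.1f%%, steal %.1f%%" % (shares["iowait"], shares["steal"]))
```

On a real node the two lines would be read from /proc/stat a few seconds apart; tools such as top or iostat report the same fields.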



ubuntu@cassandra1:/mnt/data$ nodetool tpstats
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     1         1      929509244         0                 0
ViewMutationStage                 0         0              0         0                 0
ReadStage                         4         0        4021570         0                 0
RequestResponseStage              0         0      731477999         0                 0
ReadRepairStage                   0         0         165603         0                 0
CounterMutationStage              0         0              0         0                 0
MiscStage                         0         0              0         0                 0
CompactionExecutor                2        55          92022         0                 0
MemtableReclaimMemory             0         0           1736         0                 0
PendingRangeCalculator            0         0              6         0                 0
GossipStage                       0         0         345474         0                 0
SecondaryIndexManagement          0         0              0         0                 0
HintsDispatcher                   0         0              4         0                 0
MigrationStage                    0         0             35         0                 0
MemtablePostFlush                 0         0           1973         0                 0
ValidationExecutor                0         0              0         0                 0
Sampler                           0         0              0         0                 0
MemtableFlushWriter               0         0           1736         0                 0
InternalResponseStage             0         0           5311         0                 0
AntiEntropyStage                  0         0              0         0                 0
CacheCleanupExecutor              0         0              0         0                 0
Native-Transport-Requests       128       128      347508531         2          15891862

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
HINT                         0
MUTATION                     0
COUNTER_MUTATION             0
BATCH_STORE                  0
BATCH_REMOVE                 0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0
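
In case it helps when comparing nodes, here is a rough sketch that scans tpstats text for pools with pending tasks or a non-zero "All time blocked" count. It assumes each pool is printed on one line with five numeric columns; the sample rows are copied from the output above, and the function names are hypothetical. It only lists the pools and does not say what is causing the backlog.

```python
def parse_tpstats(text):
    """Return {pool: (active, pending, completed, blocked, all_time_blocked)}."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # A pool row is a name followed by exactly five integer columns;
        # the header and the dropped-message rows do not match and are skipped.
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            pools[parts[0]] = tuple(int(p) for p in parts[1:])
    return pools


def backlog(pools):
    """Pools that have pending tasks or have ever had blocked tasks."""
    return {
        name: values
        for name, values in pools.items()
        if values[1] > 0 or values[4] > 0
    }


# A few rows copied from the nodetool tpstats output above.
sample = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     1         1      929509244         0                 0
ReadStage                         4         0        4021570         0                 0
CompactionExecutor                2        55          92022         0                 0
Native-Transport-Requests       128       128      347508531         2          15891862
"""

flagged = backlog(parse_tpstats(sample))
for name, (active, pending, completed, blocked, all_blocked) in flagged.items():
    print("%s: pending=%d all_time_blocked=%d" % (name, pending, all_blocked))
```

For these rows it lists MutationStage, CompactionExecutor and Native-Transport-Requests, the last being the only pool with blocked requests.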





On Thu, Jul 7, 2016 at 5:24 PM, Riccardo Ferrari <[email protected]> wrote:

> Hi Yuan,
>
> Your machine instance has 4 vCPUs, that is, 4 threads (not cores!). Aside
> from any Cassandra-specific discussion, a system load of 10 on a 4-thread
> machine is way too much in my opinion. If that is the running average
> system load, I would look deeper into system details. Is that IO wait? Is
> that stolen CPU? Is that a Cassandra-only instance, or are there other
> processes pushing the load?
> What does your "nodetool tpstats" say? How many dropped messages do you
> have?
>
> Best,
>
> On Fri, Jul 8, 2016 at 12:34 AM, Yuan Fang <[email protected]> wrote:
>
>> Thanks Ben! From the post, it seems they got slightly better results than
>> I did, but similar ones. Good to know.
>> I am not sure whether a little fine-tuning of heap memory would help.
>>
>>
>> On Thu, Jul 7, 2016 at 2:58 PM, Ben Slater <[email protected]>
>> wrote:
>>
>>> Hi Yuan,
>>>
>>> You might find this blog post a useful comparison:
>>>
>>> https://www.instaclustr.com/blog/2016/01/07/multi-data-center-apache-spark-and-apache-cassandra-benchmark/
>>>
>>> Although the focus is on Spark and Cassandra and multi-DC there are also
>>> some single DC benchmarks of m4.xl clusters plus some discussion of how we
>>> went about benchmarking.
>>>
>>> Cheers
>>> Ben
>>>
>>>
>>> On Fri, 8 Jul 2016 at 07:52 Yuan Fang <[email protected]> wrote:
>>>
>>>> Yes, here is my stress test result:
>>>> Results:
>>>> op rate                   : 12200 [WRITE:12200]
>>>> partition rate            : 12200 [WRITE:12200]
>>>> row rate                  : 12200 [WRITE:12200]
>>>> latency mean              : 16.4 [WRITE:16.4]
>>>> latency median            : 7.1 [WRITE:7.1]
>>>> latency 95th percentile   : 38.1 [WRITE:38.1]
>>>> latency 99th percentile   : 204.3 [WRITE:204.3]
>>>> latency 99.9th percentile : 465.9 [WRITE:465.9]
>>>> latency max               : 1408.4 [WRITE:1408.4]
>>>> Total partitions          : 1000000 [WRITE:1000000]
>>>> Total errors              : 0 [WRITE:0]
>>>> total gc count            : 0
>>>> total gc mb               : 0
>>>> total gc time (s)         : 0
>>>> avg gc time(ms)           : NaN
>>>> stdev gc time(ms)         : 0
>>>> Total operation time      : 00:01:21
>>>> END
>>>>
>>>> On Thu, Jul 7, 2016 at 2:49 PM, Ryan Svihla <[email protected]> wrote:
>>>>
>>>>> Lots of variables you're leaving out.
>>>>>
>>>>> It depends on write size, whether you're using logged batches, what
>>>>> consistency level, what RF, whether the writes come in bursts, etc.
>>>>> However, that's all sort of moot for determining "normal"; really you
>>>>> need a baseline, as all those variables end up mattering a huge amount.
>>>>>
>>>>> I would suggest using Cassandra stress as a baseline and going from there
>>>>> depending on what those numbers say (just pick the defaults).
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On Jul 7, 2016, at 4:39 PM, Yuan Fang <[email protected]> wrote:
>>>>>
>>>>> Yes, it is about 8k writes per node.
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 2:18 PM, daemeon reiydelle <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Are you saying 7k writes per node, or 30k writes per node?
>>>>>>
>>>>>>
>>>>>> *.......*
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Daemeon C.M. Reiydelle*
>>>>>> *USA (+1) 415.501.0198*
>>>>>> *London (+44) (0) 20 8144 9872*
>>>>>>
>>>>>> On Thu, Jul 7, 2016 at 2:05 PM, Yuan Fang <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Writes at 30k/second are the main thing.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 7, 2016 at 1:51 PM, daemeon reiydelle <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Assuming you meant 100k, that is likely for something with 16mb of
>>>>>>>> storage (probably way too small) where the data is more than 64k and
>>>>>>>> hence will not fit into the row cache.
>>>>>>>>
>>>>>>>>
>>>>>>>> *.......*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Daemeon C.M. Reiydelle*
>>>>>>>> *USA (+1) 415.501.0198*
>>>>>>>> *London (+44) (0) 20 8144 9872*
>>>>>>>>
>>>>>>>> On Thu, Jul 7, 2016 at 1:25 PM, Yuan Fang <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I have a cluster of 4 m4.xlarge nodes (4 CPUs, 16 GB memory, and
>>>>>>>>> 600 GB SSD EBS).
>>>>>>>>> I can reach cluster-wide write requests of 30k/second and read
>>>>>>>>> requests of about 100/second. The cluster OS load is constantly
>>>>>>>>> above 10. Is that normal?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Yuan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>> --
>>> ————————
>>> Ben Slater
>>> Chief Product Officer
>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>> +61 437 929 798
>>>
>>
>>
>
