Re: Is my cluster normal?

Yuan Fang Wed, 13 Jul 2016 10:33:53 -0700

$nodetool tpstats

...
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
Native-Transport-Requests       128       128     1420623949         1         142821509
...




What is this? Is it normal?
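
One way to put the Native-Transport-Requests row in context is to compare "All time blocked" against "Completed". Below is a minimal Python sketch, assuming the row has been joined onto one whitespace-separated line. The names `PoolStats`, `parse_pool_row`, and `blocked_ratio` are made up for illustration and are not part of nodetool.

```python
from typing import NamedTuple


class PoolStats(NamedTuple):
    name: str
    active: int
    pending: int
    completed: int
    blocked: int
    all_time_blocked: int


def parse_pool_row(row: str) -> PoolStats:
    """Parse one thread-pool row of `nodetool tpstats` output (hypothetical helper)."""
    fields = row.split()
    active, pending, completed, blocked, all_time_blocked = map(int, fields[1:6])
    return PoolStats(fields[0], active, pending, completed, blocked, all_time_blocked)


def blocked_ratio(stats: PoolStats) -> float:
    """All-time blocked requests as a fraction of completed requests."""
    return stats.all_time_blocked / stats.completed if stats.completed else 0.0


# Values copied from the tpstats output above.
row = "Native-Transport-Requests 128 128 1420623949 1 142821509"
stats = parse_pool_row(row)
print(f"{stats.name}: active={stats.active}, pending={stats.pending}, "
      f"all-time blocked is {blocked_ratio(stats):.1%} of completed")
```

With these numbers, roughly one request in ten has been blocked at some point, and Active sits at 128, which as far as I know is the default `native_transport_max_threads`. That would suggest the pool is saturated rather than idle, but treat it as a hint to investigate, not a diagnosis.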

On Tue, Jul 12, 2016 at 3:03 PM, Yuan Fang <[email protected]> wrote:

> Hi Jonathan,
>
> Here is the result:
>
> ubuntu@ip-172-31-44-250:~$ iostat -dmx 2 10
> Linux 3.13.0-74-generic (ip-172-31-44-250) 07/12/2016 _x86_64_ (4 CPU)
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.01     2.13    0.74    1.55     0.01     0.02    27.77     0.00    0.74    0.89    0.66   0.43   0.10
> xvdf              0.01     0.58  237.41   52.50    12.90     6.21   135.02     2.32    8.01    3.65   27.72   0.57  16.63
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     7.50    0.00    2.50     0.00     0.04    32.00     0.00    1.60    0.00    1.60   1.60   0.40
> xvdf              0.00     0.00  353.50    0.00    24.12     0.00   139.75     0.49    1.37    1.37    0.00   0.58  20.60
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     0.00    0.00    1.00     0.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
> xvdf              0.00     2.00  463.50   35.00    30.69     2.86   137.84     0.88    1.77    1.29    8.17   0.60  30.00
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     0.00    0.00    1.00     0.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
> xvdf              0.00     0.00   99.50   36.00     8.54     4.40   195.62     1.55    3.88    1.45   10.61   1.06  14.40
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     5.00    0.00    1.50     0.00     0.03    34.67     0.00    1.33    0.00    1.33   1.33   0.20
> xvdf              0.00     1.50  703.00  195.00    48.83    23.76   165.57     6.49    8.36    1.66   32.51   0.55  49.80
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     0.00    0.00    1.00     0.00     0.04    72.00     0.00    0.00    0.00    0.00   0.00   0.00
> xvdf              0.00     2.50  149.50   69.50    10.12     6.68   157.14     0.74    3.42    1.18    8.23   0.51  11.20
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     5.00    0.00    2.50     0.00     0.03    24.00     0.00    0.00    0.00    0.00   0.00   0.00
> xvdf              0.00     0.00   61.50   22.50     5.36     2.75   197.64     0.33    3.93    1.50   10.58   0.88   7.40
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     0.00    0.00    0.50     0.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
> xvdf              0.00     0.00  375.00    0.00    24.84     0.00   135.64     0.45    1.20    1.20    0.00   0.57  21.20
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     1.00    0.00    6.00     0.00     0.03     9.33     0.00    0.00    0.00    0.00   0.00   0.00
> xvdf              0.00     0.00  542.50   23.50    35.08     2.83   137.16     0.80    1.41    1.15    7.23   0.49  28.00
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     3.50    0.50    1.50     0.00     0.02    24.00     0.00    0.00    0.00    0.00   0.00   0.00
> xvdf              0.00     1.50  272.00  153.50    16.18    18.67   167.73    14.32   33.66    1.39   90.84   0.81  34.60
>
>
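
To relate the iostat output above to the 1800 IOPS figure quoted elsewhere in this thread, one can add r/s and w/s for the data volume (xvdf) in each interval. This is a rough Python sketch under two assumptions: the first report is skipped because iostat's first report is the since-boot average, and an iostat request is treated as one EBS I/O, which is only an approximation.

```python
# (r/s, w/s) for xvdf, copied from the nine 2-second interval reports above.
XVDF_SAMPLES = [
    (353.50, 0.00),
    (463.50, 35.00),
    (99.50, 36.00),
    (703.00, 195.00),
    (149.50, 69.50),
    (61.50, 22.50),
    (375.00, 0.00),
    (542.50, 23.50),
    (272.00, 153.50),
]

# Baseline quoted in this thread for a 600G EBS volume.
BASELINE_IOPS = 1800

totals = [reads + writes for reads, writes in XVDF_SAMPLES]
peak = max(totals)
mean = sum(totals) / len(totals)

print(f"peak {peak:.0f} IOPS, mean {mean:.0f} IOPS")
print(f"peak is {peak / BASELINE_IOPS:.0%} of the {BASELINE_IOPS} IOPS baseline")
```

On these samples the volume stays well below the quoted baseline, although the last interval shows a w_await of 90.84 ms, so write latency can still spike even when the request rate is modest.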
>
> On Tue, Jul 12, 2016 at 12:34 PM, Jonathan Haddad <[email protected]>
> wrote:
>
>> When you have high system load it means your CPU is waiting for
>> *something*, and in my experience it's usually slow disk.  A disk connected
>> over network has been a culprit for me many times.
>>
>> On Tue, Jul 12, 2016 at 12:33 PM Jonathan Haddad <[email protected]>
>> wrote:
>>
>>> Can you do:
>>>
>>> iostat -dmx 2 10
>>>
>>>
>>>
>>> On Tue, Jul 12, 2016 at 11:20 AM Yuan Fang <[email protected]>
>>> wrote:
>>>
>>>> Hi Jeff,
>>>>
>>>> The read rate is low because we do not have many read operations right
>>>> now.
>>>>
>>>> The heap is only 4GB.
>>>>
>>>> MAX_HEAP_SIZE=4GB
>>>>
>>>> On Thu, Jul 7, 2016 at 7:17 PM, Jeff Jirsa <[email protected]>
>>>> wrote:
>>>>
>>>>> EBS iops scale with volume size.
>>>>>
>>>>>
>>>>>
>>>>> A 600G EBS volume only guarantees 1800 iops – if you’re exhausting
>>>>> those on writes, you’re going to suffer on reads.
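
The 1800 figure follows from simple arithmetic if the volume is a gp2 (general purpose SSD) volume, whose baseline is 3 IOPS per GiB with a floor of 100 IOPS. The thread only says "600GB ssd EBS", so the volume type is an assumption here, and the sketch deliberately leaves out the per-volume upper cap and burst credits.

```python
def gp2_baseline_iops(size_gib: int, iops_per_gib: int = 3, floor: int = 100) -> int:
    """Baseline IOPS for an assumed gp2 volume: 3 IOPS/GiB, never below 100.

    The per-volume maximum and burst behaviour are intentionally not modelled.
    """
    return max(floor, size_gib * iops_per_gib)


print(gp2_baseline_iops(600))  # 600 GiB * 3 IOPS/GiB
```

Reads, writes, commitlog syncs, and compaction all draw on that same budget, which is why exhausting it on writes hurts reads.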
>>>>>
>>>>>
>>>>>
>>>>> You have a 16G server, and probably a good chunk of that allocated to
>>>>> heap. Consequently, you have almost no page cache, so your reads are going
>>>>> to hit the disk. Your reads being very low is not uncommon if you have no
>>>>> page cache – the default settings for Cassandra (64k compression chunks)
>>>>> are really inefficient for small reads served off of disk. If you drop the
>>>>> compression chunk size (4k, for example), you’ll probably see your read
>>>>> throughput increase significantly, which will give you more iops for
>>>>> commitlog, so write throughput likely goes up, too.
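
The chunk-size argument can be made concrete with a simplified model: a read served from disk has to fetch and decompress at least one whole compression chunk, however small the row is. The sketch below assumes a 200-byte row (a made-up figure) and ignores index lookups, readahead, and any caching.

```python
def disk_bytes_per_small_read(chunk_kib: int, row_bytes: int = 200) -> int:
    """Bytes fetched from disk for one uncached read in this simplified model.

    row_bytes is an illustrative assumption; a row larger than one chunk
    touches as many whole chunks as it spans.
    """
    chunk_bytes = chunk_kib * 1024
    chunks_touched = -(-row_bytes // chunk_bytes)  # ceiling division
    return chunks_touched * chunk_bytes


default_chunk = disk_bytes_per_small_read(64)
small_chunk = disk_bytes_per_small_read(4)
print(f"64k chunks: {default_chunk} bytes per read")
print(f" 4k chunks: {small_chunk} bytes per read")
print(f"reduction : {default_chunk // small_chunk}x less data read per lookup")
```

In Cassandra 3.x the setting is, as far as I know, the `chunk_length_in_kb` key of the table's `compression` option; check the documentation for your version before changing it, since existing SSTables keep their old chunk size until they are rewritten.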
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From: *Jonathan Haddad <[email protected]>
>>>>> *Reply-To: *"[email protected]" <[email protected]>
>>>>> *Date: *Thursday, July 7, 2016 at 6:54 PM
>>>>> *To: *"[email protected]" <[email protected]>
>>>>> *Subject: *Re: Is my cluster normal?
>>>>>
>>>>>
>>>>>
>>>>> What's your CPU looking like? If it's low, check your IO with iostat
>>>>> or dstat. I know some people have used EBS and say it's fine, but I've been
>>>>> burned too many times.
>>>>>
>>>>> On Thu, Jul 7, 2016 at 6:12 PM Yuan Fang <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi Riccardo,
>>>>>
>>>>>
>>>>>
>>>>> Very low IO-wait. About 0.3%.
>>>>>
>>>>> No stolen CPU. It is a Cassandra-only instance. I did not see any
>>>>> dropped messages.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ubuntu@cassandra1:/mnt/data$ nodetool tpstats
>>>>>
>>>>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>>>>> MutationStage                     1         1      929509244         0                 0
>>>>> ViewMutationStage                 0         0              0         0                 0
>>>>> ReadStage                         4         0        4021570         0                 0
>>>>> RequestResponseStage              0         0      731477999         0                 0
>>>>> ReadRepairStage                   0         0         165603         0                 0
>>>>> CounterMutationStage              0         0              0         0                 0
>>>>> MiscStage                         0         0              0         0                 0
>>>>> CompactionExecutor                2        55          92022         0                 0
>>>>> MemtableReclaimMemory             0         0           1736         0                 0
>>>>> PendingRangeCalculator            0         0              6         0                 0
>>>>> GossipStage                       0         0         345474         0                 0
>>>>> SecondaryIndexManagement          0         0              0         0                 0
>>>>> HintsDispatcher                   0         0              4         0                 0
>>>>> MigrationStage                    0         0             35         0                 0
>>>>> MemtablePostFlush                 0         0           1973         0                 0
>>>>> ValidationExecutor                0         0              0         0                 0
>>>>> Sampler                           0         0              0         0                 0
>>>>> MemtableFlushWriter               0         0           1736         0                 0
>>>>> InternalResponseStage             0         0           5311         0                 0
>>>>> AntiEntropyStage                  0         0              0         0                 0
>>>>> CacheCleanupExecutor              0         0              0         0                 0
>>>>> Native-Transport-Requests       128       128      347508531         2          15891862
>>>>>
>>>>>
>>>>>
>>>>> Message type           Dropped
>>>>> READ                         0
>>>>> RANGE_SLICE                  0
>>>>> _TRACE                       0
>>>>> HINT                         0
>>>>> MUTATION                     0
>>>>> COUNTER_MUTATION             0
>>>>> BATCH_STORE                  0
>>>>> BATCH_REMOVE                 0
>>>>> REQUEST_RESPONSE             0
>>>>> PAGED_RANGE                  0
>>>>> READ_REPAIR                  0
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 5:24 PM, Riccardo Ferrari <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi Yuan,
>>>>>
>>>>>
>>>>>
>>>>> Your machine instance has 4 vCPUs, that is, 4 threads (not cores!!!).
>>>>> Aside from any Cassandra-specific discussion, a system load of 10 on a
>>>>> 4-thread machine is way too much in my opinion. If that is the running
>>>>> average system load I would look deeper into system details. Is that IO
>>>>> wait? Is that CPU stolen? Is that a Cassandra-only instance or are there
>>>>> other processes pushing the load?
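
The "load of 10 on 4 threads" comparison is just the load average divided by the number of logical CPUs. A small Python sketch follows; `load_per_cpu` is an illustrative helper, and the live reading at the end only works on platforms where `os.getloadavg()` is available.

```python
import os


def load_per_cpu(load_avg: float, cpus: int) -> float:
    """Load average normalised by logical CPU count; above 1.0 means tasks are queueing."""
    return load_avg / cpus


# Figures from this thread: load around 10 on a 4-vCPU instance.
print(f"reported: {load_per_cpu(10.0, 4):.2f} runnable or waiting tasks per vCPU")

# Optional live reading on the current machine (Unix-like systems only).
try:
    one_min, five_min, fifteen_min = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"this host: {load_per_cpu(one_min, cpus):.2f} per CPU (1-minute average)")
except (AttributeError, OSError):
    print("load average not available on this platform")
```

On Linux the load average also counts tasks in uninterruptible sleep, so a high value can come from I/O waits as well as from CPU contention, which is why the follow-up questions about IO wait and steal time matter.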
>>>>>
>>>>> What does your "nodetool tpstats" say? How many dropped messages do
>>>>> you have?
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 8, 2016 at 12:34 AM, Yuan Fang <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Thanks Ben! From the post, it seems they got slightly better but
>>>>> broadly similar results to mine. Good to know.
>>>>>
>>>>> I am not sure if a little fine tuning of heap memory will help or
>>>>> not.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 2:58 PM, Ben Slater <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi Yuan,
>>>>>
>>>>>
>>>>>
>>>>> You might find this blog post a useful comparison:
>>>>>
>>>>>
>>>>> https://www.instaclustr.com/blog/2016/01/07/multi-data-center-apache-spark-and-apache-cassandra-benchmark/
>>>>>
>>>>>
>>>>>
>>>>> Although the focus is on Spark and Cassandra and multi-DC there are
>>>>> also some single DC benchmarks of m4.xl
>>>>> clusters plus some discussion of how we went about benchmarking.
>>>>>
>>>>>
>>>>>
>>>>> Cheers
>>>>>
>>>>> Ben
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, 8 Jul 2016 at 07:52 Yuan Fang <[email protected]> wrote:
>>>>>
>>>>> Yes, here is my stress test result:
>>>>>
>>>>> Results:
>>>>> op rate                   : 12200 [WRITE:12200]
>>>>> partition rate            : 12200 [WRITE:12200]
>>>>> row rate                  : 12200 [WRITE:12200]
>>>>> latency mean              : 16.4 [WRITE:16.4]
>>>>> latency median            : 7.1 [WRITE:7.1]
>>>>> latency 95th percentile   : 38.1 [WRITE:38.1]
>>>>> latency 99th percentile   : 204.3 [WRITE:204.3]
>>>>> latency 99.9th percentile : 465.9 [WRITE:465.9]
>>>>> latency max               : 1408.4 [WRITE:1408.4]
>>>>> Total partitions          : 1000000 [WRITE:1000000]
>>>>> Total errors              : 0 [WRITE:0]
>>>>> total gc count            : 0
>>>>> total gc mb               : 0
>>>>> total gc time (s)         : 0
>>>>> avg gc time(ms)           : NaN
>>>>> stdev gc time(ms)         : 0
>>>>> Total operation time      : 00:01:21
>>>>> END
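
As a quick sanity check on the stress output above, the reported op rate should roughly match total partitions divided by total operation time. A minimal Python sketch using only the figures printed above:

```python
# Figures copied from the cassandra-stress summary above.
total_partitions = 1_000_000
reported_op_rate = 12_200
operation_time = "00:01:21"

hours, minutes, seconds = (int(part) for part in operation_time.split(":"))
elapsed_s = hours * 3600 + minutes * 60 + seconds

derived_rate = total_partitions / elapsed_s
difference = abs(derived_rate - reported_op_rate) / reported_op_rate

print(f"elapsed: {elapsed_s} s, derived rate: {derived_rate:.0f} ops/s")
print(f"differs from the reported op rate by {difference:.1%}")
```

The two agree to within about one percent, which is expected given that the operation time is rounded to whole seconds.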
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 2:49 PM, Ryan Svihla <[email protected]> wrote:
>>>>>
>>>>> Lots of variables you're leaving out.
>>>>>
>>>>>
>>>>>
>>>>> Depends on write size, whether you're using logged batches or not, what
>>>>> consistency level, what RF, whether the writes come in bursts, etc.
>>>>> However, that's all sort of moot for determining "normal"; really you need
>>>>> a baseline, as all those variables end up mattering a huge amount.
>>>>>
>>>>>
>>>>>
>>>>> I would suggest using cassandra-stress as a baseline and going from there
>>>>> depending on what those numbers say (just pick the defaults).
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>>
>>>>> On Jul 7, 2016, at 4:39 PM, Yuan Fang <[email protected]> wrote:
>>>>>
>>>>> yes, it is about 8k writes per node.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 2:18 PM, daemeon reiydelle <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Are you saying 7k writes per node? or 30k writes per node?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *.......Daemeon C.M. Reiydelle USA (+1) 415.501.0198 London (+44) (0) 20 8144 9872*
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 2:05 PM, Yuan Fang <[email protected]>
>>>>> wrote:
>>>>>
>>>>> writes 30k/second is the main thing.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 1:51 PM, daemeon reiydelle <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Assuming you meant 100k, that is likely for something with 16mb of
>>>>> storage (probably way small) where the data is more than 64k and hence
>>>>> will not fit into the row cache.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *.......Daemeon C.M. Reiydelle USA (+1) 415.501.0198 London (+44) (0) 20 8144 9872*
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 1:25 PM, Yuan Fang <[email protected]>
>>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> I have a cluster of 4 m4.xlarge nodes (4 CPUs, 16 GB memory, and
>>>>> 600 GB SSD EBS each).
>>>>>
>>>>> I can reach cluster-wide write requests of 30k/second and read
>>>>> requests of about 100/second. The cluster OS load is constantly above 10.
>>>>> Is that normal?
>>>>>
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>>
>>>>>
>>>>> Yuan
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> ————————
>>>>>
>>>>> Ben Slater
>>>>>
>>>>> Chief Product Officer
>>>>>
>>>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>>>>
>>>>> +61 437 929 798
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>
