$ nodetool tpstats
...
Pool Name                    Active   Pending    Completed   Blocked  All time blocked
Native-Transport-Requests       128       128   1420623949         1         142821509
...
What is this? Is it normal?
On Tue, Jul 12, 2016 at 3:03 PM, Yuan Fang <[email protected]> wrote:
> Hi Jonathan,
>
> Here is the result:
>
> ubuntu@ip-172-31-44-250:~$ iostat -dmx 2 10
> Linux 3.13.0-74-generic (ip-172-31-44-250) 07/12/2016 _x86_64_ (4 CPU)
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.01     2.13     0.74     1.55     0.01     0.02    27.77     0.00     0.74     0.89     0.66     0.43     0.10
> xvdf         0.01     0.58   237.41    52.50    12.90     6.21   135.02     2.32     8.01     3.65    27.72     0.57    16.63
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.00     7.50     0.00     2.50     0.00     0.04    32.00     0.00     1.60     0.00     1.60     1.60     0.40
> xvdf         0.00     0.00   353.50     0.00    24.12     0.00   139.75     0.49     1.37     1.37     0.00     0.58    20.60
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.00     0.00     0.00     1.00     0.00     0.00     8.00     0.00     0.00     0.00     0.00     0.00     0.00
> xvdf         0.00     2.00   463.50    35.00    30.69     2.86   137.84     0.88     1.77     1.29     8.17     0.60    30.00
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.00     0.00     0.00     1.00     0.00     0.00     8.00     0.00     0.00     0.00     0.00     0.00     0.00
> xvdf         0.00     0.00    99.50    36.00     8.54     4.40   195.62     1.55     3.88     1.45    10.61     1.06    14.40
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.00     5.00     0.00     1.50     0.00     0.03    34.67     0.00     1.33     0.00     1.33     1.33     0.20
> xvdf         0.00     1.50   703.00   195.00    48.83    23.76   165.57     6.49     8.36     1.66    32.51     0.55    49.80
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.00     0.00     0.00     1.00     0.00     0.04    72.00     0.00     0.00     0.00     0.00     0.00     0.00
> xvdf         0.00     2.50   149.50    69.50    10.12     6.68   157.14     0.74     3.42     1.18     8.23     0.51    11.20
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.00     5.00     0.00     2.50     0.00     0.03    24.00     0.00     0.00     0.00     0.00     0.00     0.00
> xvdf         0.00     0.00    61.50    22.50     5.36     2.75   197.64     0.33     3.93     1.50    10.58     0.88     7.40
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.00     0.00     0.00     0.50     0.00     0.00     8.00     0.00     0.00     0.00     0.00     0.00     0.00
> xvdf         0.00     0.00   375.00     0.00    24.84     0.00   135.64     0.45     1.20     1.20     0.00     0.57    21.20
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.00     1.00     0.00     6.00     0.00     0.03     9.33     0.00     0.00     0.00     0.00     0.00     0.00
> xvdf         0.00     0.00   542.50    23.50    35.08     2.83   137.16     0.80     1.41     1.15     7.23     0.49    28.00
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
> xvda         0.00     3.50     0.50     1.50     0.00     0.02    24.00     0.00     0.00     0.00     0.00     0.00     0.00
> xvdf         0.00     1.50   272.00   153.50    16.18    18.67   167.73    14.32    33.66     1.39    90.84     0.81    34.60
>
>
>
> On Tue, Jul 12, 2016 at 12:34 PM, Jonathan Haddad <[email protected]>
> wrote:
>
>> When you have high system load it means your CPU is waiting for
>> *something*, and in my experience it's usually slow disk. A disk connected
>> over network has been a culprit for me many times.
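One way to check the slow-disk theory against iostat output is to look at the await and queue-size columns. The sketch below is only illustrative: it assumes the 13-column `iostat -dmx` layout quoted in this thread, the two sample rows are copied from that output, and the thresholds (20 ms await, queue depth 4) are arbitrary assumptions, not recommended values.

```python
# Column order as printed by `iostat -dmx` in this thread; other sysstat
# versions may print different columns, so treat this list as an assumption.
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rMB/s", "wMB/s", "avgrq-sz",
           "avgqu-sz", "await", "r_await", "w_await", "svctm", "%util"]


def parse_row(line):
    """Split one device row into (device, {column: value})."""
    device, *values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    return device, dict(zip(COLUMNS, map(float, values)))


def looks_slow(stats, await_ms=20.0, queue_depth=4.0):
    """Flag a sample whose latency or queue exceeds the (hypothetical) thresholds."""
    return (stats["r_await"] > await_ms
            or stats["w_await"] > await_ms
            or stats["avgqu-sz"] > queue_depth)


# Two xvdf samples taken from the iostat output in this thread.
BUSY = "xvdf 0.00 1.50 272.00 153.50 16.18 18.67 167.73 14.32 33.66 1.39 90.84 0.81 34.60"
QUIET = "xvdf 0.00 0.00 375.00 0.00 24.84 0.00 135.64 0.45 1.20 1.20 0.00 0.57 21.20"

if __name__ == "__main__":
    for row in (BUSY, QUIET):
        device, stats = parse_row(row)
        print(device, "w_await:", stats["w_await"], "slow:", looks_slow(stats))
```

With these thresholds the read-only sample passes, while the sample with heavy writes is flagged because of its write latency and queue depth.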
>>
>> On Tue, Jul 12, 2016 at 12:33 PM Jonathan Haddad <[email protected]>
>> wrote:
>>
>>> Can you do:
>>>
>>> iostat -dmx 2 10
>>>
>>>
>>>
>>> On Tue, Jul 12, 2016 at 11:20 AM Yuan Fang <[email protected]>
>>> wrote:
>>>
>>>> Hi Jeff,
>>>>
>>>> The read rate is low because we do not have many read operations right
>>>> now.
>>>>
>>>> The heap is only 4GB.
>>>>
>>>> MAX_HEAP_SIZE=4GB
>>>>
>>>> On Thu, Jul 7, 2016 at 7:17 PM, Jeff Jirsa <[email protected]>
>>>> wrote:
>>>>
>>>>> EBS iops scale with volume size.
>>>>>
>>>>>
>>>>>
>>>>> A 600G EBS volume only guarantees 1800 iops – if you’re exhausting
>>>>> those on writes, you’re going to suffer on reads.
>>>>>
>>>>>
>>>>>
>>>>> You have a 16G server, and probably a good chunk of that allocated to
>>>>> heap. Consequently, you have almost no page cache, so your reads are going
>>>>> to hit the disk. Your reads being very low is not uncommon if you have no
>>>>> page cache – the default settings for Cassandra (64k compression chunks)
>>>>> are really inefficient for small reads served off of disk. If you drop the
>>>>> compression chunk size (4k, for example), you’ll probably see your read
>>>>> throughput increase significantly, which will give you more iops for
>>>>> commitlog, so write throughput likely goes up, too.
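To make the arithmetic above concrete, here is a rough sketch. It assumes the volume is gp2 (general-purpose SSD), whose documented baseline is 3 IOPS per GiB with a floor of 100, and it ignores the per-volume cap. The function names and the 200-byte row size are illustrative assumptions, not measurements from this cluster.

```python
def gp2_baseline_iops(size_gib, iops_per_gib=3, floor=100):
    """Baseline IOPS for a gp2 volume (assumed: 3 IOPS per GiB, floor of 100, cap ignored)."""
    return max(floor, size_gib * iops_per_gib)


def read_amplification(chunk_kib, row_bytes):
    """Bytes read and decompressed per byte wanted, assuming one whole
    compression chunk is fetched from disk to serve one small row."""
    return chunk_kib * 1024 / row_bytes


if __name__ == "__main__":
    # 600 GiB volume, as discussed above.
    print(gp2_baseline_iops(600))
    # Hypothetical 200-byte row with the default 64 KiB chunk versus a 4 KiB chunk.
    print(read_amplification(64, 200))
    print(read_amplification(4, 200))
```

Under these assumptions, dropping the chunk size from 64 KiB to 4 KiB cuts the data read per small row by a factor of 16, which is the effect described above.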
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From: *Jonathan Haddad <[email protected]>
>>>>> *Reply-To: *"[email protected]" <[email protected]>
>>>>> *Date: *Thursday, July 7, 2016 at 6:54 PM
>>>>> *To: *"[email protected]" <[email protected]>
>>>>> *Subject: *Re: Is my cluster normal?
>>>>>
>>>>>
>>>>>
>>>>> What's your CPU looking like? If it's low, check your IO with iostat
>>>>> or dstat. I know some people have used EBS and say it's fine, but I've been
>>>>> burned too many times.
>>>>>
>>>>> On Thu, Jul 7, 2016 at 6:12 PM Yuan Fang <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi Riccardo,
>>>>>
>>>>>
>>>>>
>>>>> Very low IO-wait. About 0.3%.
>>>>>
>>>>> No stolen CPU. It is a Cassandra-only instance. I did not see any
>>>>> dropped messages.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ubuntu@cassandra1:/mnt/data$ nodetool tpstats
>>>>>
>>>>> Pool Name                    Active   Pending    Completed   Blocked  All time blocked
>>>>> MutationStage                     1         1    929509244         0                 0
>>>>> ViewMutationStage                 0         0            0         0                 0
>>>>> ReadStage                         4         0      4021570         0                 0
>>>>> RequestResponseStage              0         0    731477999         0                 0
>>>>> ReadRepairStage                   0         0       165603         0                 0
>>>>> CounterMutationStage              0         0            0         0                 0
>>>>> MiscStage                         0         0            0         0                 0
>>>>> CompactionExecutor                2        55        92022         0                 0
>>>>> MemtableReclaimMemory             0         0         1736         0                 0
>>>>> PendingRangeCalculator            0         0            6         0                 0
>>>>> GossipStage                       0         0       345474         0                 0
>>>>> SecondaryIndexManagement          0         0            0         0                 0
>>>>> HintsDispatcher                   0         0            4         0                 0
>>>>> MigrationStage                    0         0           35         0                 0
>>>>> MemtablePostFlush                 0         0         1973         0                 0
>>>>> ValidationExecutor                0         0            0         0                 0
>>>>> Sampler                           0         0            0         0                 0
>>>>> MemtableFlushWriter               0         0         1736         0                 0
>>>>> InternalResponseStage             0         0         5311         0                 0
>>>>> AntiEntropyStage                  0         0            0         0                 0
>>>>> CacheCleanupExecutor              0         0            0         0                 0
>>>>> Native-Transport-Requests       128       128    347508531         2          15891862
>>>>>
>>>>>
>>>>>
>>>>> Message type           Dropped
>>>>> READ                         0
>>>>> RANGE_SLICE                  0
>>>>> _TRACE                       0
>>>>> HINT                         0
>>>>> MUTATION                     0
>>>>> COUNTER_MUTATION             0
>>>>> BATCH_STORE                  0
>>>>> BATCH_REMOVE                 0
>>>>> REQUEST_RESPONSE             0
>>>>> PAGED_RANGE                  0
>>>>> READ_REPAIR                  0
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 5:24 PM, Riccardo Ferrari <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi Yuan,
>>>>>
>>>>>
>>>>>
>>>>> Your machine instance has 4 vCPUs, that is, 4 threads (not cores!).
>>>>> Aside from any Cassandra-specific discussion, a system load of 10 on a
>>>>> 4-thread machine is way too much in my opinion. If that is the running
>>>>> average system load, I would look deeper into system details. Is that IO
>>>>> wait? Is that stolen CPU? Is that a Cassandra-only instance, or are there
>>>>> other processes pushing the load?
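The rule of thumb above amounts to dividing the load average by the number of logical CPUs. A minimal sketch, assuming a Unix-like host where Python's `os.getloadavg()` is available; the helper name `load_per_cpu` is made up for illustration.

```python
import os


def load_per_cpu(load_avg, cpu_count):
    """Load average divided by logical CPUs; sustained values well above 1.0
    mean tasks are queuing for CPU or, on Linux, blocked on IO."""
    return load_avg / cpu_count


if __name__ == "__main__":
    # Figures from this thread: a load of 10 on a 4-vCPU instance.
    print(load_per_cpu(10, 4))
    # Live 1-minute value for the current host (Unix only).
    one_minute, _, _ = os.getloadavg()
    print(load_per_cpu(one_minute, os.cpu_count()))
```

For the numbers reported here the ratio is 2.5, so on average more tasks are waiting than the instance has threads to run them.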
>>>>>
>>>>> What does your "nodetool tpstats" say? How many dropped messages do
>>>>> you have?
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 8, 2016 at 12:34 AM, Yuan Fang <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Thanks Ben! From the post, it seems they got slightly better but
>>>>> broadly similar results to mine. Good to know.
>>>>>
>>>>> I am not sure whether a little fine-tuning of heap memory will help or
>>>>> not.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 2:58 PM, Ben Slater <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi Yuan,
>>>>>
>>>>>
>>>>>
>>>>> You might find this blog post a useful comparison:
>>>>>
>>>>>
>>>>> https://www.instaclustr.com/blog/2016/01/07/multi-data-center-apache-spark-and-apache-cassandra-benchmark/
>>>>>
>>>>>
>>>>>
>>>>> Although the focus is on Spark and Cassandra and multi-DC, there are
>>>>> also some single-DC benchmarks of m4.xl clusters, plus some discussion
>>>>> of how we went about benchmarking.
>>>>>
>>>>>
>>>>>
>>>>> Cheers
>>>>>
>>>>> Ben
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, 8 Jul 2016 at 07:52 Yuan Fang <[email protected]> wrote:
>>>>>
>>>>> Yes, here is my stress test result:
>>>>>
>>>>> Results:
>>>>> op rate                   : 12200 [WRITE:12200]
>>>>> partition rate            : 12200 [WRITE:12200]
>>>>> row rate                  : 12200 [WRITE:12200]
>>>>> latency mean              : 16.4 [WRITE:16.4]
>>>>> latency median            : 7.1 [WRITE:7.1]
>>>>> latency 95th percentile   : 38.1 [WRITE:38.1]
>>>>> latency 99th percentile   : 204.3 [WRITE:204.3]
>>>>> latency 99.9th percentile : 465.9 [WRITE:465.9]
>>>>> latency max               : 1408.4 [WRITE:1408.4]
>>>>> Total partitions          : 1000000 [WRITE:1000000]
>>>>> Total errors              : 0 [WRITE:0]
>>>>> total gc count            : 0
>>>>> total gc mb               : 0
>>>>> total gc time (s)         : 0
>>>>> avg gc time(ms)           : NaN
>>>>> stdev gc time(ms)         : 0
>>>>> Total operation time      : 00:01:21
>>>>> END
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 2:49 PM, Ryan Svihla <[email protected]> wrote:
>>>>>
>>>>> Lots of variables you're leaving out.
>>>>>
>>>>>
>>>>>
>>>>> It depends on write size, whether you're using logged batches, what
>>>>> consistency level, what RF, whether the writes come in bursts, etc.
>>>>> However, that's all sort of moot for determining "normal": you really need
>>>>> a baseline, as all those variables end up mattering a huge amount.
>>>>>
>>>>>
>>>>>
>>>>> I would suggest using Cassandra stress as a baseline and go from there
>>>>> depending on what those numbers say (just pick the defaults).
>>>>>
>>>>>
>>>>>
>>>>> On Jul 7, 2016, at 4:39 PM, Yuan Fang <[email protected]> wrote:
>>>>>
>>>>> yes, it is about 8k writes per node.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 2:18 PM, daemeon reiydelle <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Are you saying 7k writes per node? or 30k writes per node?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Daemeon C.M. Reiydelle
>>>>> USA (+1) 415.501.0198
>>>>> London (+44) (0) 20 8144 9872
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 2:05 PM, Yuan Fang <[email protected]>
>>>>> wrote:
>>>>>
>>>>> writes 30k/second is the main thing.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 1:51 PM, daemeon reiydelle <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Assuming you meant 100k, that is likely for something with 16mb of
>>>>> storage (probably way too small) where the data is more than 64k and
>>>>> hence will not fit into the row cache.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 7, 2016 at 1:25 PM, Yuan Fang <[email protected]>
>>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> I have a cluster of 4 m4.xlarge nodes (4 CPUs, 16 GB memory, and
>>>>> 600 GB SSD EBS).
>>>>>
>>>>> I can reach cluster-wide write requests of 30k/second and read
>>>>> requests of about 100/second. The cluster OS load is constantly above 10.
>>>>> Is that normal?
>>>>>
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>>
>>>>>
>>>>> Yuan
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> ————————
>>>>>
>>>>> Ben Slater
>>>>>
>>>>> Chief Product Officer
>>>>>
>>>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>>>>
>>>>> +61 437 929 798
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>