While I'm surprised you don't have any dropped messages, I have to point the finger at the following tpstats line:

    Native-Transport-Requests 128 128 347508531 2 15891862

where the first '128' is the active requests and the second '128' is the pending ones. It might not be strictly related; however, this might be of interest: https://issues.apache.org/jira/browse/CASSANDRA-11363

There's a chance that tuning the 'native_transport_*' related options can mitigate or solve the issue.
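For example (a sketch only; I'm assuming Cassandra 3.x here, so check the option names and defaults against your version's cassandra.yaml, and treat the 256 as an illustrative value rather than a recommendation):

    # cassandra.yaml
    # The Native-Transport-Requests pool is capped by this option; its
    # default of 128 matches the 128 active / 128 pending you're seeing.
    native_transport_max_threads: 256

    # Maximum CQL frame size; requests larger than this are rejected.
    # native_transport_max_frame_size_in_mb: 256

Keep in mind that raising the thread cap only helps if the node has CPU headroom; on a 4-vcpu instance it may simply move the bottleneck.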
Best,

On Tue, Jul 12, 2016 at 9:34 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> When you have high system load it means your CPU is waiting for
> *something*, and in my experience it's usually slow disk. A disk connected
> over the network has been a culprit for me many times.
>
> On Tue, Jul 12, 2016 at 12:33 PM Jonathan Haddad <j...@jonhaddad.com> wrote:
>
>> Can you do:
>>
>> iostat -dmx 2 10
>>
>> On Tue, Jul 12, 2016 at 11:20 AM Yuan Fang <y...@kryptoncloud.com> wrote:
>>
>>> Hi Jeff,
>>>
>>> The read being low is because we do not have many read operations right
>>> now.
>>>
>>> The heap is only 4GB:
>>>
>>> MAX_HEAP_SIZE=4GB
>>>
>>> On Thu, Jul 7, 2016 at 7:17 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>>
>>>> EBS iops scale with volume size.
>>>>
>>>> A 600G EBS volume only guarantees 1800 iops – if you're exhausting
>>>> those on writes, you're going to suffer on reads.
>>>>
>>>> You have a 16G server, and probably a good chunk of that allocated to
>>>> heap. Consequently, you have almost no page cache, so your reads are
>>>> going to hit the disk. Your reads being very low is not uncommon if you
>>>> have no page cache – the default settings for Cassandra (64k compression
>>>> chunks) are really inefficient for small reads served off of disk. If
>>>> you drop the compression chunk size (4k, for example), you'll probably
>>>> see your read throughput increase significantly, which will give you
>>>> more iops for the commitlog, so write throughput likely goes up, too.
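For reference, the chunk-size change described above would look something like this (a sketch assuming Cassandra 3.x CQL syntax; the keyspace and table names are placeholders):

    -- 4k compression chunks for small reads served off disk
    ALTER TABLE my_keyspace.my_table
      WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': '4'};

    -- existing SSTables keep the old chunk size until rewritten, e.g. via:
    -- nodetool upgradesstables -a my_keyspace my_table

The upgradesstables pass rewrites data and costs IO up front, so it is best run off-peak.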
>>>> *From: *Jonathan Haddad <j...@jonhaddad.com>
>>>> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> *Date: *Thursday, July 7, 2016 at 6:54 PM
>>>> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> *Subject: *Re: Is my cluster normal?
>>>>
>>>> What's your CPU looking like? If it's low, check your IO with iostat
>>>> or dstat. I know some people have used EBS and say it's fine, but I've
>>>> been burned too many times.
>>>>
>>>> On Thu, Jul 7, 2016 at 6:12 PM Yuan Fang <y...@kryptoncloud.com> wrote:
>>>>
>>>> Hi Riccardo,
>>>>
>>>> Very low IO-wait. About 0.3%. No stolen CPU. It is a Cassandra-only
>>>> instance. I did not see any dropped messages.
>>>>
>>>> ubuntu@cassandra1:/mnt/data$ nodetool tpstats
>>>>
>>>> Pool Name                    Active  Pending  Completed  Blocked  All time blocked
>>>> MutationStage                     1        1  929509244        0                 0
>>>> ViewMutationStage                 0        0          0        0                 0
>>>> ReadStage                         4        0    4021570        0                 0
>>>> RequestResponseStage              0        0  731477999        0                 0
>>>> ReadRepairStage                   0        0     165603        0                 0
>>>> CounterMutationStage              0        0          0        0                 0
>>>> MiscStage                         0        0          0        0                 0
>>>> CompactionExecutor                2       55      92022        0                 0
>>>> MemtableReclaimMemory             0        0       1736        0                 0
>>>> PendingRangeCalculator            0        0          6        0                 0
>>>> GossipStage                       0        0     345474        0                 0
>>>> SecondaryIndexManagement          0        0          0        0                 0
>>>> HintsDispatcher                   0        0          4        0                 0
>>>> MigrationStage                    0        0         35        0                 0
>>>> MemtablePostFlush                 0        0       1973        0                 0
>>>> ValidationExecutor                0        0          0        0                 0
>>>> Sampler                           0        0          0        0                 0
>>>> MemtableFlushWriter               0        0       1736        0                 0
>>>> InternalResponseStage             0        0       5311        0                 0
>>>> AntiEntropyStage                  0        0          0        0                 0
>>>> CacheCleanupExecutor              0        0          0        0                 0
>>>> Native-Transport-Requests       128      128  347508531        2          15891862
>>>>
>>>> Message type       Dropped
>>>> READ                     0
>>>> RANGE_SLICE              0
>>>> _TRACE                   0
>>>> HINT                     0
>>>> MUTATION                 0
>>>> COUNTER_MUTATION         0
>>>> BATCH_STORE              0
>>>> BATCH_REMOVE             0
>>>> REQUEST_RESPONSE         0
>>>> PAGED_RANGE              0
>>>> READ_REPAIR              0
>>>>
>>>> On Thu, Jul 7, 2016 at 5:24 PM, Riccardo Ferrari <ferra...@gmail.com> wrote:
>>>>
>>>> Hi Yuan,
>>>>
>>>> Your machine instance is 4 vcpus, that is 4 threads (not cores!!!);
>>>> aside from any Cassandra-specific discussion, a system load of 10 on a
>>>> 4-thread machine is way too much in my opinion. If that is the running
>>>> average system load, I would look deeper into system details. Is that
>>>> IO wait? Is that CPU stolen? Is that a Cassandra-only instance or are
>>>> there other processes pushing the load?
>>>>
>>>> What does your "nodetool tpstats" say? How many dropped messages do
>>>> you have?
>>>>
>>>> Best,
>>>>
>>>> On Fri, Jul 8, 2016 at 12:34 AM, Yuan Fang <y...@kryptoncloud.com> wrote:
>>>>
>>>> Thanks Ben! From the post, it seems they got a slightly better but
>>>> similar result to what I did. Good to know.
>>>>
>>>> I am not sure if a little fine-tuning of heap memory will help or not.
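If you do try tuning the heap, the knob already mentioned (MAX_HEAP_SIZE) lives in conf/cassandra-env.sh; a sketch only, where the 8G figure is an assumption meant to leave room for page cache on a 16GB box, not a tested recommendation:

    # conf/cassandra-env.sh
    # Set both together; HEAP_NEWSIZE is commonly sized at ~100MB per core.
    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="400M"

    # Then compare GC behaviour before and after the change:
    nodetool gcstats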
>>>> On Thu, Jul 7, 2016 at 2:58 PM, Ben Slater <ben.sla...@instaclustr.com> wrote:
>>>>
>>>> Hi Yuan,
>>>>
>>>> You might find this blog post a useful comparison:
>>>> https://www.instaclustr.com/blog/2016/01/07/multi-data-center-apache-spark-and-apache-cassandra-benchmark/
>>>>
>>>> Although the focus is on Spark and Cassandra and multi-DC, there are
>>>> also some single-DC benchmarks of m4.xl clusters, plus some discussion
>>>> of how we went about benchmarking.
>>>>
>>>> Cheers
>>>> Ben
>>>>
>>>> On Fri, 8 Jul 2016 at 07:52 Yuan Fang <y...@kryptoncloud.com> wrote:
>>>>
>>>> Yes, here is my stress test result:
>>>>
>>>> Results:
>>>> op rate                   : 12200 [WRITE:12200]
>>>> partition rate            : 12200 [WRITE:12200]
>>>> row rate                  : 12200 [WRITE:12200]
>>>> latency mean              : 16.4 [WRITE:16.4]
>>>> latency median            : 7.1 [WRITE:7.1]
>>>> latency 95th percentile   : 38.1 [WRITE:38.1]
>>>> latency 99th percentile   : 204.3 [WRITE:204.3]
>>>> latency 99.9th percentile : 465.9 [WRITE:465.9]
>>>> latency max               : 1408.4 [WRITE:1408.4]
>>>> Total partitions          : 1000000 [WRITE:1000000]
>>>> Total errors              : 0 [WRITE:0]
>>>> total gc count            : 0
>>>> total gc mb               : 0
>>>> total gc time (s)         : 0
>>>> avg gc time(ms)           : NaN
>>>> stdev gc time(ms)         : 0
>>>> Total operation time      : 00:01:21
>>>> END
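Numbers in this shape come from the bundled stress tool; a sketch of the kind of run that would produce them (the node addresses are placeholders, and defaults are assumed, as Ryan suggests below):

    # 1M writes with the default profile, against two contact points
    cassandra-stress write n=1000000 -node 10.0.0.1,10.0.0.2

    # a read pass afterwards gives the other half of the baseline:
    # cassandra-stress read n=1000000 -node 10.0.0.1,10.0.0.2

Note that the stress defaults use small rows and a relaxed consistency level, so treat the resulting numbers as a ceiling rather than a target.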
>>>> On Thu, Jul 7, 2016 at 2:49 PM, Ryan Svihla <r...@foundev.pro> wrote:
>>>>
>>>> Lots of variables you're leaving out.
>>>>
>>>> It depends on write size, whether you're using logged batches or not,
>>>> what consistency level, what RF, whether the writes come in bursts,
>>>> etc., etc. However, that's all sort of moot for determining "normal";
>>>> really you need a baseline, as all those variables end up mattering a
>>>> huge amount.
>>>>
>>>> I would suggest using cassandra-stress as a baseline and going from
>>>> there depending on what those numbers say (just pick the defaults).
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Jul 7, 2016, at 4:39 PM, Yuan Fang <y...@kryptoncloud.com> wrote:
>>>>
>>>> Yes, it is about 8k writes per node.
>>>>
>>>> On Thu, Jul 7, 2016 at 2:18 PM, daemeon reiydelle <daeme...@gmail.com> wrote:
>>>>
>>>> Are you saying 7k writes per node, or 30k writes per node?
>>>>
>>>> .......
>>>> Daemeon C.M. Reiydelle
>>>> USA (+1) 415.501.0198
>>>> London (+44) (0) 20 8144 9872
>>>>
>>>> On Thu, Jul 7, 2016 at 2:05 PM, Yuan Fang <y...@kryptoncloud.com> wrote:
>>>>
>>>> Writes of 30k/second is the main thing.
>>>>
>>>> On Thu, Jul 7, 2016 at 1:51 PM, daemeon reiydelle <daeme...@gmail.com> wrote:
>>>>
>>>> Assuming you meant 100k, that is likely for something with 16mb of
>>>> storage (probably way small) where the data is more than 64k and hence
>>>> will not fit into the row cache.
>>>>
>>>> .......
>>>> Daemeon C.M. Reiydelle
>>>> USA (+1) 415.501.0198
>>>> London (+44) (0) 20 8144 9872
>>>>
>>>> On Thu, Jul 7, 2016 at 1:25 PM, Yuan Fang <y...@kryptoncloud.com> wrote:
>>>>
>>>> I have a cluster of 4 m4.xlarge nodes (4 CPUs, 16 GB memory, and 600GB
>>>> SSD EBS).
>>>>
>>>> I can reach cluster-wide write requests of 30k/second and read requests
>>>> of about 100/second. The cluster OS load is constantly above 10. Is
>>>> that normal?
>>>>
>>>> Thanks!
>>>>
>>>> Best,
>>>>
>>>> Yuan
>>>>
>>>> --
>>>> ————————
>>>> Ben Slater
>>>> Chief Product Officer
>>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>>> +61 437 929 798