Re: Is my cluster normal?

Riccardo Ferrari Tue, 12 Jul 2016 15:02:21 -0700

While I'm surprised you don't have any dropped messages, I have to point the
finger at the following tpstats line:


Native-Transport-Requests       128       128      347508531         2          15891862

where the first '128' is the number of active requests and the second '128' is
the number of pending ones. It might not be strictly related, but this might be of
interest:

https://issues.apache.org/jira/browse/CASSANDRA-11363

there's a chance that tuning the 'native_transport_*' related options can
mitigate/solve the issue.
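
As a rough sketch of how this check could be automated, the Python below parses `nodetool tpstats` text and lists the pools with a non-zero Blocked or All time blocked count. The helper names (`parse_tpstats`, `blocked_pools`) are made up for illustration, the sample rows are copied from the output quoted further down, and the parser assumes the five-numeric-column layout seen in this thread, which may differ in other Cassandra versions.

```python
import re

# One thread-pool row: a pool name followed by five integer columns.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)$")
FIELDS = ("active", "pending", "completed", "blocked", "all_time_blocked")

# Sample rows taken from the tpstats output quoted in this thread.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     1         1      929509244         0                 0
ReadStage                         4         0        4021570         0                 0
CompactionExecutor                2        55          92022         0                 0
Native-Transport-Requests       128       128      347508531         2          15891862
"""


def parse_tpstats(text):
    """Return {pool_name: {field: int}} for every thread-pool row found."""
    pools = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and "dropped message" rows do not match
            values = (int(v) for v in match.groups()[1:])
            pools[match.group(1)] = dict(zip(FIELDS, values))
    return pools


def blocked_pools(pools):
    """Names of pools that are blocked now or have ever been blocked."""
    return sorted(
        name
        for name, stats in pools.items()
        if stats["blocked"] > 0 or stats["all_time_blocked"] > 0
    )


if __name__ == "__main__":
    print(blocked_pools(parse_tpstats(SAMPLE)))
```

On the sample rows only Native-Transport-Requests is reported, which is the line singled out above.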

Best,

On Tue, Jul 12, 2016 at 9:34 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> When you have high system load it means your CPU is waiting for
> *something*, and in my experience it's usually slow disk.  A disk connected
> over network has been a culprit for me many times.
>
> On Tue, Jul 12, 2016 at 12:33 PM Jonathan Haddad <j...@jonhaddad.com>
> wrote:
>
>> Can you run:
>>
>> iostat -dmx 2 10
>>
>>
>>
>> On Tue, Jul 12, 2016 at 11:20 AM Yuan Fang <y...@kryptoncloud.com> wrote:
>>
>>> Hi Jeff,
>>>
>>> Reads are low because we do not have many read operations right
>>> now.
>>>
>>> The heap is only 4GB.
>>>
>>> MAX_HEAP_SIZE=4GB
>>>
>>> On Thu, Jul 7, 2016 at 7:17 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>>> wrote:
>>>
>>>> EBS iops scale with volume size.
>>>>
>>>>
>>>>
>>>> A 600G EBS volume only guarantees 1800 iops – if you’re exhausting
>>>> those on writes, you’re going to suffer on reads.
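
A small worked version of that arithmetic, as a sketch only: the 3 iops per GB ratio is an assumption derived from the 600G and 1800 iops figures above, and the function names are hypothetical. Check the current EBS documentation for the volume type actually in use.

```python
import math

# Assumption: ratio implied by "600G -> 1800 iops" in the message above.
IOPS_PER_GB = 3


def baseline_iops(volume_gb, iops_per_gb=IOPS_PER_GB):
    """Guaranteed iops for a volume of the given size."""
    return volume_gb * iops_per_gb


def volume_gb_for(target_iops, iops_per_gb=IOPS_PER_GB):
    """Smallest volume size (in GB) whose baseline meets the target iops."""
    return math.ceil(target_iops / iops_per_gb)


if __name__ == "__main__":
    print(baseline_iops(600))   # the 600G volume discussed in the thread
    print(volume_gb_for(3000))  # size needed for a hypothetical 3000 iops target
```

Under that assumption, getting more guaranteed iops means provisioning a larger volume, even if the extra space is not needed.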
>>>>
>>>>
>>>>
>>>> You have a 16G server, and probably a good chunk of that allocated to
>>>> heap. Consequently, you have almost no page cache, so your reads are going
>>>> to hit the disk. Your reads being very low is not uncommon if you have no
>>>> page cache – the default settings for Cassandra (64k compression chunks)
>>>> are really inefficient for small reads served off of disk. If you drop the
>>>> compression chunk size (4k, for example), you’ll probably see your read
>>>> throughput increase significantly, which will give you more iops for
>>>> commitlog, so write throughput likely goes up, too.
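
To make the chunk-size argument concrete, here is a simplified sketch of the read amplification involved. It assumes nothing is in the page cache, that a whole compressed chunk must be read to serve any part of it, and that the read starts at a chunk boundary; the 1 KB read size and the function names are hypothetical.

```python
def bytes_read_from_disk(read_bytes, chunk_kb):
    """Bytes fetched from disk when whole chunks must be read."""
    chunk_bytes = chunk_kb * 1024
    chunks = -(-read_bytes // chunk_bytes)  # ceiling division
    return chunks * chunk_bytes


def read_amplification(read_bytes, chunk_kb):
    """Ratio of bytes fetched from disk to bytes actually requested."""
    return bytes_read_from_disk(read_bytes, chunk_kb) / read_bytes


if __name__ == "__main__":
    small_read = 1024  # hypothetical 1 KB row
    for chunk_kb in (64, 4):
        print(chunk_kb, bytes_read_from_disk(small_read, chunk_kb),
              read_amplification(small_read, chunk_kb))
```

For a 1 KB read this gives 64x amplification with 64k chunks versus 4x with 4k chunks, which is the effect described above.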
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From: *Jonathan Haddad <j...@jonhaddad.com>
>>>> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> *Date: *Thursday, July 7, 2016 at 6:54 PM
>>>> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> *Subject: *Re: Is my cluster normal?
>>>>
>>>>
>>>>
>>>> What's your CPU looking like? If it's low, check your IO with iostat or
>>>> dstat. I know some people have used EBS and say it's fine, but I've been
>>>> burned too many times.
>>>>
>>>> On Thu, Jul 7, 2016 at 6:12 PM Yuan Fang <y...@kryptoncloud.com> wrote:
>>>>
>>>> Hi Riccardo,
>>>>
>>>>
>>>>
>>>> Very low IO-wait. About 0.3%.
>>>>
>>>> No stolen CPU. It is a Cassandra-only instance. I did not see any
>>>> dropped messages.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ubuntu@cassandra1:/mnt/data$ nodetool tpstats
>>>>
>>>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>>>> MutationStage                     1         1      929509244         0                 0
>>>> ViewMutationStage                 0         0              0         0                 0
>>>> ReadStage                         4         0        4021570         0                 0
>>>> RequestResponseStage              0         0      731477999         0                 0
>>>> ReadRepairStage                   0         0         165603         0                 0
>>>> CounterMutationStage              0         0              0         0                 0
>>>> MiscStage                         0         0              0         0                 0
>>>> CompactionExecutor                2        55          92022         0                 0
>>>> MemtableReclaimMemory             0         0           1736         0                 0
>>>> PendingRangeCalculator            0         0              6         0                 0
>>>> GossipStage                       0         0         345474         0                 0
>>>> SecondaryIndexManagement          0         0              0         0                 0
>>>> HintsDispatcher                   0         0              4         0                 0
>>>> MigrationStage                    0         0             35         0                 0
>>>> MemtablePostFlush                 0         0           1973         0                 0
>>>> ValidationExecutor                0         0              0         0                 0
>>>> Sampler                           0         0              0         0                 0
>>>> MemtableFlushWriter               0         0           1736         0                 0
>>>> InternalResponseStage             0         0           5311         0                 0
>>>> AntiEntropyStage                  0         0              0         0                 0
>>>> CacheCleanupExecutor              0         0              0         0                 0
>>>> Native-Transport-Requests       128       128      347508531         2          15891862
>>>>
>>>>
>>>>
>>>> Message type           Dropped
>>>> READ                         0
>>>> RANGE_SLICE                  0
>>>> _TRACE                       0
>>>> HINT                         0
>>>> MUTATION                     0
>>>> COUNTER_MUTATION             0
>>>> BATCH_STORE                  0
>>>> BATCH_REMOVE                 0
>>>> REQUEST_RESPONSE             0
>>>> PAGED_RANGE                  0
>>>> READ_REPAIR                  0
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Jul 7, 2016 at 5:24 PM, Riccardo Ferrari <ferra...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Yuan,
>>>>
>>>>
>>>>
>>>> Your instance has 4 vCPUs, which means 4 threads (not cores!). Cassandra
>>>> specifics aside, a system load of 10 on a 4-thread machine is way too
>>>> high in my opinion. If that is the running average
>>>> system load I would look deeper into system details. Is that IO wait? Is
>>>> that CPU Stolen? Is that a Cassandra only instance or are there other
>>>> processes pushing the load?
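
A minimal sketch of the load-versus-threads comparison made above. The function names are hypothetical, and treating anything above 1.0 per CPU as a sign of queuing is a common rule of thumb rather than a hard limit. `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Normalize a system load average by the number of logical CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count


def current_load_per_cpu():
    """One-minute load average of this machine divided by its logical CPUs."""
    one_minute, _five, _fifteen = os.getloadavg()  # Unix only
    return load_per_cpu(one_minute, os.cpu_count() or 1)


if __name__ == "__main__":
    # The figures from this thread: load of 10 on a 4-vCPU instance.
    print(load_per_cpu(10, 4))
```

With the numbers from this thread the result is 2.5 runnable or waiting tasks per vCPU, which is why the load looks too high.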
>>>>
>>>> What does your "nodetool tpstats" say? How many dropped messages do you
>>>> have?
>>>>
>>>>
>>>>
>>>> Best,
>>>>
>>>>
>>>>
>>>> On Fri, Jul 8, 2016 at 12:34 AM, Yuan Fang <y...@kryptoncloud.com>
>>>> wrote:
>>>>
>>>> Thanks Ben! From the post, it seems they got slightly better but similar
>>>> results to mine. Good to know.
>>>>
>>>> I am not sure if a little fine tuning of heap memory will help or not.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Jul 7, 2016 at 2:58 PM, Ben Slater <ben.sla...@instaclustr.com>
>>>> wrote:
>>>>
>>>> Hi Yuan,
>>>>
>>>>
>>>>
>>>> You might find this blog post a useful comparison:
>>>>
>>>>
>>>> https://www.instaclustr.com/blog/2016/01/07/multi-data-center-apache-spark-and-apache-cassandra-benchmark/
>>>>
>>>>
>>>>
>>>> Although the focus is on Spark and Cassandra and multi-DC, there are
>>>> also some single-DC benchmarks of m4.xl clusters, plus some discussion of
>>>> how we went about benchmarking.
>>>>
>>>>
>>>>
>>>> Cheers
>>>>
>>>> Ben
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, 8 Jul 2016 at 07:52 Yuan Fang <y...@kryptoncloud.com> wrote:
>>>>
>>>> Yes, here is my stress test result:
>>>>
>>>> Results:
>>>>
>>>> op rate                   : 12200 [WRITE:12200]
>>>>
>>>> partition rate            : 12200 [WRITE:12200]
>>>>
>>>> row rate                  : 12200 [WRITE:12200]
>>>>
>>>> latency mean              : 16.4 [WRITE:16.4]
>>>>
>>>> latency median            : 7.1 [WRITE:7.1]
>>>>
>>>> latency 95th percentile   : 38.1 [WRITE:38.1]
>>>>
>>>> latency 99th percentile   : 204.3 [WRITE:204.3]
>>>>
>>>> latency 99.9th percentile : 465.9 [WRITE:465.9]
>>>>
>>>> latency max               : 1408.4 [WRITE:1408.4]
>>>>
>>>> Total partitions          : 1000000 [WRITE:1000000]
>>>>
>>>> Total errors              : 0 [WRITE:0]
>>>>
>>>> total gc count            : 0
>>>>
>>>> total gc mb               : 0
>>>>
>>>> total gc time (s)         : 0
>>>>
>>>> avg gc time(ms)           : NaN
>>>>
>>>> stdev gc time(ms)         : 0
>>>>
>>>> Total operation time      : 00:01:21
>>>>
>>>> END
>>>>
>>>>
>>>>
>>>> On Thu, Jul 7, 2016 at 2:49 PM, Ryan Svihla <r...@foundev.pro> wrote:
>>>>
>>>> Lots of variables you're leaving out.
>>>>
>>>>
>>>>
>>>> It depends on write size, whether you're using logged batches, what
>>>> consistency level, what RF, whether the writes come in bursts, etc.
>>>> However, that's all sort of moot for determining "normal"; really you need a
>>>> baseline, as all those variables end up mattering a huge amount.
>>>>
>>>>
>>>>
>>>> I would suggest using Cassandra stress as a baseline and go from there
>>>> depending on what those numbers say (just pick the defaults).
>>>>
>>>> Sent from my iPhone
>>>>
>>>>
>>>> On Jul 7, 2016, at 4:39 PM, Yuan Fang <y...@kryptoncloud.com> wrote:
>>>>
>>>> yes, it is about 8k writes per node.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Jul 7, 2016 at 2:18 PM, daemeon reiydelle <daeme...@gmail.com>
>>>> wrote:
>>>>
>>>> Are you saying 7k writes per node? or 30k writes per node?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *.......Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872*
>>>>
>>>>
>>>>
>>>> On Thu, Jul 7, 2016 at 2:05 PM, Yuan Fang <y...@kryptoncloud.com>
>>>> wrote:
>>>>
>>>> writes 30k/second is the main thing.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Jul 7, 2016 at 1:51 PM, daemeon reiydelle <daeme...@gmail.com>
>>>> wrote:
>>>>
>>>> Assuming you meant 100k, that is likely for something with 16mb of storage
>>>> (probably way too small) where the data is more than 64k and hence will not fit
>>>> into the row cache.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *.......Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872*
>>>>
>>>>
>>>>
>>>> On Thu, Jul 7, 2016 at 1:25 PM, Yuan Fang <y...@kryptoncloud.com>
>>>> wrote:
>>>>
>>>>
>>>>
>>>> I have a cluster of 4 m4.xlarge nodes (4 CPUs, 16 GB memory, and 600 GB
>>>> SSD EBS).
>>>>
>>>> I can reach cluster-wide write throughput of 30k requests/second and read
>>>> throughput of about 100 requests/second. The cluster OS load is constantly
>>>> above 10. Are those normal?
>>>>
>>>>
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Best,
>>>>
>>>>
>>>>
>>>> Yuan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> ————————
>>>>
>>>> Ben Slater
>>>>
>>>> Chief Product Officer
>>>>
>>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>>>
>>>> +61 437 929 798
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
