James, can you post the result of "nodetool netstats" on the bad node?
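
For reference, what I have in mind is simply (run on the bad node itself):

    nodetool netstats

I'm mostly interested in whether it reports any active streams, and in the
pending counts for the command/response pools - if that node is still streaming
to or from its neighbours, that alone could explain the elevated disk and
network activity.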

On Thu, Jan 14, 2016 at 9:09 AM, James Griffin <james.grif...@idioplatform.com> wrote:

> A summary of what we've done this morning:
>
>    - Noted that there are no GCInspector lines in system.log on the bad node
>      (there are GCInspector logs on the other, healthy nodes)
>    - Turned on GC logging; noted that we had logs stating the total time
>      for which application threads were stopped was high - ~10s.
>    - Not seeing failures of any kind (promotion or concurrent mark)
>    - Attached VisualVM: noted that heap usage was very low (~5% usage and
>      stable) and it didn't display the hallmarks of GC activity. PermGen also
>      very stable
>    - Downloaded the GC logs and examined them in GCViewer. Noted that:
>       - We had lots of pauses (again around 10s), but no full GCs.
>       - From a 2,300s sample, just over 2,000s were spent with threads
>         paused
>    - Spotted many small GCs in the new space - realised that the Xmn value
>      was very low (200M against a heap size of 3750M). Increased Xmn to
>      937M - no change in server behaviour (high load, high reads/s on disk,
>      high CPU wait)
>
> Current output of jstat:
>
>        S0     S1     E      O      P     YGC   YGCT    FGC  FGCT   GCT
>    2   0.00  45.20  12.82  26.84  76.21  2333  63.684    2  0.039  63.724
>    3  63.58   0.00  33.68   8.04  75.19    14   1.812    2  0.103   1.915
>
> Correct me if I'm wrong, but it seems 3 is a lot healthier GC-wise than 2
> (which has normal load statistics).
>
> Anywhere else you can recommend we look?
>
> Griff
>
> On 14 January 2016 at 01:25, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
>
>> Ok. I saw dropped mutations on your cluster, and full GC is a common cause
>> for that.
>> Can you just search for the word GCInspector in system.log and share the
>> frequency of minor and full GCs? Moreover, are you printing promotion
>> failures in the GC logs? Why is full GC getting triggered - promotion
>> failures or concurrent mode failures?
>>
>> If you are on CMS, you need to fine-tune your heap options to address
>> full GC.
>>
>> Thanks
>> Anuj
>>
>> Sent from Yahoo Mail on Android
>> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>>
>> On Thu, 14 Jan, 2016 at 12:57 am, James Griffin
>> <james.grif...@idioplatform.com> wrote:
>> I think I was incorrect in assuming GC wasn't an issue due to the lack of
>> logs. Comparing jstat output on nodes 2 & 3 shows some fairly marked
>> differences, though comparing the startup flags on the two machines shows
>> the GC config is identical:
>>
>> $ jstat -gcutil
>>        S0    S1    E      O      P      YGC     YGCT       FGC  FGCT    GCT
>>    2  5.08  0.00  55.72  18.24  59.90   25986   619.827     28   1.597    621.424
>>    3  0.00  0.00  22.79  17.87  59.99  422600  11225.979    668  57.383  11283.361
>>
>> Here's typical output for iostat on nodes 2 & 3 as well:
>>
>> $ iostat -dmx md0
>>
>>      Device:  rrqm/s  wrqm/s  r/s      w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
>>   2  md0        0.00    0.00   339.00  0.00   9.77   0.00     59.00      0.00   0.00     0.00     0.00   0.00   0.00
>>   3  md0        0.00    0.00  2069.00  1.00  85.85   0.00     84.94      0.00   0.00     0.00     0.00   0.00   0.00
>>
>> Griff
>>
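(A quick aside on reading those jstat numbers: GCT is cumulative since JVM
start, so it only means anything relative to uptime - and node 3's counters
have evidently reset between your two samples, presumably when it was restarted
for the Xmn change. A rough way to turn GCT into a fraction of JVM lifetime -
assuming a procps new enough for "etimes" and that the process matches
"CassandraDaemon"; run it as the same user as the Cassandra process - would be
something like:

    # fraction of JVM lifetime spent in GC
    PID=$(pgrep -f CassandraDaemon)
    UP=$(ps -o etimes= -p "$PID")                            # JVM uptime, seconds
    GCT=$(jstat -gcutil "$PID" | awk 'NR==2 {print $NF}')    # cumulative GC time, seconds
    echo "scale=4; $GCT / $UP" | bc

On the earlier sample node 3 looks dramatically worse than node 2 GC-wise, but
GC alone wouldn't normally explain the disk read rate you're seeing in iostat.)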
>> On 13 January 2016 at 18:36, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
>>
>>> Node 2 has slightly more data but that should be OK. Not sure how read
>>> ops are so high when no IO-intensive activity such as repair or compaction
>>> is running on node 3. Maybe you can try investigating the logs to see
>>> what's happening.
>>>
>>> Others on the mailing list could also share their views on the situation.
>>>
>>> Thanks
>>> Anuj
>>>
>>> Sent from Yahoo Mail on Android
>>> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>>>
>>> On Wed, 13 Jan, 2016 at 11:46 pm, James Griffin
>>> <james.grif...@idioplatform.com> wrote:
>>> Hi Anuj,
>>>
>>> Below is the output of nodetool status. The nodes were replaced following
>>> the instructions in the Datastax documentation for replacing running
>>> nodes, since the nodes themselves were running fine - it was just that the
>>> servers had been incorrectly initialised and thus had less disk space. The
>>> status below shows that 2 has significantly higher load; however, as I
>>> say, 2 is operating normally and is running compactions, so I guess that's
>>> not an issue?
>>>
>>> Datacenter: datacenter1
>>> =======================
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address  Load       Tokens  Owns   Host ID                               Rack
>>> UN  1        253.59 GB  256     31.7%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
>>> UN  2        302.23 GB  256     35.3%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
>>> UN  3        265.02 GB  256     33.1%  74b15507-db5c-45df-81db-6e5bcb7438a3  rack1
>>>
>>> Griff
>>>
>>> On 13 January 2016 at 18:12, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
>>>
>>>> Hi,
>>>>
>>>> Revisiting the thread, I can see that nodetool status had both good and
>>>> bad nodes at the same time. How do you replace nodes? When you say "bad
>>>> node", I understand that the node is no longer usable even though
>>>> Cassandra is UP - is that correct?
>>>>
>>>> If a node is in bad shape and not working, adding a new node may trigger
>>>> streaming of huge amounts of data from the bad node too. Have you
>>>> considered using the procedure for replacing a dead node?
>>>>
>>>> Please share the latest nodetool status.
>>>>
>>>> nodetool output shared earlier:
>>>>
>>>> `nodetool status` output:
>>>>
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address   Load       Tokens  Owns   Host ID                               Rack
>>>> UN  A (Good)  252.37 GB  256     23.0%  9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
>>>> UN  B (Good)  245.91 GB  256     24.4%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
>>>> UN  C (Good)  254.79 GB  256     23.7%  f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
>>>> UN  D (Bad)   163.85 GB  256     28.8%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
>>>>
>>>> Thanks
>>>> Anuj
>>>>
>>>> Sent from Yahoo Mail on Android
>>>> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>>>>
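(On the dead-node replacement Anuj mentions: for what it's worth, the usual way
to do that on 2.x is to start the replacement node with the replace flag rather
than letting it bootstrap as a brand-new member - roughly, in cassandra-env.sh
(the exact file location depends on your packaging):

    # on the *replacement* node only, before its first start; remove the flag again afterwards
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<ip_of_dead_node>"

The new node then takes over the dead node's tokens and streams its data from
the surviving replicas, instead of the ring having to rebalance around both the
old and the new entries.)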
>>>> On Wed, 13 Jan, 2016 at 10:34 pm, James Griffin
>>>> <james.grif...@idioplatform.com> wrote:
>>>> Hi all,
>>>>
>>>> We've spent a few days running things but are in the same position. To
>>>> add some more flavour:
>>>>
>>>>    - We have a 3-node ring, replication factor = 3. We've been running
>>>>      in this configuration for a few years without any real issues
>>>>    - Nodes 2 & 3 are much newer than node 1. These two nodes were
>>>>      brought in to replace two other nodes which had a failed RAID0
>>>>      configuration and were thus lacking in disk space.
>>>>    - When node 2 was brought into the ring, it exhibited high CPU wait,
>>>>      IO and load metrics
>>>>    - We subsequently brought 3 into the ring: as soon as 3 was fully
>>>>      bootstrapped, the load, CPU wait and IO stats on 2 dropped to normal
>>>>      levels. Those same stats on 3, however, sky-rocketed
>>>>    - We've confirmed the configuration across all three nodes is
>>>>      identical and in line with the recommended production settings
>>>>    - We've run a full repair
>>>>    - Node 2 is currently running compactions; 1 & 3 aren't and have none
>>>>      pending
>>>>    - There is no GC happening from what I can see. Node 1 has a GC log,
>>>>      but that's not been written to since May last year
>>>>
>>>> What we're seeing at the moment is similar and normal stats on nodes 1
>>>> & 2, but high CPU wait, IO and load stats on 3. As a snapshot:
>>>>
>>>>    1. Load: 3.96,  CPU wait: 30.8%, Disk Read Ops: 408/s
>>>>    2. Load: 5.88,  CPU wait: 14.6%, Disk Read Ops: 275/s
>>>>    3. Load: 58.15, CPU wait: 87.0%, Disk Read Ops: 2,408/s
>>>>
>>>> Can you recommend any next steps?
>>>>
>>>> Griff
>>>>
>>>> On 6 January 2016 at 17:31, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
>>>>
>>>>> Hi Vickrum,
>>>>>
>>>>> I would have proceeded with the diagnosis as follows:
>>>>>
>>>>> 1. Analyse the sar report to check system health - CPU, memory, swap,
>>>>> disk etc. The system seems to be overloaded; this is evident from the
>>>>> mutation drops.
>>>>>
>>>>> 2. Make sure that all the recommended Cassandra production settings
>>>>> available on the Datastax site are applied; disable zone reclaim and THP.
>>>>>
>>>>> 3. Run a full repair on the bad node and check the data size. The node
>>>>> owns the largest token range but has significantly less data. I doubt
>>>>> that bootstrapping happened properly.
>>>>>
>>>>> 4. Compactionstats shows 22 pending compactions. Try throttling
>>>>> compactions by reducing concurrent compactors or compaction throughput.
>>>>>
>>>>> 5. Analyse the logs to make sure bootstrapping happened without errors.
>>>>>
>>>>> 6. Look for other common performance problems such as GC pauses to make
>>>>> sure that dropped mutations are not caused by GC pauses.
>>>>>
>>>>> Thanks
>>>>> Anuj
>>>>>
>>>>> Sent from Yahoo Mail on Android
>>>>> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>>>>>
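(Re: points 2 and 4 in Anuj's checklist above, the relevant knobs - with
illustrative values rather than recommendations - are roughly:

    # compaction throughput can be changed on the fly (MB/s); concurrent_compactors
    # is a cassandra.yaml setting and needs a restart to change
    nodetool setcompactionthroughput 16

    # disable THP defrag and zone reclaim at the OS level (sysfs path can differ by distro/kernel)
    echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag
    sudo sysctl -w vm.zone_reclaim_mode=0

It would also be worth double-checking that those OS-level settings actually
match across all three machines, given only one of them is misbehaving.)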
>>>>> On Wed, 6 Jan, 2016 at 10:12 pm, Vickrum Loi
>>>>> <vickrum....@idioplatform.com> wrote:
>>>>> # nodetool compactionstats
>>>>> pending tasks: 22
>>>>>   compaction type  keyspace              table                      completed  total         unit   progress
>>>>>        Compaction  production_analytics  interactions               240410213  161172668724  bytes  0.15%
>>>>>        Compaction  production_decisions  decisions.decisions_q_idx  120815385  226295183     bytes  53.39%
>>>>> Active compaction remaining time : 2h39m58s
>>>>>
>>>>> Worth mentioning that compactions haven't been running on this node
>>>>> particularly often. The node's been performing badly regardless of
>>>>> whether it's compacting or not.
>>>>>
>>>>> On 6 January 2016 at 16:35, Jeff Ferland <j...@tubularlabs.com> wrote:
>>>>>
>>>>>> What's your output of `nodetool compactionstats`?
>>>>>>
>>>>>> On Jan 6, 2016, at 7:26 AM, Vickrum Loi <vickrum....@idioplatform.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We recently added a new node to our cluster in order to replace a node
>>>>>> that died (hardware failure, we believe). For the next two weeks it had
>>>>>> high disk and network activity. We replaced the server, but it's
>>>>>> happened again. We've looked into memory allowances, disk performance,
>>>>>> number of connections, and all the nodetool stats, but can't find the
>>>>>> cause of the issue.
>>>>>>
>>>>>> `nodetool tpstats`[0] shows a lot of active and pending threads, in
>>>>>> comparison to the rest of the cluster, but that's likely a symptom, not
>>>>>> a cause.
>>>>>>
>>>>>> `nodetool status`[1] shows the cluster isn't quite balanced. The bad
>>>>>> node (D) has less data.
>>>>>>
>>>>>> Disk activity[2] and network activity[3] on this node are far higher
>>>>>> than on the rest.
>>>>>>
>>>>>> The only other difference this node has to the rest of the cluster is
>>>>>> that it's on the ext4 filesystem, whereas the rest are ext3, but we've
>>>>>> done plenty of testing there and can't see how that would affect
>>>>>> performance on this node so much.
>>>>>>
>>>>>> Nothing of note in system.log.
>>>>>>
>>>>>> What should our next step be in trying to diagnose this issue?
>>>>>>
>>>>>> Best wishes,
>>>>>> Vic
>>>>>>
>>>>>> [0] `nodetool tpstats` output:
>>>>>>
>>>>>> Good node:
>>>>>> Pool Name                  Active  Pending  Completed  Blocked  All time blocked
>>>>>> ReadStage                       0        0   46311521        0                 0
>>>>>> RequestResponseStage            0        0   23817366        0                 0
>>>>>> MutationStage                   0        0   47389269        0                 0
>>>>>> ReadRepairStage                 0        0      11108        0                 0
>>>>>> ReplicateOnWriteStage           0        0          0        0                 0
>>>>>> GossipStage                     0        0    5259908        0                 0
>>>>>> CacheCleanupExecutor            0        0          0        0                 0
>>>>>> MigrationStage                  0        0         30        0                 0
>>>>>> MemoryMeter                     0        0      16563        0                 0
>>>>>> FlushWriter                     0        0      39637        0                26
>>>>>> ValidationExecutor              0        0      19013        0                 0
>>>>>> InternalResponseStage           0        0          9        0                 0
>>>>>> AntiEntropyStage                0        0      38026        0                 0
>>>>>> MemtablePostFlusher             0        0      81740        0                 0
>>>>>> MiscStage                       0        0      19196        0                 0
>>>>>> PendingRangeCalculator          0        0         23        0                 0
>>>>>> CompactionExecutor              0        0      61629        0                 0
>>>>>> commitlog_archiver              0        0          0        0                 0
>>>>>> HintedHandoff                   0        0         63        0                 0
>>>>>>
>>>>>> Message type      Dropped
>>>>>> RANGE_SLICE             0
>>>>>> READ_REPAIR             0
>>>>>> PAGED_RANGE             0
>>>>>> BINARY                  0
>>>>>> READ                  640
>>>>>> MUTATION                0
>>>>>> _TRACE                  0
>>>>>> REQUEST_RESPONSE        0
>>>>>> COUNTER_MUTATION        0
>>>>>>
>>>>>> Bad node:
>>>>>> Pool Name                  Active  Pending  Completed  Blocked  All time blocked
>>>>>> ReadStage                      32      113      52216        0                 0
>>>>>> RequestResponseStage            0        0       4167        0                 0
>>>>>> MutationStage                   0        0     127559        0                 0
>>>>>> ReadRepairStage                 0        0        125        0                 0
>>>>>> ReplicateOnWriteStage           0        0          0        0                 0
>>>>>> GossipStage                     0        0       9965        0                 0
>>>>>> CacheCleanupExecutor            0        0          0        0                 0
>>>>>> MigrationStage                  0        0          0        0                 0
>>>>>> MemoryMeter                     0        0         24        0                 0
>>>>>> FlushWriter                     0        0         27        0                 1
>>>>>> ValidationExecutor              0        0          0        0                 0
>>>>>> InternalResponseStage           0        0          0        0                 0
>>>>>> AntiEntropyStage                0        0          0        0                 0
>>>>>> MemtablePostFlusher             0        0         96        0                 0
>>>>>> MiscStage                       0        0          0        0                 0
>>>>>> PendingRangeCalculator          0        0         10        0                 0
>>>>>> CompactionExecutor              1        1         73        0                 0
>>>>>> commitlog_archiver              0        0          0        0                 0
>>>>>> HintedHandoff                   0        0         15        0                 0
>>>>>>
>>>>>> Message type      Dropped
>>>>>> RANGE_SLICE           130
>>>>>> READ_REPAIR             1
>>>>>> PAGED_RANGE             0
>>>>>> BINARY                  0
>>>>>> READ                31032
>>>>>> MUTATION              865
>>>>>> _TRACE                  0
>>>>>> REQUEST_RESPONSE        7
>>>>>> COUNTER_MUTATION        0
>>>>>>
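(Interjecting here: the READ drops and the 32 active / 113 pending ReadStage
threads on the bad node above are the clearest signal - that node is saturated
serving reads. A cheap way to see whether the drops are still accumulating,
rather than being left over from an earlier incident, is just to poll the
counters, e.g.:

    # run on the bad node; the counters are cumulative since the last restart
    while sleep 30; do date; nodetool tpstats | grep -E '^(READ|MUTATION) '; done

If READ keeps climbing there while the healthy nodes stay flat, the node is
still shedding load right now.)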
>>>>>> [1] `nodetool status` output:
>>>>>>
>>>>>> Status=Up/Down
>>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>>> --  Address   Load       Tokens  Owns   Host ID                               Rack
>>>>>> UN  A (Good)  252.37 GB  256     23.0%  9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
>>>>>> UN  B (Good)  245.91 GB  256     24.4%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
>>>>>> UN  C (Good)  254.79 GB  256     23.7%  f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
>>>>>> UN  D (Bad)   163.85 GB  256     28.8%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
>>>>>>
>>>>>> [2] Disk read/write ops:
>>>>>>
>>>>>> https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/dRs4jV1ukMeFHGE/cass-disk-read-ops.png
>>>>>> https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/gbE58N2WosiOomF/cass-disk-write-ops.png
>>>>>>
>>>>>> [3] Network in/out:
>>>>>>
>>>>>> https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/RwOVdUBxu6fPLgF/cass-network-in.png
>>>>>> https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/OpZM6ypNVN0O30q/cass-network-out.png