I think I was incorrect in assuming GC wasn't an issue just because of the lack of logs. Comparing jstat output on nodes 2 & 3 shows some fairly marked differences, even though the startup flags on the two machines show the GC config is identical.
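The figures below are from a single jstat invocation per node. If anyone wants to reproduce this, sampling over an interval gives a better feel for the collection rates; the pgrep line is just one way of finding the Cassandra PID, so treat this as a sketch rather than exactly what we ran:

$ CASS_PID=$(pgrep -f CassandraDaemon | head -n1)   # assumes a single Cassandra process per host
$ jstat -gcutil "$CASS_PID" 1000 10                 # ten one-second samples of the same S0/S1/E/O/P and GC counters

Here are the one-shot numbers, node number in the first column: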
$ jstat -gcutil
     S0     S1     E      O      P     YGC       YGCT   FGC    FGCT        GCT
2  5.08   0.00  55.72  18.24  59.90   25986    619.827    28   1.597    621.424
3  0.00   0.00  22.79  17.87  59.99  422600  11225.979   668  57.383  11283.361

Here's typical output for iostat on nodes 2 & 3 as well:

$ iostat -dmx md0
  Device:  rrqm/s  wrqm/s      r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
2 md0        0.00    0.00   339.00  0.00   9.77   0.00     59.00      0.00   0.00     0.00     0.00   0.00   0.00
3 md0        0.00    0.00  2069.00  1.00  85.85   0.00     84.94      0.00   0.00     0.00     0.00   0.00   0.00

Griff

On 13 January 2016 at 18:36, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

> Node 2 has slightly higher data but that should be ok. Not sure how read ops are so high when no IO-intensive activity such as repair or compaction is running on node 3. Maybe you can try investigating logs to see what's happening.
>
> Others on the mailing list could also share their views on the situation.
>
> Thanks
> Anuj
>
> Sent from Yahoo Mail on Android
> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>
> On Wed, 13 Jan, 2016 at 11:46 pm, James Griffin <james.grif...@idioplatform.com> wrote:
> Hi Anuj,
>
> Below is the output of nodetool status. The nodes were replaced following the instructions in the Datastax documentation for replacing running nodes, since the nodes were running fine; the problem was that the servers had been incorrectly initialised and thus had less disk space. The status below shows 2 has significantly higher load; however, as I say, 2 is operating normally and is running compactions, so I guess that's not an issue?
>
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load       Tokens  Owns   Host ID                               Rack
> UN  1        253.59 GB  256     31.7%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
> UN  2        302.23 GB  256     35.3%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
> UN  3        265.02 GB  256     33.1%  74b15507-db5c-45df-81db-6e5bcb7438a3  rack1
>
> Griff
>
> On 13 January 2016 at 18:12, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
>
>> Hi,
>>
>> Revisiting the thread I can see that nodetool status had both good and bad nodes at the same time. How do you replace nodes? When you say bad node, I understand that the node is no longer usable even though Cassandra is UP? Is that correct?
>>
>> If a node is in bad shape and not working, adding a new node may trigger streaming huge data from the bad node too. Have you considered using the procedure for replacing a dead node?
>>
>> Please share the latest nodetool status.
>>
>> nodetool output shared earlier:
>>
>> `nodetool status` output:
>>
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address   Load       Tokens  Owns   Host ID                               Rack
>> UN  A (Good)  252.37 GB  256     23.0%  9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
>> UN  B (Good)  245.91 GB  256     24.4%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
>> UN  C (Good)  254.79 GB  256     23.7%  f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
>> UN  D (Bad)   163.85 GB  256     28.8%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
>>
>> Thanks
>> Anuj
>>
>> Sent from Yahoo Mail on Android
>> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>>
>> On Wed, 13 Jan, 2016 at 10:34 pm, James Griffin <james.grif...@idioplatform.com> wrote:
>> Hi all,
>>
>> We've spent a few days running things but are in the same position. To add some more flavour:
>>
>>   - We have a 3-node ring, replication factor = 3. We've been running in this configuration for a few years without any real issues
>>   - Nodes 2 & 3 are much newer than node 1.
>>     These two nodes were brought in to replace two other nodes which had a failed RAID0 configuration and thus were lacking in disk space
>>   - When node 2 was brought into the ring, it exhibited high CPU wait, IO and load metrics
>>   - We subsequently brought 3 into the ring: as soon as 3 was fully bootstrapped, the load, CPU wait and IO stats on 2 dropped to normal levels. Those same stats on 3, however, sky-rocketed
>>   - We've confirmed the configuration across all three nodes is identical and in line with the recommended production settings
>>   - We've run a full repair
>>   - Node 2 is currently running compactions; 1 & 3 aren't and have none pending
>>   - There is no GC happening from what I can see. Node 1 has a GC log, but that's not been written to since May last year
>>
>> What we're seeing at the moment is similar and normal stats on nodes 1 & 2, but high CPU wait, IO and load stats on 3. As a snapshot:
>>
>>   1. Load: 3.96, CPU wait: 30.8%, Disk Read Ops: 408/s
>>   2. Load: 5.88, CPU wait: 14.6%, Disk Read Ops: 275/s
>>   3. Load: 58.15, CPU wait: 87.0%, Disk Read Ops: 2,408/s
>>
>> Can you recommend any next steps?
>>
>> Griff
>>
>> On 6 January 2016 at 17:31, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
>>
>>> Hi Vickrum,
>>>
>>> I would have proceeded with diagnosis as follows:
>>>
>>> 1. Analysis of the sar report to check system health (CPU, memory, swap, disk, etc.). The system seems to be overloaded; this is evident from the mutation drops.
>>>
>>> 2. Make sure that all recommended Cassandra production settings available on the Datastax site are applied; disable zone reclaim and THP.
>>>
>>> 3. Run a full repair on the bad node and check data size. The node owns the largest token range but has significantly lower data. I doubt that bootstrapping happened properly.
>>>
>>> 4. Compactionstats shows 22 pending compactions. Try throttling compactions by reducing concurrent compactors or compaction throughput.
>>>
>>> 5. Analyse logs to make sure bootstrapping happened without errors.
>>>
>>> 6. Look for other common performance problems such as GC pauses to make sure that the dropped mutations are not caused by GC pauses.
>>>
>>> Thanks
>>> Anuj
>>>
>>> Sent from Yahoo Mail on Android
>>> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>>>
>>> On Wed, 6 Jan, 2016 at 10:12 pm, Vickrum Loi <vickrum....@idioplatform.com> wrote:
>>> # nodetool compactionstats
>>> pending tasks: 22
>>>  compaction type  keyspace              table                      completed   total         unit   progress
>>>  Compaction       production_analytics  interactions               240410213   161172668724  bytes  0.15%
>>>  Compaction       production_decisions  decisions.decisions_q_idx  120815385   226295183     bytes  53.39%
>>> Active compaction remaining time : 2h39m58s
>>>
>>> Worth mentioning that compactions haven't been running on this node particularly often. The node's been performing badly regardless of whether it's compacting or not.
>>>
>>> On 6 January 2016 at 16:35, Jeff Ferland <j...@tubularlabs.com> wrote:
>>>
>>>> What's your output of `nodetool compactionstats`?
>>>>
>>>> On Jan 6, 2016, at 7:26 AM, Vickrum Loi <vickrum....@idioplatform.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> We recently added a new node to our cluster in order to replace a node that died (hardware failure, we believe). For the next two weeks it had high disk and network activity. We replaced the server, but it's happened again.
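A side note on the dead-node replacement procedure Anuj mentions further up the thread: rather than bootstrapping a brand-new node alongside the old one, the replacement is normally started with the replace_address option so that it takes over the dead node's token ranges directly. A minimal sketch, assuming a 1.2.11/2.0-era build and treating the address below as a placeholder:

# In cassandra-env.sh on the replacement node, set before its very first start
# (and removed once the node has finished joining):
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<dead_node_ip>"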
>>>> We've looked into memory allowances, disk performance, number of connections, and all the nodetool stats, but can't find the cause of the issue.
>>>>
>>>> `nodetool tpstats`[0] shows a lot of active and pending threads, in comparison to the rest of the cluster, but that's likely a symptom, not a cause.
>>>>
>>>> `nodetool status`[1] shows the cluster isn't quite balanced. The bad node (D) has less data.
>>>>
>>>> Disk activity[2] and network activity[3] on this node are far higher than on the rest.
>>>>
>>>> The only other difference between this node and the rest of the cluster is that it's on the ext4 filesystem, whereas the rest are ext3, but we've done plenty of testing there and can't see how that would affect performance on this node so much.
>>>>
>>>> Nothing of note in system.log.
>>>>
>>>> What should our next step be in trying to diagnose this issue?
>>>>
>>>> Best wishes,
>>>> Vic
>>>>
>>>> [0] `nodetool tpstats` output:
>>>>
>>>> Good node:
>>>> Pool Name                 Active   Pending      Completed   Blocked  All time blocked
>>>> ReadStage                      0         0       46311521         0                 0
>>>> RequestResponseStage           0         0       23817366         0                 0
>>>> MutationStage                  0         0       47389269         0                 0
>>>> ReadRepairStage                0         0          11108         0                 0
>>>> ReplicateOnWriteStage          0         0              0         0                 0
>>>> GossipStage                    0         0        5259908         0                 0
>>>> CacheCleanupExecutor           0         0              0         0                 0
>>>> MigrationStage                 0         0             30         0                 0
>>>> MemoryMeter                    0         0          16563         0                 0
>>>> FlushWriter                    0         0          39637         0                26
>>>> ValidationExecutor             0         0          19013         0                 0
>>>> InternalResponseStage          0         0              9         0                 0
>>>> AntiEntropyStage               0         0          38026         0                 0
>>>> MemtablePostFlusher            0         0          81740         0                 0
>>>> MiscStage                      0         0          19196         0                 0
>>>> PendingRangeCalculator         0         0             23         0                 0
>>>> CompactionExecutor             0         0          61629         0                 0
>>>> commitlog_archiver             0         0              0         0                 0
>>>> HintedHandoff                  0         0             63         0                 0
>>>>
>>>> Message type           Dropped
>>>> RANGE_SLICE                  0
>>>> READ_REPAIR                  0
>>>> PAGED_RANGE                  0
>>>> BINARY                       0
>>>> READ                       640
>>>> MUTATION                     0
>>>> _TRACE                       0
>>>> REQUEST_RESPONSE             0
>>>> COUNTER_MUTATION             0
>>>>
>>>> Bad node:
>>>> Pool Name                 Active   Pending      Completed   Blocked  All time blocked
>>>> ReadStage                     32       113          52216         0                 0
>>>> RequestResponseStage           0         0           4167         0                 0
>>>> MutationStage                  0         0         127559         0                 0
>>>> ReadRepairStage                0         0            125         0                 0
>>>> ReplicateOnWriteStage          0         0              0         0                 0
>>>> GossipStage                    0         0           9965         0                 0
>>>> CacheCleanupExecutor           0         0              0         0                 0
>>>> MigrationStage                 0         0              0         0                 0
>>>> MemoryMeter                    0         0             24         0                 0
>>>> FlushWriter                    0         0             27         0                 1
>>>> ValidationExecutor             0         0              0         0                 0
>>>> InternalResponseStage          0         0              0         0                 0
>>>> AntiEntropyStage               0         0              0         0                 0
>>>> MemtablePostFlusher            0         0             96         0                 0
>>>> MiscStage                      0         0              0         0                 0
>>>> PendingRangeCalculator         0         0             10         0                 0
>>>> CompactionExecutor             1         1             73         0                 0
>>>> commitlog_archiver             0         0              0         0                 0
>>>> HintedHandoff                  0         0             15         0                 0
>>>>
>>>> Message type           Dropped
>>>> RANGE_SLICE                130
>>>> READ_REPAIR                  1
>>>> PAGED_RANGE                  0
>>>> BINARY                       0
>>>> READ                     31032
>>>> MUTATION                   865
>>>> _TRACE                       0
>>>> REQUEST_RESPONSE             7
>>>> COUNTER_MUTATION             0
>>>>
>>>> [1] `nodetool status` output:
>>>>
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address   Load       Tokens  Owns   Host ID                               Rack
>>>> UN  A (Good)  252.37 GB  256     23.0%  9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
>>>> UN  B (Good)  245.91 GB  256     24.4%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
>>>> UN  C (Good)  254.79 GB  256     23.7%  f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
>>>> UN  D (Bad)   163.85 GB  256     28.8%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
>>>> [2] Disk read/write ops:
>>>>
>>>> https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/dRs4jV1ukMeFHGE/cass-disk-read-ops.png
>>>> https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/gbE58N2WosiOomF/cass-disk-write-ops.png
>>>>
>>>> [3] Network in/out:
>>>>
>>>> https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/RwOVdUBxu6fPLgF/cass-network-in.png
>>>> https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/OpZM6ypNVN0O30q/cass-network-out.png
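P.S. For completeness, the production-settings checks Anuj lists up-thread (zone reclaim and THP), plus the block-device read-ahead, can be eyeballed on each node with something like the commands below. Treat this as a sketch: the THP path varies by distro (RHEL 6, for example, uses redhat_transparent_hugepage), and md0 is just the device from the iostat output above.

$ cat /proc/sys/vm/zone_reclaim_mode                 # 0 means zone reclaim is disabled, as recommended
$ cat /sys/kernel/mm/transparent_hugepage/enabled    # expect [never] on a tuned box
$ sudo blockdev --getra /dev/md0                     # read-ahead in 512-byte sectors; worth comparing node 2 vs node 3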