Ok. I saw dropped mutations on your cluster, and full GC is a common cause of that. Can you search for the word GCInspector in system.log and share the frequency of minor and full GCs? Also, are you printing promotion failures in the GC logs? Why is full GC getting triggered: promotion failures or concurrent mode failures?
If you are on CMS, you need to fine-tune your heap options to address the full GCs.
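
A quick way to pull those counts out of system.log (a rough sketch; the log path and the collector names that appear in the GCInspector lines are assumptions, so adjust them to whatever your logs actually show):

$ grep -c "GCInspector" /var/log/cassandra/system.log                                # all GC pauses logged
$ grep "GCInspector" /var/log/cassandra/system.log | grep -c "ParNew"                # minor GC
$ grep "GCInspector" /var/log/cassandra/system.log | grep -c "ConcurrentMarkSweep"   # full (CMS) GC

For promotion failures, the JVM has to be logging them first - something along the lines of -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintPromotionFailure in cassandra-env.sh - and then you can grep the GC log for "promotion failed" and "concurrent mode failure". The usual CMS knobs to revisit after that are heap and new-gen size (-Xmx/-Xmn) plus -XX:CMSInitiatingOccupancyFraction with -XX:+UseCMSInitiatingOccupancyOnly.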


Thanks,
Anuj
On Thu, 14 Jan, 2016 at 12:57 am, James Griffin <james.grif...@idioplatform.com> wrote:

I think I was incorrect in assuming GC wasn't an issue due to the lack of logs. Comparing jstat output on nodes 2 & 3 shows some fairly marked differences, though comparing the startup flags on the two machines shows the GC config is identical:
$ jstat -gcutil
Node    S0     S1     E      O      P      YGC       YGCT   FGC    FGCT        GCT
2     5.08   0.00  55.72  18.24  59.90   25986    619.827    28   1.597    621.424
3     0.00   0.00  22.79  17.87  59.99  422600  11225.979   668  57.383  11283.361
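
(For reference, a runnable form of that command - the interval and count here are illustrative, and the pgrep pattern assumes the standard CassandraDaemon main class:)

$ jstat -gcutil $(pgrep -f CassandraDaemon) 5000 10    # sample GC counters every 5s, 10 samples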
Here's typical output for iostat on nodes 2 & 3 as well:
$ iostat -dmx md0
Node  Device    rrqm/s   wrqm/s      r/s     w/s    rMB/s    wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
2     md0         0.00     0.00    339.00    0.00     9.77     0.00     59.00      0.00   0.00     0.00     0.00   0.00   0.00
3     md0         0.00     0.00   2069.00    1.00    85.85     0.00     84.94      0.00   0.00     0.00     0.00   0.00   0.00
Griff

On 13 January 2016 at 18:36, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Node 2 has slightly more data, but that should be ok. Not sure how read ops are so high when no IO-intensive activity such as repair or compaction is running on node 3. Maybe you can try investigating the logs to see what's happening.
Others on the mailing list could also share their views on the situation.

Thanks,
Anuj


On Wed, 13 Jan, 2016 at 11:46 pm, James Griffin <james.grif...@idioplatform.com> wrote:

Hi Anuj,
Below is the output of nodetool status. The nodes were replaced following the instructions in the Datastax documentation for replacing running nodes, since the nodes were running fine; it was just that the servers had been incorrectly initialised and thus had less disk space. The status below shows that node 2 has a significantly higher load, however, as I say, 2 is operating normally and is running compactions, so I guess that's not an issue?
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns   Host ID                               Rack
UN  1        253.59 GB  256     31.7%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
UN  2        302.23 GB  256     35.3%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
UN  3        265.02 GB  256     33.1%  74b15507-db5c-45df-81db-6e5bcb7438a3  rack1
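
(For clarity, the "replacing a running node" route generally boils down to adding the new node and then decommissioning the old one; a rough sketch, not verbatim from the docs, and the exact steps depend on your version and packaging:)

# 1. Configure the new node (cluster_name, seeds, addresses) and start it;
#    it bootstraps and streams its token ranges from the existing replicas.
$ sudo service cassandra start
# 2. Wait until `nodetool status` shows the new node as UN on every node.
# 3. Then retire the old node, running this on the old node:
$ nodetool decommission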
Griff

On 13 January 2016 at 18:12, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Hi,
Revisiting the thread, I can see that nodetool status had both good and bad nodes at the same time. How do you replace nodes? When you say "bad node", I understand that the node is no longer usable even though Cassandra is UP? Is that correct?
If a node is in bad shape and not working, adding a new node may trigger streaming of huge amounts of data from the bad node too. Have you considered using the procedure for replacing a dead node?
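
(That procedure, roughly: start the replacement node with replace_address pointing at the dead node's IP, so it takes over that node's token ranges and streams them from the surviving replicas instead of joining with new ranges. A sketch, assuming the cassandra-env.sh route; check the docs for your exact version:)

# On the replacement node, before its first start, in cassandra-env.sh
# (<dead_node_ip> is a placeholder for the dead node's listen address):
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<dead_node_ip>"
# Then start Cassandra; remove the option again once the node has finished joining.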
Please share the latest nodetool status.
nodetool output shared earlier:
 `nodetool status` output:

    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address         Load       Tokens  Owns   Host ID                               Rack
    UN  A (Good)        252.37 GB  256     23.0%  9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
    UN  B (Good)        245.91 GB  256     24.4%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
    UN  C (Good)        254.79 GB  256     23.7%  f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
    UN  D (Bad)         163.85 GB  256     28.8%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1



Thanks,
Anuj
On Wed, 13 Jan, 2016 at 10:34 pm, James Griffin <james.grif...@idioplatform.com> wrote:

Hi all,
We’ve spent a few days running things but are in the same position. To add some 
more flavour:
   
   - We have a 3-node ring, replication factor = 3. We've been running in this configuration for a few years without any real issues.
   - Nodes 2 & 3 are much newer than node 1. These two nodes were brought in to replace two other nodes which had a failed RAID0 configuration and were thus lacking in disk space.
   - When node 2 was brought into the ring, it exhibited high CPU wait, IO and load metrics.
   - We subsequently brought 3 into the ring: as soon as 3 was fully bootstrapped, the load, CPU wait and IO stats on 2 dropped to normal levels. Those same stats on 3, however, sky-rocketed.
   - We've confirmed the configuration across all three nodes is identical and in line with the recommended production settings.
   - We've run a full repair.
   - Node 2 is currently running compactions; 1 & 3 aren't and have none pending.
   - There is no GC happening from what I can see. Node 1 has a GC log, but that's not been written to since May last year.

What we're seeing at the moment is similar and normal stats on nodes 1 & 2, but high CPU wait, IO and load stats on 3. As a snapshot:

   - Node 1: Load: 3.96, CPU wait: 30.8%, Disk Read Ops: 408/s
   - Node 2: Load: 5.88, CPU wait: 14.6%, Disk Read Ops: 275/s
   - Node 3: Load: 58.15, CPU wait: 87.0%, Disk Read Ops: 2,408/s
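
(For completeness, a minimal way to grab this kind of snapshot on each node, assuming sysstat is installed and md0 is the data device:)

$ uptime                  # load averages
$ mpstat 5 1              # CPU utilisation, including %iowait
$ iostat -dmx md0 5 2     # per-device read ops/s (r/s) and %util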

Can you recommend any next steps? 
Griff

On 6 January 2016 at 17:31, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Hi Vickrum,
I would have proceeded with the diagnosis as follows:

1. Analyse the sar report to check system health - CPU, memory, swap, disk etc. The system seems to be overloaded; this is evident from the mutation drops.
2. Make sure that all the recommended Cassandra production settings available on the Datastax site are applied; disable zone reclaim and THP (sketched after this list).
3. Run a full repair on the bad node and check the data size. The node owns the largest token range but has significantly less data; I doubt that bootstrapping happened properly.
4. compactionstats shows 22 pending compactions. Try throttling compaction by reducing concurrent compactors or the compaction throughput (also sketched after this list).
5. Analyse the logs to make sure bootstrapping happened without errors.
6. Look for other common performance problems, such as GC pauses, to make sure that the dropped mutations are not caused by GC pauses.
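
A rough sketch of the knobs from points 2 and 4 (the values are illustrative rather than recommendations, and the /proc and /sys paths assume a typical Linux box):

# Point 2: disable zone reclaim and transparent huge pages (run as root;
# persist the settings via sysctl.conf / your distro's boot scripts).
echo 0 > /proc/sys/vm/zone_reclaim_mode
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# Point 4: throttle compaction.
nodetool setcompactionthroughput 16     # e.g. cap compaction at 16 MB/s
# and/or lower concurrent_compactors in cassandra.yaml (needs a restart).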

Thanks,
Anuj

On Wed, 6 Jan, 2016 at 10:12 pm, Vickrum Loi <vickrum....@idioplatform.com> wrote:

# nodetool compactionstats
pending tasks: 22
          compaction type      keyspace                table                      completed    total         unit   progress
               Compaction      production_analytics    interactions               240410213    161172668724  bytes  0.15%
               Compaction      production_decisions    decisions.decisions_q_idx  120815385    226295183     bytes  53.39%
Active compaction remaining time :   2h39m58s

Worth mentioning that compactions haven't been running on this node 
particularly often. The node's been performing badly regardless of whether it's 
compacting or not.

On 6 January 2016 at 16:35, Jeff Ferland <j...@tubularlabs.com> wrote:

What’s your output of `nodetool compactionstats`?

On Jan 6, 2016, at 7:26 AM, Vickrum Loi <vickrum....@idioplatform.com> wrote:
Hi,

We recently added a new node to our cluster in order to replace a node that 
died (hardware failure we believe). For the next two weeks it had high disk and 
network activity. We replaced the server, but it's happened again. We've looked 
into memory allowances, disk performance, number of connections, and all the 
nodetool stats, but can't find the cause of the issue.

`nodetool tpstats`[0] shows a lot of active and pending threads, in comparison 
to the rest of the cluster, but that's likely a symptom, not a cause.

`nodetool status`[1] shows the cluster isn't quite balanced. The bad node (D) 
has less data.

Disk Activity[2] and Network activity[3] on this node are far higher than on the rest.

The only other difference between this node and the rest of the cluster is that it's on the ext4 filesystem, whereas the rest are on ext3, but we've done plenty of testing there and can't see how that would affect performance on this node so much.

Nothing of note in system.log.

What should our next step be in trying to diagnose this issue?

Best wishes,
Vic

[0] `nodetool tpstats` output:

Good node:
    Pool Name                    Active   Pending      Completed   Blocked  All time blocked
    ReadStage                         0         0       46311521         0                 0
    RequestResponseStage              0         0       23817366         0                 0
    MutationStage                     0         0       47389269         0                 0
    ReadRepairStage                   0         0          11108         0                 0
    ReplicateOnWriteStage             0         0              0         0                 0
    GossipStage                       0         0        5259908         0                 0
    CacheCleanupExecutor              0         0              0         0                 0
    MigrationStage                    0         0             30         0                 0
    MemoryMeter                       0         0          16563         0                 0
    FlushWriter                       0         0          39637         0                26
    ValidationExecutor                0         0          19013         0                 0
    InternalResponseStage             0         0              9         0                 0
    AntiEntropyStage                  0         0          38026         0                 0
    MemtablePostFlusher               0         0          81740         0                 0
    MiscStage                         0         0          19196         0                 0
    PendingRangeCalculator            0         0             23         0                 0
    CompactionExecutor                0         0          61629         0                 0
    commitlog_archiver                0         0              0         0                 0
    HintedHandoff                     0         0             63         0                 0

    Message type           Dropped
    RANGE_SLICE                  0
    READ_REPAIR                  0
    PAGED_RANGE                  0
    BINARY                       0
    READ                       640
    MUTATION                     0
    _TRACE                       0
    REQUEST_RESPONSE             0
    COUNTER_MUTATION             0

Bad node:
    Pool Name                    Active   Pending      Completed   Blocked  All time blocked
    ReadStage                        32       113          52216         0                 0
    RequestResponseStage              0         0           4167         0                 0
    MutationStage                     0         0         127559         0                 0
    ReadRepairStage                   0         0            125         0                 0
    ReplicateOnWriteStage             0         0              0         0                 0
    GossipStage                       0         0           9965         0                 0
    CacheCleanupExecutor              0         0              0         0                 0
    MigrationStage                    0         0              0         0                 0
    MemoryMeter                       0         0             24         0                 0
    FlushWriter                       0         0             27         0                 1
    ValidationExecutor                0         0              0         0                 0
    InternalResponseStage             0         0              0         0                 0
    AntiEntropyStage                  0         0              0         0                 0
    MemtablePostFlusher               0         0             96         0                 0
    MiscStage                         0         0              0         0                 0
    PendingRangeCalculator            0         0             10         0                 0
    CompactionExecutor                1         1             73         0                 0
    commitlog_archiver                0         0              0         0                 0
    HintedHandoff                     0         0             15         0                 0

    Message type           Dropped
    RANGE_SLICE                130
    READ_REPAIR                  1
    PAGED_RANGE                  0
    BINARY                       0
    READ                     31032
    MUTATION                   865
    _TRACE                       0
    REQUEST_RESPONSE             7
    COUNTER_MUTATION             0


[1] `nodetool status` output:

    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address         Load       Tokens  Owns   Host ID                               Rack
    UN  A (Good)        252.37 GB  256     23.0%  9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
    UN  B (Good)        245.91 GB  256     24.4%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
    UN  C (Good)        254.79 GB  256     23.7%  f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
    UN  D (Bad)         163.85 GB  256     28.8%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1

[2] Disk read/write ops:

    https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/dRs4jV1ukMeFHGE/cass-disk-read-ops.png
    https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/gbE58N2WosiOomF/cass-disk-write-ops.png

[3] Network in/out:

    https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/RwOVdUBxu6fPLgF/cass-network-in.png
    https://s3-eu-west-1.amazonaws.com/uploads-eu.hipchat.com/28299/178477/OpZM6ypNVN0O30q/cass-network-out.png



