We do not choose the node a partition goes to; I thought it was the snitch's 
role to choose the replica nodes. Even the partition size does not vary on our 
largest column family:


Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             0.00             17.08             61.21              3311                 1
75%             0.00             20.50             88.15              3973                 1
95%             0.00             35.43            105.78              3973                 1
98%             0.00             42.51            126.93              3973                 1
99%             0.00             51.01            126.93              3973                 1
Min             0.00              3.97             17.09                61                 0
Max             0.00             73.46            126.93             11864                 1

We are kind of stuck trying to identify what could be causing this imbalance.
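In case it helps, one way to confirm that placement is decided by the partitioner and snitch rather than by us is to ask Cassandra which replicas own a few sample keys; the keyspace, table, and key below are placeholders for our actual schema:

    # Show which nodes are replicas for a specific partition key
    nodetool getendpoints <keyspace> <table> <partition_key_value>

    # Show the token ranges each node owns for this keyspace
    nodetool ring <keyspace>

If the same handful of nodes keep coming back for many different keys, that would point at the data rather than the topology.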
    On Tuesday, June 19, 2018, 7:15:28 AM EDT, Joshua Galbraith 
<jgalbra...@newrelic.com.INVALID> wrote:  
 
 >If it was a partition key issue, we would see a similar number of partition keys 
 >across nodes. If we look closely, the number of keys across nodes varies a lot.

I'm not sure about that. Is it possible you're writing more new partitions to 
some nodes even though each node owns the same number of tokens?
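One rough way to check that (the keyspace and table names below are placeholders) would be to compare effective ownership against the partition estimates you posted:

    # Effective ownership per node for the keyspace (Owns column)
    nodetool status <keyspace>

    # Estimated distinct partitions for the table on this node
    nodetool cfstats <keyspace>.<table> | grep "Number of partitions"

If Owns is roughly even across nodes but the partition estimates differ by 5x, the skew is coming from which keys are being written, not from token assignment.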

On Mon, Jun 18, 2018 at 6:07 PM, learner dba <cassandra...@yahoo.com.invalid> 
wrote:

 Hi Sean,

Are you using any rack aware topology? --> We are using the gossip file.
What are your partition keys? --> The partition key is unique.
Is it possible that your partition keys do not divide up as cleanly as you would 
like across the cluster because the data is not evenly distributed (by partition 
key)? --> No, we verified it.
If it was a partition key issue, we would see a similar number of partition keys 
across nodes. If we look closely, the number of keys across nodes varies a lot.

Number of partitions (estimate): 3142552
Number of partitions (estimate): 15625442
Number of partitions (estimate): 15244021
Number of partitions (estimate): 9592992
Number of partitions (estimate): 15839280
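For context, those estimates come from running cfstats against each node; a quick sketch for collecting them in one pass (host names and keyspace.table are placeholders):

    # Gather the per-node partition estimate for one table across the cluster
    for h in node1 node2 node3 node4 node5; do
      echo -n "$h: "
      nodetool -h "$h" cfstats <keyspace>.<table> | grep "Number of partitions"
    done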




    On Monday, June 18, 2018, 5:39:08 PM EDT, Durity, Sean R 
<sean_r_dur...@homedepot.com> wrote:  
 
 
Are you using any rack aware topology? What are your partition keys? Is it 
possible that your partition keys do not divide up as cleanly as you would like 
across the cluster because the data is not evenly distributed (by partition 
key)?
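If it helps, a rough way to check (the keyspace, table, and key column below are placeholders) is to look at the tokens a sample of your partition keys actually hash to and compare them against the ranges shown by nodetool ring:

    # Tokens for a sample of partition keys
    cqlsh -e "SELECT token(<pk_column>), <pk_column> FROM <keyspace>.<table> LIMIT 20;"

    # Token ranges owned by each node
    nodetool ring <keyspace>

If the sampled tokens cluster into a few ranges, the keys are not spreading evenly even though they are unique.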
 
  
 
  
 
Sean Durity
 
lord of the (C*) rings (Staff Systems Engineer – Cassandra)
 
MTC 2250
 
#cassandra - for the latest news and updates
 
  
 
From: learner dba <cassandra...@yahoo.com.INVALID>
Sent: Monday, June 18, 2018 2:06 PM
To: User cassandra.apache.org <user@cassandra.apache.org>
Subject: [EXTERNAL] Cluster is unbalanced
 
  
 
Hi,
 
  
 
Data volume varies a lot in our two-DC cluster:

Load       Tokens  Owns
20.01 GiB  256     ?
65.32 GiB  256     ?
60.09 GiB  256     ?
46.95 GiB  256     ?
50.73 GiB  256     ?

kaiprodv2
=========
State=Normal/Leaving/Joining/Moving

Load       Tokens  Owns
25.19 GiB  256     ?
30.26 GiB  256     ?
9.82 GiB   256     ?
20.54 GiB  256     ?
9.7 GiB    256     ?
  
 
I ran clearsnapshot, garbagecollect, and cleanup, but that increased the size on 
the heavier nodes instead of decreasing it. Based on nodetool cfstats, I can see 
that the number of partition keys on each node varies a lot:
 
  
 
Number of partitions (estimate): 3142552
 
Number of partitions (estimate): 15625442
 
Number of partitions (estimate): 15244021
 
Number of partitions (estimate): 9592992
 
Number of partitions (estimate): 15839280
 
  
 
How can I diagnose this imbalance further?
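In case it is useful, a few checks that might narrow it down further (keyspace/table names and the data path are placeholders for our setup):

    # Partition size distribution for the largest table on this node
    nodetool cfhistograms <keyspace> <table>

    # Space used and partition estimates per table on this node
    nodetool cfstats <keyspace>

    # On-disk footprint per keyspace, to rule out leftover snapshots or compaction debris
    du -sh /var/lib/cassandra/data/<keyspace>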
 
  
   



-- 
Joshua Galbraith | Senior Software Engineer | New Relic
C: 907-209-1208 | jgalbra...@newrelic.com
  
