Re: RE: RE: [EXTERNAL] Cluster is unbalanced

Joshua Galbraith Wed, 20 Jun 2018 09:18:56 -0700

Okay, that string appears to be a base64-encoded version 4 UUID. Why not
use Cassandra's UUID data type to store that directly rather than storing
the longer base64 string as text? What does the UUID represent? Is it
identifying a unique product, an image, or some other type of object? When
and how is the underlying UUID being generated by the application?
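
For reference, here is a minimal Python sketch of how that string can be checked. It base64-decodes the sample key quoted below and parses the result as a UUID. The helper name `decode_partition_key` is made up for illustration; only the standard library is used.

```python
import base64
import uuid


def decode_partition_key(encoded: str) -> uuid.UUID:
    """Base64-decode a key and parse the resulting text as a UUID."""
    text = base64.b64decode(encoded).decode("ascii")
    return uuid.UUID(text)


# Sample partition key quoted in this thread.
key = decode_partition_key("MWY4MmI0MTQtYTk2YS00YmRjLTkxNDMtOWU0MjM1OWU2NzUy")
print(key, key.version)
```

This prints `1f82b414-a96a-4bdc-9143-9e42359e6752 4`. The 48-character text value carries the same 128 bits that a native uuid column would hold in 16 bytes.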


I assume you're using the default partitioner, but just in case, can you
confirm which partitioner you're using in your cassandra.yaml file (e.g.
Murmur3, Random, ByteOrdered)?
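
If it helps, a rough Python sketch for pulling that setting out of the file's text is below. The function name `find_partitioner` and the sample excerpt are illustrative assumptions, not taken from your cluster; on a real node you would read the text from your own cassandra.yaml.

```python
import re
from typing import Optional


def find_partitioner(yaml_text: str) -> Optional[str]:
    """Return the value of the top-level `partitioner:` setting, or None if absent."""
    # Anchoring at the start of a line skips commented-out entries such as "# partitioner: ...".
    match = re.search(r"^partitioner:\s*(\S+)", yaml_text, flags=re.MULTILINE)
    return match.group(1) if match else None


# Illustrative excerpt only; replace with the contents of your cassandra.yaml.
sample = """\
num_tokens: 256
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
endpoint_snitch: GossipingPropertyFileSnitch
"""
print(find_partitioner(sample))
```

`nodetool describecluster` also reports the partitioner the running cluster is actually using, which is a useful cross-check against the file.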

Also, please have a look at these two issues and verify you're not
experiencing either:

* https://issues.apache.org/jira/browse/CASSANDRA-7032
* https://issues.apache.org/jira/browse/CASSANDRA-10430


On Wed, Jun 20, 2018 at 9:59 AM, learner dba <cassandra...@yahoo.com.invalid> wrote:

> The partition key has values like:
>
> MWY4MmI0MTQtYTk2YS00YmRjLTkxNDMtOWU0MjM1OWU2NzUy
>
> The other column is a blob.
>
> On Tuesday, June 19, 2018, 6:07:59 PM EDT, Joshua Galbraith <
> jgalbra...@newrelic.com.INVALID> wrote:
>
>
> > id text PRIMARY KEY
>
> What values are written to this id field? Can you give us some examples or
> explain the general use case?
>
> On Tue, Jun 19, 2018 at 1:18 PM, learner dba <cassandra...@yahoo.com.
> invalid> wrote:
>
> Hi Sean,
>
> Here is the CREATE TABLE statement:
>
> CREATE TABLE ks.cf (
>     id text PRIMARY KEY,
>     accessdata blob
> ) WITH bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
>
> Nodetool status:
>
> Datacenter: dc1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  xxxxx    20.66 GiB  256     61.4%             f4f54949-83c9-419b-9a43-cb630b36d8c2  RAC1
> UN  xxxxx    65.77 GiB  256     59.3%             3db430ae-45ef-4746-a273-bc1f66ac8981  RAC1
> UN  xxxxxx   60.58 GiB  256     58.4%             1f23e869-1823-4b75-8d3e-f9b32acba9a6  RAC1
> UN  xxxxx    47.08 GiB  256     57.5%             7aca9a36-823f-4185-be44-c1464a799084  RAC1
> UN  xxxxx    51.47 GiB  256     63.4%             18cff010-9b83-4cf8-9dc2-f05ac63df402  RAC1
>
> Datacenter: dc2
> ========================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  xxxx     24.37 GiB  256     59.5%             1b694180-210a-4b75-8f2a-748f4a5b6a3d  RAC1
> UN  xxxxx    30.76 GiB  256     56.7%             597bac04-c57a-4487-8924-72e171e45514  RAC1
> UN  xxxx     10.73 GiB  256     63.9%             6e7e474e-e292-4433-afd4-372d30e0f3e1  RAC1
> UN  xxxxxx   19.77 GiB  256     61.5%             58751418-7b76-40f7-8b8f-a5bf8fe7d9a2  RAC1
> UN  xxxxx    10.33 GiB  256     58.4%             6d58d006-2095-449c-8c67-50e8cbdfe7a7  RAC1
>
>
> cassandra-rackdc.properties:
>
> dc=dc1
> rack=RAC1 --> same in all nodes
>
> cassandra.yaml:
> num_tokens: 256
>
> endpoint_snitch: GossipingPropertyFileSnitch
>
> I can see a cassandra-topology.properties file. I believe it shouldn't be
> there with GossipingPropertyFileSnitch. Could this file be causing any
> trouble with data distribution?
>
> cat /opt/cassandra/conf/cassandra-topology.properties
>
> # Licensed to the Apache Software Foundation (ASF) under one
>
> # or more contributor license agreements.  See the NOTICE file
>
> # distributed with this work for additional information
>
> # regarding copyright ownership.  The ASF licenses this file
>
> # to you under the Apache License, Version 2.0 (the
>
> # "License"); you may not use this file except in compliance
>
> # with the License.  You may obtain a copy of the License at
>
> #
>
> #     http://www.apache.org/licenses/LICENSE-2.0
>
> #
>
> # Unless required by applicable law or agreed to in writing, software
>
> # distributed under the License is distributed on an "AS IS" BASIS,
>
> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>
> # See the License for the specific language governing permissions and
>
> # limitations under the License.
>
>
> # Cassandra Node IP=Data Center:Rack
>
> 192.168.1.100=DC1:RAC1
>
> 192.168.2.200=DC2:RAC2
>
>
> 10.0.0.10=DC1:RAC1
>
> 10.0.0.11=DC1:RAC1
>
> 10.0.0.12=DC1:RAC2
>
>
> 10.20.114.10=DC2:RAC1
>
> 10.20.114.11=DC2:RAC1
>
>
> 10.21.119.13=DC3:RAC1
>
> 10.21.119.10=DC3:RAC1
>
>
> 10.0.0.13=DC1:RAC2
>
> 10.21.119.14=DC3:RAC2
>
> 10.20.114.15=DC2:RAC2
>
>
> # default for unknown nodes
>
> default=DC1:r1
>
>
> # Native IPv6 is supported, however you must escape the colon in the IPv6 Address
>
> # Also be sure to comment out JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
>
> # in cassandra-env.sh
>
> fe80\:0\:0\:0\:202\:b3ff\:fe1e\:8329=DC1:RAC3
>
>
>
>
> On Tuesday, June 19, 2018, 12:51:34 PM EDT, Durity, Sean R <
> sean_r_dur...@homedepot.com> wrote:
>
>
> You are correct that the cluster decides where data goes (based on the
> hash of the partition key). However, if you choose a “bad” partition key,
> you may not get good distribution of the data, because the hash is
> deterministic (it always goes to the same nodes/replicas). For example, if
> you have a partition key of a datetime, it is possible that there is more
> data written for a certain time period – thus a larger partition and an
> imbalance across the cluster. Choosing a “good” partition key is one of the
> most important decisions for a Cassandra table.
>
>
>
> Also, I have seen the use of racks in the topology cause an imbalance in
> the “first” node of the rack.
>
>
>
> To help you more, we would need the create table statement(s) for your
> keyspace and the topology of the cluster (like with nodetool status).
>
>
>
>
>
> Sean Durity
>
> *From:* learner dba <cassandra...@yahoo.com. INVALID>
> *Sent:* Tuesday, June 19, 2018 9:50 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: RE: [EXTERNAL] Cluster is unbalanced
>
>
>
> We do not choose the node where a partition will go. I thought it was the
> snitch's role to choose replica nodes. Even the partition size does not vary
> much on our largest column family:
>
> Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
>                            (micros)      (micros)         (bytes)
> 50%             0.00          17.08         61.21            3311           1
> 75%             0.00          20.50         88.15            3973           1
> 95%             0.00          35.43        105.78            3973           1
> 98%             0.00          42.51        126.93            3973           1
> 99%             0.00          51.01        126.93            3973           1
> Min             0.00           3.97         17.09              61
> Max             0.00          73.46        126.93           11864           1
>
>
>
> We are stuck trying to identify what could be causing this imbalance.
>
>
>
> On Tuesday, June 19, 2018, 7:15:28 AM EDT, Joshua Galbraith <
> jgalbra...@newrelic.com. INVALID> wrote:
>
>
>
>
>
> > If it were a partition key issue, we would see a similar number of
> partition keys across nodes. If we look closely, the number of keys across
> nodes varies a lot.
>
> I'm not sure about that. Is it possible you're writing more new partitions
> to some nodes even though each node owns the same number of tokens?
>
>
>
> On Mon, Jun 18, 2018 at 6:07 PM, learner dba <cassandra...@yahoo.com.
> invalid <cassandra...@yahoo.com.invalid>> wrote:
>
> Hi Sean,
>
>
>
> Are you using any rack aware topology? --> we are using gossip file
>
>  What are your partition keys? --> Partition key is unique
>
> Is it possible that your partition keys do not divide up as cleanly as you
> would like across the cluster because the data is not evenly distributed
> (by partition key)?  --> No, we verified it.
>
>
>
> If it were a partition key issue, we would see a similar number of
> partition keys across nodes. If we look closely, the number of keys across
> nodes varies a lot.
>
>
>
>
>
> Number of partitions (estimate): 3142552
>
> Number of partitions (estimate): 15625442
>
> Number of partitions (estimate): 15244021
>
> Number of partitions (estimate): 9592992
>
> Number of partitions (estimate): 15839280
>
>
>
>
>
>
>
>
>
>
>
> On Monday, June 18, 2018, 5:39:08 PM EDT, Durity, Sean R <
> sean_r_dur...@homedepot.com> wrote:
>
>
>
>
>
> Are you using any rack aware topology? What are your partition keys? Is it
> possible that your partition keys do not divide up as cleanly as you would
> like across the cluster because the data is not evenly distributed (by
> partition key)?
>
>
>
>
>
> Sean Durity
>
> lord of the (C*) rings (Staff Systems Engineer – Cassandra)
>
> MTC 2250
>
> #cassandra - for the latest news and updates
>
>
>
> *From:* learner dba <cassandra...@yahoo.com. INVALID>
> *Sent:* Monday, June 18, 2018 2:06 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Cluster is unbalanced
>
>
>
> Hi,
>
>
>
> Data volume varies a lot in our two DC cluster:
>
>  Load       Tokens       Owns
>
>  20.01 GiB  256          ?
>
>  65.32 GiB  256          ?
>
>  60.09 GiB  256          ?
>
>  46.95 GiB  256          ?
>
>  50.73 GiB  256          ?
>
> kaiprodv2
>
> =========
>
> /Leaving/Joining/Moving
>
>  Load       Tokens       Owns
>
>  25.19 GiB  256          ?
>
>  30.26 GiB  256          ?
>
>  9.82 GiB   256          ?
>
>  20.54 GiB  256          ?
>
>  9.7 GiB    256          ?
>
>
>
> I ran clearsnapshot, garbagecollect and cleanup, but the size on the
> heavier nodes increased instead of decreasing. Based on nodetool cfstats,
> I can see that the number of partition keys on each node varies a lot:
>
>
>
> Number of partitions (estimate): 3142552
>
> Number of partitions (estimate): 15625442
>
> Number of partitions (estimate): 15244021
>
> Number of partitions (estimate): 9592992
>
> Number of partitions (estimate): 15839280
>
>
>
> How can I diagnose this imbalance further?
>
>
>
>
>
>
>
> --
>
> *Joshua Galbraith *| Senior Software Engineer | New Relic
> C: 907-209-1208 | jgalbraith@ newrelic.com <jgalbra...@newrelic.com>
>
>
>
>
> --
> *Joshua Galbraith *| Senior Software Engineer | New Relic
>



-- 
*Joshua Galbraith *| Senior Software Engineer | New Relic
