Re: Cluster is unbalanced

2018-06-20 Thread learner dba
 
CREATE KEYSPACE data WITH replication = {'class': 'NetworkTopologyStrategy', 
'dc1': '3', 'dc2': '3'}  AND durable_writes = true;


cqlsh> select * from system_schema.keyspaces;

 keyspace_name      | durable_writes | replication
--------------------+----------------+-------------
               apim |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}
        system_auth |           True | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'}
         eventspace |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}
      system_schema |           True | {'class': 'org.apache.cassandra.locator.LocalStrategy'}
            metrics |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}
                sim |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}
            billing |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '1', 'maikaiprodv2': '1'}
 system_distributed |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}
             system |           True | {'class': 'org.apache.cassandra.locator.LocalStrategy'}
              audit |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}
          dakota_ks |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}
        credentials |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}
      system_traces |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}
               data |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'iwakaluaint': '3', 'maikaiprodv2': '3'}

The keyspaces with replication factor 1 are unused. Most of the space is
occupied by the data and eventspace keyspaces.
On Wednesday, June 20, 2018, 12:24:18 AM EDT, anil patimidi 
 wrote:  
 
 What is your keyspace configuration? Do you have all the keyspaces configured
for both DCs?
Can you run one of the queries below from cqlsh and check whether each keyspace
is configured to use both DCs?
select * from system.schema_keyspaces;    # if your cluster is on 2.1 or lower
select * from system_schema.keyspaces;    # for 3.0 clusters
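
The version split above can be written down as a small helper. This is only a sketch: the function name keyspace_query is hypothetical, and it assumes you already have the release string (for example from nodetool version or from the release_version column of system.local).

```python
def keyspace_query(release_version: str) -> str:
    """Return the CQL query that lists keyspaces for a given Cassandra release.

    Hypothetical helper: the schema tables moved from the `system` keyspace
    to the `system_schema` keyspace in Cassandra 3.0.
    """
    major = int(release_version.split(".")[0])
    if major >= 3:
        return "SELECT * FROM system_schema.keyspaces;"
    return "SELECT * FROM system.schema_keyspaces;"


print(keyspace_query("3.11.2"))
print(keyspace_query("2.1.20"))
```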

- Anil

On Mon, Jun 18, 2018 at 11:06 AM, learner dba  
wrote:

Hi,
Data volume varies a lot across our two-DC cluster:

 Load       Tokens       Owns
 20.01 GiB  256          ?
 65.32 GiB  256          ?
 60.09 GiB  256          ?
 46.95 GiB  256          ?
 50.73 GiB  256          ?

Datacenter: maikaiprodv2
========================
|/ State=Normal/Leaving/Joining/Moving
 Load       Tokens       Owns
 25.19 GiB  256          ?
 30.26 GiB  256          ?
 9.82 GiB   256          ?
 20.54 GiB  256          ?
 9.7 GiB    256          ?

I ran clearsnapshot, garbagecollect and cleanup, but the size on the heavier
nodes increased instead of decreasing. According to nodetool cfstats, the
number of partition keys varies a lot from node to node:

Number of partitions (estimate): 3142552
Number of partitions (estimate): 15625442
Number of partitions (estimate): 15244021
Number of partitions (estimate): 9592992
Number of partitions (estimate): 15839280

How can I diagnose this imbalance further?


  

Re: RE: RE: [EXTERNAL] Cluster is unbalanced

2018-06-20 Thread learner dba
 The partition key has values like:
MWY4MmI0MTQtYTk2YS00YmRjLTkxNDMtOWU0MjM1OWU2NzUy
The other column is a blob.

On Tuesday, June 19, 2018, 6:07:59 PM EDT, Joshua Galbraith 
 wrote:  
 
 > id text PRIMARY KEY

What values are written to this id field? Can you give us some examples or 
explain the general use case?
On Tue, Jun 19, 2018 at 1:18 PM, learner dba  
wrote:

 Hi Sean,
Here is create table:

CREATE TABLE ks.cf (
    id text PRIMARY KEY,
    accessdata blob
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

Nodetool status:

Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns (effective)  Host ID                               Rack
UN  x        20.66 GiB  256     61.4%             f4f54949-83c9-419b-9a43-cb630b36d8c2  RAC1
UN  x        65.77 GiB  256     59.3%             3db430ae-45ef-4746-a273-bc1f66ac8981  RAC1
UN  xx       60.58 GiB  256     58.4%             1f23e869-1823-4b75-8d3e-f9b32acba9a6  RAC1
UN  x        47.08 GiB  256     57.5%             7aca9a36-823f-4185-be44-c1464a799084  RAC1
UN  x        51.47 GiB  256     63.4%             18cff010-9b83-4cf8-9dc2-f05ac63df402  RAC1

Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns (effective)  Host ID                               Rack
UN           24.37 GiB  256     59.5%             1b694180-210a-4b75-8f2a-748f4a5b6a3d  RAC1
UN  x        30.76 GiB  256     56.7%             597bac04-c57a-4487-8924-72e171e45514  RAC1
UN           10.73 GiB  256     63.9%             6e7e474e-e292-4433-afd4-372d30e0f3e1  RAC1
UN  xx       19.77 GiB  256     61.5%             58751418-7b76-40f7-8b8f-a5bf8fe7d9a2  RAC1
UN  x        10.33 GiB  256     58.4%             6d58d006-2095-449c-8c67-50e8cbdfe7a7  RAC1
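
To put a number on the mismatch between ownership and load, here is a small sketch using the dc1 figures from the nodetool status output. It assumes that, for a balanced table, load should scale roughly with effective ownership; Load also includes every keyspace on the node plus not-yet-compacted data, so treat the result as a rough indicator only.

```python
# Load (GiB) and effective ownership (%) for the dc1 nodes, copied from the
# nodetool status output in this thread.
dc1 = [
    (20.66, 61.4),
    (65.77, 59.3),
    (60.58, 58.4),
    (47.08, 57.5),
    (51.47, 63.4),
]


def spread(values):
    """Ratio of the largest to the smallest value."""
    values = list(values)
    return max(values) / min(values)


ownership_spread = spread(owns for _, owns in dc1)
# GiB stored per percentage point of ownership; roughly constant if balanced.
density_spread = spread(load / owns for load, owns in dc1)

print(f"ownership max/min:          {ownership_spread:.2f}")
print(f"load-per-ownership max/min: {density_spread:.2f}")
```

Ownership differs by only about 10% between the nodes, while the load per unit of ownership differs by more than a factor of three, which suggests that token ownership alone does not explain the imbalance.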


cassandra-rackdc.properties:

dc=dc1
rack=RAC1 --> same in all nodes
cassandra.yaml:
num_tokens: 256
endpoint_snitch: GossipingPropertyFileSnitch
I can see a cassandra-topology.properties file; I believe it shouldn't be there
with GossipingPropertyFileSnitch. Could this file be causing any trouble with
data distribution?

cat /opt/cassandra/conf/cassandra-topology.properties

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Cassandra Node IP=Data Center:Rack
192.168.1.100=DC1:RAC1
192.168.2.200=DC2:RAC2

10.0.0.10=DC1:RAC1
10.0.0.11=DC1:RAC1
10.0.0.12=DC1:RAC2

10.20.114.10=DC2:RAC1
10.20.114.11=DC2:RAC1

10.21.119.13=DC3:RAC1
10.21.119.10=DC3:RAC1

10.0.0.13=DC1:RAC2
10.21.119.14=DC3:RAC2
10.20.114.15=DC2:RAC2

# default for unknown nodes
default=DC1:r1

# Native IPv6 is supported, however you must escape the colon in the IPv6 Address
# Also be sure to comment out JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
# in cassandra-env.sh
fe80\:0\:0\:0\:202\:b3ff\:fe1e\:8329=DC1:RAC3






On Tuesday, June 19, 2018, 12:51:34 PM EDT, Durity, Sean R 
 wrote:  
 
  
You are correct that the cluster decides where data goes (based on the hash of 
the partition key). However, if you choose a “bad” partition key, you may not 
get good distribution of the data, because the hash is deterministic (it always 
goes to the same nodes/replicas). For example, if you have a partition key of a 
datetime, it is possible that there is more data written for a certain time 
period – thus a larger partition and an imbalance across the cluster. Choosing 
a “good” partition key is one of the most important decisions

Re: RE: RE: [EXTERNAL] Cluster is unbalanced

2018-06-20 Thread Joshua Galbraith
Okay, that string appears to be a base64-encoded version 4 UUID. Why not
use Cassandra's UUID data type to store that directly rather than storing
the longer base64 string as text? What does the UUID represent? Is it
identifying a unique product, an image, or some other type of object? When
and how is the underlying UUID being generated by the application?
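
For reference, the decoding can be checked with a few lines of Python, using only the standard library. The sample value is the one posted in this thread; nothing else is assumed.

```python
import base64
import uuid

encoded = "MWY4MmI0MTQtYTk2YS00YmRjLTkxNDMtOWU0MjM1OWU2NzUy"

# The base64 payload is the 36-character textual form of the UUID,
# not its 16 raw bytes.
decoded = base64.b64decode(encoded).decode("ascii")
parsed = uuid.UUID(decoded)

print(decoded)          # the UUID in its usual hyphenated form
print(parsed.version)   # the version field encoded in the UUID
print(len(encoded), "characters as text vs", len(parsed.bytes), "bytes as a uuid")
```

The string decodes to 1f82b414-a96a-4bdc-9143-9e42359e6752, whose version field is 4. Stored as text the key takes 48 characters, against 16 bytes for a native uuid column.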

I assume you're using the default partitioner, but just in case, can you
confirm which partitioner you're using in your cassandra.yaml file (e.g.
Murmur3, Random, ByteOrdered)?

Also, please have a look at these two issues and verify you're not
experiencing either:

* https://issues.apache.org/jira/browse/CASSANDRA-7032
* https://issues.apache.org/jira/browse/CASSANDRA-10430


On Wed, Jun 20, 2018 at 9:59 AM, learner dba  wrote:

> Partition key has value as:
>
> MWY4MmI0MTQtYTk2YS00YmRjLTkxNDMtOWU0MjM1OWU2NzUy other column is blob.
>
> On Tuesday, June 19, 2018, 6:07:59 PM EDT, Joshua Galbraith <
> jgalbra...@newrelic.com.INVALID> wrote:
>
>
> > id text PRIMARY KEY
>
> What values are written to this id field? Can you give us some examples or
> explain the general use case?

Re: RE: RE: [EXTERNAL] Cluster is unbalanced

2018-06-20 Thread learner dba
 
Hi Joshua,
Okay, that string appears to be a base64-encoded version 4 UUID. Why not use
Cassandra's UUID data type to store that directly rather than storing the
longer base64 string as text? --> It's an old application and the person who
coded it has left the company.
What does the UUID represent? --> Unique account id.
Is it identifying a unique product, an image, or some other type of object?
--> Yes.
When and how is the underlying UUID being generated by the application?
--> Not sure about it.

I assume you're using the default partitioner, but just in case, can you
confirm which partitioner you're using in your cassandra.yaml file (e.g.
Murmur3, Random, ByteOrdered)? --> partitioner:
org.apache.cassandra.dht.Murmur3Partitioner


The Jiras you mentioned are for much older versions than ours (3.11.2). Also,
our partition keys are not distributed evenly, as shown in the output I pasted
earlier, which means neither of the Jiras applies in our case :(


On Wednesday, June 20, 2018, 12:18:28 PM EDT, Joshua Galbraith 
 wrote:  
 
 Okay, that string appears to be a base64-encoded version 4 UUID. Why not use 
Cassandra's UUID data type to store that directly rather than storing the 
longer base64 string as text? What does the UUID represent? Is it identifying a 
unique product, an image, or some other type of object? When and how is the 
underlying UUID being generated by the application?

I assume you're using the default partitioner, but just in case, can you
confirm which partitioner you're using in your cassandra.yaml file (e.g.
Murmur3, Random, ByteOrdered)?

Also, please have a look at these two issues and verify you're not experiencing 
either:

* https://issues.apache.org/jira/browse/CASSANDRA-7032
* https://issues.apache.org/jira/browse/CASSANDRA-10430


Re: RE: RE: [EXTERNAL] Cluster is unbalanced

2018-06-20 Thread Joshua Galbraith
> Also, our partition keys are not distributed evenly as I had pasted output
> earlier.

Thanks, I see that now. Can you share the full output of nodetool tablestats
and nodetool tablehistograms?

Out of curiosity, are you running repairs on this cluster? If so, what type
of repairs are you running and how often?

One way you might differentiate between a server-side/configuration issue
or a client/data model issue is to write a script that populates a test
keyspace with uniformly distributed partitions and see if that keyspace
also exhibits a similar imbalance of partitions per node. You might be able
to use a heavily-throttled cassandra-stress invocation to handle this.
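
Before touching the cluster, it may help to know what "balanced" looks like for this topology. The sketch below is an offline simulation, not a test against Cassandra: it assumes 5 nodes with 256 random tokens each (as in one DC of this thread), counts only the primary replica, and uses the first 8 bytes of MD5 as a stand-in hash rather than Cassandra's Murmur3. All names in it are hypothetical.

```python
import bisect
import hashlib
import random
import uuid
from collections import Counter

NODES = 5
VNODES = 256          # num_tokens from the thread
KEYS = 50_000
RING_SIZE = 2 ** 64

rng = random.Random(42)  # fixed seed so the run is repeatable

# Give every node VNODES random tokens on the ring.
ring = sorted(
    (rng.randrange(RING_SIZE), node)
    for node in range(NODES)
    for _ in range(VNODES)
)
tokens = [token for token, _ in ring]


def token_for(key: str) -> int:
    """Stand-in hash: first 8 bytes of MD5, NOT Cassandra's Murmur3."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def owner(key: str) -> int:
    """Primary owner: the node holding the first token >= the key's token."""
    index = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return ring[index][1]


# Random version 4 UUIDs as partition keys, in their textual form.
counts = Counter(
    owner(str(uuid.UUID(int=rng.getrandbits(128), version=4)))
    for _ in range(KEYS)
)
for node in sorted(counts):
    print(f"node{node + 1}: {counts[node]} partitions")

ratio = max(counts.values()) / min(counts.values())
print(f"max/min: {ratio:.2f}")
```

With uniformly random keys the per-node counts should differ by a modest percentage at most, nowhere near the roughly fivefold difference in partition estimates reported earlier in this thread.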

On Wed, Jun 20, 2018 at 12:32 PM, learner dba <
cassandra...@yahoo.com.invalid> wrote:

>
> Hi Joshua,
>
> Okay, that string appears to be a base64-encoded version 4 UUID.
> Why not use Cassandra's UUID data type to store that directly rather than
> storing the longer base64 string as text?  --> It's an old application and
> the person who coded it, has left the company.
> What does the UUID represent? --> Unique account id.
> Is it identifying a unique product, an image, or some other type of
> object? --> yes
> When and how is the underlying UUID being generated by the application?
> --> Not sure about it.
>
> I assume you're using the default partitioner, but just in case, can you
> confirm which partitioner you're using in your cassandra.yaml file (e.g.
> Murmur3, Random, ByteOrdered)?
> --> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>
>
> Mentioned Jiras are from much older version than ours "3.11.2"; Also, our
> partition keys are not distributed evenly as I had pasted output earlier.
> Which means none of the Jiras apply in our case :(

Network problems during repair make it hang on "Wait for validation to complete"

2018-06-20 Thread Dmitry Simonov
Hello!

Using Cassandra 2.2.11, I observe behaviour that is very similar to
https://issues.apache.org/jira/browse/CASSANDRA-12860

Steps to reproduce:
1. Set up a cluster:
   ccm create five -v 2.2.11 && ccm populate -n 5 --vnodes && ccm start
2. Import some keyspace into it (approx. 50 MB of data).
3. Start repair on one node: ccm node2 nodetool repair KEYSPACE
4. While the repair is still running, disconnect node3:
   sudo iptables -I INPUT -p tcp -d 127.0.0.3 -j DROP
5. The repair hangs.
6. Restore network connectivity.
7. The repair is still hanging.
8. Subsequent repairs also hang.

In tpstats I see tasks that make no progress:

$ for i in {1..5}; do echo node$i; ccm node$i nodetool tpstats | grep "Repair#"; done
node1
Repair#1  1  2255     1  0  0
node2
Repair#1  1  2335    26  0  0
node3
node4
Repair#3  1   147  2175  0  0
node5
Repair#1  1  2335    17  0  0
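
The same check can be scripted. The sketch below is a hypothetical helper, not part of nodetool: it assumes the tpstats column order Pool Name, Active, Pending, Completed, Blocked, All time blocked, and reports Repair# pools that still hold active or pending tasks. In the sample input only the Repair#1 row comes from the output above; the header and the ReadStage row are made up for illustration.

```python
def stuck_repair_pools(tpstats_output: str):
    """Return (pool, active, pending) for Repair# pools that still hold tasks.

    Assumes the nodetool tpstats column order:
    Pool Name, Active, Pending, Completed, Blocked, All time blocked.
    """
    stuck = []
    for line in tpstats_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].startswith("Repair#"):
            continue
        active, pending = int(fields[1]), int(fields[2])
        if active > 0 or pending > 0:
            stuck.append((fields[0], active, pending))
    return stuck


sample = """\
Pool Name    Active   Pending   Completed   Blocked  All time blocked
ReadStage         0         0        1042         0                 0
Repair#1          1      2255           1         0                 0
"""
print(stuck_repair_pools(sample))
```

A single snapshot cannot tell a slow repair from a hung one; comparing two samples taken some minutes apart and checking that Completed has not moved is the more reliable signal.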

In jconsole I see that Repair threads are blocked here:

Name: Repair#1:1
State: WAITING on com.google.common.util.concurrent.AbstractFuture$Sync@73c5ab7e
Total blocked: 0  Total waited: 242

Stack trace:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285)
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1371)
org.apache.cassandra.repair.RepairJob.run(RepairJob.java:167)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)


According to the source code, they are waiting for validations to complete:

# ./apache-cassandra-2.2.8-src/src/java/org/apache/cassandra/repair/RepairJob.java
 74 public void run()
 75 {
...
166 // Wait for validation to complete
167 Futures.getUnchecked(validations);


https://issues.apache.org/jira/browse/CASSANDRA-11824 says that the problem was
fixed in 2.2.7, but I am using 2.2.11.

Restarting all Cassandra nodes that have hanging tasks (one by one) makes
these tasks disappear from tpstats. After that, repairs work well (until the
next network problem).

I also suppose that long GC pauses on one node during repair (like network
issues) may lead to the same problem.

Is it a known issue?

-- 
Best Regards,
Dmitry Simonov


Re: Cassandra Client Program not Working with NettySSLOptions

2018-06-20 Thread Jahar Tyagi
Yes, the server uses both client-to-node and node-to-node encryption and runs
fine with JdkSSLOptions; the problem is only with NettySSLOptions.

On Tue, Jun 19, 2018 at 7:04 PM, Jonathan Haddad  wrote:

> Is the server configured to use encryption?
>
> On Tue, Jun 19, 2018 at 3:59 AM Jahar Tyagi  wrote:
>
>> Hi,
>>
>> I referred to
>> https://docs.datastax.com/en/developer/java-driver/3.0/manual/ssl/
>> to implement a simple Cassandra client using DataStax driver 3.0.0 over SSL
>> with OpenSSL options, but I am unable to run it.
>>
>> I am getting a generic exception,
>> com.datastax.driver.core.exceptions.NoHostAvailableException, at the line
>> mySession = myCluster.connect();
>>
>> The code snippet that sets up the cluster connection is below.
>>
>> public void connectToCluster()
>> {
>>     String[] theCassandraHosts = {"myip"};
>>     myCluster = Cluster.builder()
>>             .withSSL(getSSLOption())
>>             .withReconnectionPolicy(new ConstantReconnectionPolicy(2000))
>>             .addContactPoints(theCassandraHosts)
>>             .withPort(10742)
>>             .withCredentials("username", "password")
>>             .withLoadBalancingPolicy(DCAwareRoundRobinPolicy.builder().build())
>>             .withSocketOptions(new SocketOptions()
>>                     .setConnectTimeoutMillis(800)
>>                     .setKeepAlive(true))
>>             .build();
>>     try {
>>         mySession = myCluster.connect();
>>     }
>>     catch(Exception e) {
>>         e.printStackTrace();
>>     }
>>     System.out.println("Session Established");
>> }
>>
>>
>> private SSLOptions getSSLOption()
>> {
>>     InputStream trustStore = null;
>>     try
>>     {
>>         String theTrustStorePath = "/var/opt/SecureInterface/myTrustStore.jks";
>>         String theTrustStorePassword = "mypassword";
>>         List<String> theCipherSuites = new ArrayList<String>();
>>         theCipherSuites.add("TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384");
>>         // Load the trust store used to verify the server certificate.
>>         KeyStore ks = KeyStore.getInstance("JKS");
>>         trustStore = new FileInputStream(theTrustStorePath);
>>         ks.load(trustStore, theTrustStorePassword.toCharArray());
>>         TrustManagerFactory tmf = TrustManagerFactory.getInstance(
>>                 TrustManagerFactory.getDefaultAlgorithm());
>>         tmf.init(ks);
>>         SslContextBuilder builder = SslContextBuilder.forClient()
>>                 .sslProvider(SslProvider.OPENSSL)
>>                 .trustManager(tmf)
>>                 .ciphers(theCipherSuites)
>>                 // only if you use client authentication
>>                 .keyManager(new File("/var/opt/SecureInterface/keystore/Cass.crt"),
>>                         new File("/var/opt/vs/SecureInterface/keystore/Cass_enc.key"));
>>         SSLOptions sslOptions = new NettySSLOptions(builder.build());
>>         return sslOptions;
>>     }
>>     catch (Exception e)
>>     {
>>         e.printStackTrace();
>>     }
>>     finally
>>     {
>>         try
>>         {
>>             // trustStore is still null if opening the file failed.
>>             if (trustStore != null)
>>             {
>>                 trustStore.close();
>>             }
>>         }
>>         catch (IOException e)
>>         {
>>             e.printStackTrace();
>>         }
>>     }
>>     return null;
>> }
>>
>> The Cassandra server is running fine with client and server encryption
>> options. Moreover, I am able to run my client using JdkSSLOptions, but I
>> have a problem with NettySSLOptions.
>>
>> Has anyone implemented NettySSLOptions for a Cassandra client
>> application?
>>
>>
>> Regards,
>> Jahar Tyagi
>>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>


Re: how to avoid lightwieght transactions

2018-06-20 Thread Rajesh Kishore
Hi,

I think the LWT feature was introduced precisely for use cases like yours,
where you don't want other requests to update the same data at the same time.
It relies on the Paxos consensus protocol.
So, IMO, using LWT to avoid concurrent updates makes perfect sense for your
use case.

If concurrent updates are not your issue, then IMHO you may want to
split this into two steps:

- Read the transcation_type at QUORUM (or a higher consistency level)
- Then conditionally update the row at QUORUM (or a higher consistency level)

But remember, this won't be atomic and won't solve the concurrent
update issue if you have one.
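
To illustrate why the two-step approach is not atomic, here is a toy in-memory model. It is only a sketch: the Row class is hypothetical and stands in for a single Cassandra row, it is not a driver API, and the interleaving of the two clients is written out by hand rather than produced by real concurrency.

```python
class Row:
    """Toy stand-in for a single Cassandra row; not a driver API."""

    def __init__(self, status):
        self.status = status

    def read(self):
        return self.status

    def write(self, new_status):
        # Plain write: the last writer wins.
        self.status = new_status

    def compare_and_set(self, expected, new_status):
        # Models UPDATE ... IF status = expected: check and write are one step.
        if self.status != expected:
            return False
        self.status = new_status
        return True


# Read-then-write: both clients read "pending" before either one writes.
row = Row("pending")
seen_a, seen_b = row.read(), row.read()
if seen_a == "pending":
    row.write("approved")      # client A
if seen_b == "pending":
    row.write("rejected")      # client B silently overwrites A
read_then_write_result = row.status
print("read-then-write:", read_then_write_result)

# Conditional update: the second attempt is refused instead of overwriting.
row = Row("pending")
applied_a = row.compare_and_set("pending", "approved")
applied_b = row.compare_and_set("pending", "rejected")
print("conditional:", row.status, applied_a, applied_b)
```

In the first scenario client A's update is lost without either client noticing; in the second, client B is told that its condition no longer holds. That guarantee is what the extra round trips of a lightweight transaction pay for.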


Regards,
Rajesh




On Wed, Jun 20, 2018 at 2:59 AM, manuj singh  wrote:

> Hi all,
> We have a use case where we need to update our rows frequently. To do that
> without overwriting other updates, we have to resort to lightweight
> transactions.
> Since lightweight transactions are expensive (possibly 4 times as expensive
> as a normal insert), how do we model around them?
>
> E.g. I have a table:
>
> CREATE TABLE multirow (
>     id text,
>     time text,
>     transcation_type text,
>     status text,
>     PRIMARY KEY (id, time)
> )
>
> Let's say we update the status column multiple times. The first time we
> update, we also have to make sure that the transaction exists; otherwise a
> normal update will insert it, and then the original insert comes in and
> overwrites the update.
>
> To fix that, we need to use lightweight transactions.
>
> Is there another way I can model this so that we can avoid lightweight
> transactions?
>
>
>
> Thanks
>
>
>