Re: Reg:- Multi DC Configuration

2017-06-08 Thread Justin Cameron
Hi Nandan,

Take a look at the GossipingPropertyFileSnitch:
http://cassandra.apache.org/doc/latest/operating/snitch.html#snitch-classes

You'll also need to configure the cassandra-rackdc.properties file on each
node:
https://github.com/apache/cassandra/blob/trunk/conf/cassandra-rackdc.properties
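For example, on each node in your India DC you would set something like the
following (a sketch only; rack names depend on your actual topology, and the
USA nodes would use dc=USA instead):

# cassandra-rackdc.properties (node in DC1)
dc=India
rack=rack1

# cassandra.yaml
endpoint_snitch: GossipingPropertyFileSnitch

Keyspaces that should span both DCs would then use NetworkTopologyStrategy
with replication settings for both 'India' and 'USA'.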

Cheers,
Justin

On Wed, 7 Jun 2017 at 12:40 @Nandan@  wrote:

> Hi ,
>
> I am trying to set up Cassandra 3.9 across multiple DCs.
> Currently, I am having 2 DCs with 3 and 2 nodes respectively.
>
> DC1 Name :- India
> Nodes :- 192.16.0.1 , 192.16.0.2, 192.16.0.3
> DC2 Name :- USA
> Nodes :- 172.16.0.1 , 172.16.0.2
>
> Please help me understand which files I need to change to configure
> multi-DC successfully.
>
> I am using Ubuntu 16.04 Operating System.
>
> Thanks and Best Regards,
> Nandan Priyadarshi
>
-- 


Justin Cameron
Senior Software Engineer





This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
and Instaclustr Inc (USA).

This email and any attachments may contain confidential and legally
privileged information.  If you are not the intended recipient, do not copy
or disclose its content, but please reply to this email immediately and
highlight the error to the sender and then immediately delete the message.


Cassandra & Spark

2017-06-08 Thread 한 승호
Hello,

I am Seung-ho and I work as a Data Engineer in Korea. I need some advice.

My company is considering replacing an RDBMS-based system with Cassandra and
Hadoop.
The purpose of this system is to analyze Cassandra and HDFS data with Spark.

It seems many use cases put emphasis on data locality; for instance, both
Cassandra and the Spark executor should be on the same node.

The thing is, my company's data analyst team wants to analyze heterogeneous
data sources, Cassandra and HDFS, using Spark.
So, I wonder what the best practices would be for using Cassandra and Hadoop
in such a case.

Plan A: Both HDFS and Cassandra with NodeManager (Spark executor) on the same
node

Plan B: Cassandra + NodeManager / HDFS + NodeManager on each node separately,
but in the same cluster


Which would be better or more correct, or is there a better way?

I appreciate your advice in advance :)

Best Regards,
Seung-Ho Han


Sent from Mail for Windows 10



Huge Batches

2017-06-08 Thread techpyaasa .
Hi ,

Recently we have been seeing huge batches, with log prints like the one below
in the C* logs:

Batch of prepared statements for [ks1.cf1] is of size 413350, exceeding
specified threshold of 5120 by 362150

Along with the column family name (as found in the above log print), we would
like to know the partition key and clustering column values (along with their
names) too, so that it would be easy to trace the user who is
inserting such huge batches.

I tried to look at the C* code base, as below, but could not figure out how
to get the values of the partition keys and clustering columns. :(
Can someone please help me out...

    public static void verifyBatchSize(Iterable<ColumnFamily> cfs)
    {
        long size = 0;
        long warnThreshold = DatabaseDescriptor.getBatchSizeWarnThreshold();

        for (ColumnFamily cf : cfs)
            size += cf.dataSize();

        if (size > warnThreshold)
        {
            Set<String> ksCfPairs = new HashSet<>();
            for (ColumnFamily cf : cfs)
            {
                ksCfPairs.add(String.format("%s.%s size=%s",
                        cf.metadata().ksName, cf.metadata().cfName, cf.dataSize()));
                Iterator<CellName> cns = cf.getColumnNames().iterator();
                CellName cn = cns.next();
                cn.dataSize();
            }

            String format = "Batch of prepared statements for {} is of size {}, exceeding specified threshold of {} by {}.";
            logger.warn(format, ksCfPairs, size, warnThreshold, size - warnThreshold);
        }
    }


Thanks
TechPyaasa


Re: Cassandra & Spark

2017-06-08 Thread Kant Kodali
If you use containers like Docker, Plan A can work, provided you do the
resource and capacity planning. I tend to think that Plan B is more standard
and easier, although you may want to wait to hear from others for a second
opinion.

Caution: data locality only makes sense if the disk throughput is
significantly higher than the network throughput (not all setups have that
profile).
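
Either way, a single Spark job can read from both sources. Below is a minimal
sketch using the spark-cassandra-connector Java API (the host, keyspace,
table and HDFS path are all made up):

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

import com.datastax.spark.connector.japi.CassandraRow;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class MixedSourceJob
{
    public static void main(String[] args)
    {
        SparkConf conf = new SparkConf()
                .setAppName("cassandra-plus-hdfs")
                .set("spark.cassandra.connection.host", "10.0.0.1");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // One RDD scanned from a Cassandra table...
        JavaRDD<CassandraRow> cassandraRows =
                javaFunctions(sc).cassandraTable("my_ks", "my_table");

        // ...and one read from HDFS; the two can then be joined/unioned
        // in the same job, wherever the executors happen to run.
        JavaRDD<String> hdfsLines = sc.textFile("hdfs:///data/events/part-*");

        System.out.println(cassandraRows.count() + " / " + hdfsLines.count());
        sc.stop();
    }
}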




Re: Cassandra & Spark

2017-06-08 Thread Tobias Eriksson
Hi
Something to consider before moving to Apache Spark and Cassandra.
I have a background where we have tons of data in Cassandra, and we wanted to
use Apache Spark to run various jobs.
We loved what we could do with Spark, BUT…
We realized soon that we wanted to run multiple jobs in parallel.
Some jobs would take 30 minutes and some 45 seconds.
Spark is by default arranged so that it will take up all the resources there
are; this can be tweaked by using Mesos or YARN.
But even with Mesos and YARN we found it complicated to run multiple jobs in
parallel.
So eventually we ended up throwing out Spark.
Instead we transferred the data to Apache Kudu, and then we ran our analysis
on Kudu, and what a difference!
“my two cents!”
-Tobias



From: 한 승호 
Date: Thursday, 8 June 2017 at 10:25
To: "user@cassandra.apache.org" 
Subject: Cassandra & Spark

Hello,

I am Seung-ho and I work as a Data Engineer in Korea. I need some advice.

My company recently consider replacing RDMBS-based system with Cassandra and 
Hadoop.
The purpose of this system is to analyze Cadssandra and HDFS data with Spark.

It seems many user cases put emphasis on data locality, for instance, both 
Cassandra and Spark executor should be on the same node.

The thing is, my company's data analyst team wants to analyze heterogeneous 
data source, Cassandra and HDFS, using Spark.
So, I wonder what would be the best practices of using Cassandra and Hadoop in 
such case.

Plan A: Both HDFS and Cassandra with NodeManager(Spark Executor) on the same 
node

Plan B: Cassandra + Node Manager / HDFS + NodeManager in each node separately 
but the same cluster


Which would be better or correct, or would be a better way?

I appreciate your advice in advance :)

Best Regards,
Seung-Ho Han


Windows 10용 메일에서 보냄



Re: Cassandra & Spark

2017-06-08 Thread DuyHai Doan
Interesting

Tobias, when you said "Instead we transferred the data to Apache Kudu", did
you transfer all Cassandra data into Kudu in a single migration and then tap
into Kudu for aggregation, or did you run a data import every day/week/month
from Cassandra into Kudu?

From my point of view, the difficulty is not to have a static set of data
and run aggregation on it; there are a lot of alternatives out there. The
difficulty is to be able to run analytics on a live/production/changing
dataset, with all the data movement & updates that implies.

Regards



Re: Cassandra & Spark

2017-06-08 Thread Tobias Eriksson
Hi
What I wanted was a dashboard with graphs/diagrams, and it should not take
minutes for the page to load.
Thus, it was a problem that with Spark on Cassandra we could not push the
parallelization far enough to have the diagrams rendered in seconds.
Now with Kudu we get some decent results rendering the diagrams/graphs.

The way we transfer data from Cassandra (which is the production system
storage) to Kudu is through an Apache Kafka topic (or many topics, actually),
and then we have an application which ingests the data into Kudu:

Other Systems --> Domain Storage App(s) --> Cassandra --> Kafka -->
Kudu Ingestion App --> Kudu <-- Dashboard App(s)

If you want to play with really fast analytics then perhaps consider looking
at Apache Ignite
https://ignite.apache.org
which then acts as a layer between Cassandra and your applications storing
into Cassandra (an in-memory data grid, I think it is called).
Basically, think of it as a big cache.
It is an in-memory thingi ☺
And then you can run some super fast queries.

-Tobias





RE: Convert single node C* to cluster (rebalancing problem)

2017-06-08 Thread ZAIDI, ASAD A
Did you make sure auto_bootstrap property is indeed set to [true] when you 
added the node?

From: Junaid Nasir [mailto:jna...@an10.io]
Sent: Monday, June 05, 2017 6:29 AM
To: Akhil Mehra 
Cc: Vladimir Yudovin ; user@cassandra.apache.org
Subject: Re: Convert single node C* to cluster (rebalancing problem)

Not evenly. I have set up a new cluster with a subset of the data (around
5 GB); using the configuration above I am getting these results:

Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.128.2.1   4.86 GiB    256     44.9%             e4427611-c247-42ee-9404-371e177f5f17  rack1
UN  10.128.2.10  725.03 MiB  256     55.1%             690d5620-99d3-4ae3-aebe-8f33af54a08b  rack1

Is there anything else I can tweak/check to make the distribution even?

On Sat, Jun 3, 2017 at 3:30 AM, Akhil Mehra <akhilme...@gmail.com> wrote:
So now the data is evenly balanced in both nodes?

Refer to the following documentation to get a better understanding of
rpc_address and broadcast_rpc_address:
https://www.instaclustr.com/demystifying-cassandras-broadcast_address/
I am surprised that your node started up with rpc_broadcast_address set, as
this is an unsupported property. I am assuming you are using Cassandra
version 3.10.
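
For reference, the supported pattern if you want to listen on all interfaces
looks like this in cassandra.yaml (a sketch; the IP is the one from your
earlier message):

rpc_address: 0.0.0.0
broadcast_rpc_address: 10.128.1.11

When rpc_address is 0.0.0.0, Cassandra requires broadcast_rpc_address to be
set so that it knows which address to advertise to clients.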


Regards,
Akhil

On 2/06/2017, at 11:06 PM, Junaid Nasir <jna...@an10.io> wrote:

I was able to get it working. I added a new node with the following changes:

#rpc_address: 0.0.0.0
rpc_address: 10.128.1.11
#rpc_broadcast_address: 10.128.1.11

rpc_address was previously set to 0.0.0.0 (I ran into a problem with remote
connections earlier and made these changes:
https://stackoverflow.com/questions/12236898/apache-cassandra-remote-access)

Should this be happening?

On Thu, Jun 1, 2017 at 6:31 PM, Vladimir Yudovin <vla...@winguzone.com> wrote:
Did you run "nodetool cleanup" on the first node after the second was
bootstrapped? It should clean up rows not belonging to the node after the
tokens changed.

Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting


On Wed, 31 May 2017 03:55:54 -0400 Junaid Nasir <jna...@an10.io> wrote:

Cassandra ensures that adding or removing nodes is very easy and that load is
balanced between nodes when a change is made, but it's not working in my case.
I have a single-node C* deployment (with 270 GB of data) and want to balance
the data across multiple nodes. I followed this guide.
`nodetool status` shows 2 nodes, but the load is not balanced between them:

Datacenter: dc1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.128.0.7   270.75 GiB  256     48.6%             1a3f6faa-4376-45a8-9c20-11480ae5664c  rack1
UN  10.128.0.14  414.36 KiB  256     51.4%             66a89fbf-08ba-4b5d-9f10-55d52a199b41  rack1
I also ran 'nodetool repair' on the new node but the result is the same. Any
pointers would be appreciated :)

conf file of new node

cluster_name: 'cluster1'

 - seeds: "10.128.0.7"
num_tokens: 256

endpoint_snitch: GossipingPropertyFileSnitch
Thanks,
Junaid






RE: Local_serial >> Adding nodes

2017-06-08 Thread ZAIDI, ASAD A
Please share the exact timeout error message text, to give an idea of what
type of timeout you're facing.


From: Nitan Kainth [mailto:ni...@bamlabs.com]
Sent: Wednesday, June 07, 2017 7:24 AM
To: vasu gunja 
Cc: user@cassandra.apache.org
Subject: Re: Local_serial >> Adding nodes

What is in system log?
Does it show new tokens allocated?
nodetool netstats

On Jun 7, 2017, at 6:50 AM, vasu gunja <vasu.no...@gmail.com> wrote:

Yeah all of them are in UJ state..

Thanks,
Vasu

On Jun 7, 2017, at 6:32 AM, Nitan Kainth <ni...@bamlabs.com> wrote:
Is the streaming still going on?
Are the timeouts on reads or writes?

Sent from my iPhone

On Jun 6, 2017, at 9:50 PM, vasu gunja <vasu.no...@gmail.com> wrote:
Hi All,

We have a 2-DC setup, each DC consisting of 20-odd nodes, and recently we
decided to add 6 more nodes to DC1. We are using LWTs; the application
drivers are configured to use LOCAL_SERIAL.
As we are adding multiple nodes at a time, we used the option
"-Dcassandra.consistent.rangemovement=false" and added all the nodes with a
gap of 10 minutes each.

We are facing a lot of timeouts: more than 30k transactions over an 8-hour
period. Has anyone run into the same issue? Are we doing something wrong?



Thanks,
vasu




RE: Data in multi disks is not evenly distributed

2017-06-08 Thread ZAIDI, ASAD A
Check the load with the nodetool status command. Make sure there isn't a huge
number of pending compactions for your tables. Ideally speaking, data
distribution should be even across your nodes.

You should have reserved an extra 15% of free space relative to the maximum
size of your largest table (i.e. the candidate for compaction) so compaction
can proceed comfortably. Some documents suggest you should reserve 50% of
free space for the worst-case scenario, though I think that is a bit
aggressive.




From: Xihui He [mailto:xihu...@gmail.com]
Sent: Wednesday, June 07, 2017 5:16 AM
To: user@cassandra.apache.org
Subject: Data in multi disks is not evenly distributed

Dear All,

We are using multiple disks per node and find that the data is not evenly
distributed (data01 uses 1.1T, but data02 uses 353G). Is this expected? If
data01 becomes full, would the node still be writable? We are using 2.2.6.

Thanks,
Xihui

data_file_directories:
- /data00/cassandra
- /data01/cassandra
- /data02/cassandra
- /data03/cassandra
- /data04/cassandra

df
/dev/sde1   1.8T  544G  1.2T  32% /data03
/dev/sdc1   1.8T  1.1T  683G  61% /data01
/dev/sdf1   1.8T  491G  1.3T  29% /data04
/dev/sdd1   1.8T  353G  1.4T  21% /data02
/dev/sdb1   1.8T  285G  1.5T  17% /data00

root@n9-016-015:~# du -sh /data01/cassandra/album_media_feature/*
143M    /data01/cassandra/album_media_feature/media_feature_blur-066e5700c41511e5beacf197ae340934
4.4G    /data01/cassandra/album_media_feature/media_feature_c1-dbadf930c41411e5974743d3a691d887
56K     /data01/cassandra/album_media_feature/media_feature_duplicate-09d4b380c41511e58501e9aa37be91a5
16K     /data01/cassandra/album_media_feature/media_feature_emotion-b8570470054d11e69fb88f073bab8267
240M    /data01/cassandra/album_media_feature/media_feature_exposure-f55449c0c41411e58f5c9b66773b60c3
649M    /data01/cassandra/album_media_feature/media_feature_group-f8de0cc0c41411e5827b995f709095c8
22G     /data01/cassandra/album_media_feature/media_feature_multi_class-cf3bb72006c511e69fb88f073bab8267
44K     /data01/cassandra/album_media_feature/media_feature_pool5-1185b200c41511e5b7d8757e25e34d67
15G     /data01/cassandra/album_media_feature/media_feature_poster-fcf45850c41411e597bb1507d1856305
8.0K    /data01/cassandra/album_media_feature/media_feature_quality-155d9500c41511e5974743d3a691d887
17G     /data01/cassandra/album_media_feature/media_feature_quality_rc-51babf50dba811e59fb88f073bab8267
8.7G    /data01/cassandra/album_media_feature/media_feature_scene-008a5050c41511e59ebcc3582d286c8d
8.0K    /data01/cassandra/album_media_feature/media_region_features_v4-29a0cd10150611e6bd3e3f41faa2612a
971G    /data01/cassandra/album_media_feature/media_region_features_v5-1b805470a3d711e68121757e9ac51b7b

root@n9-016-015:~# du -sh /data02/cassandra/album_media_feature/*
1.6G    /data02/cassandra/album_media_feature/media_feature_blur-066e5700c41511e5beacf197ae340934
44G     /data02/cassandra/album_media_feature/media_feature_c1-dbadf930c41411e5974743d3a691d887
64K     /data02/cassandra/album_media_feature/media_feature_duplicate-09d4b380c41511e58501e9aa37be91a5
75G     /data02/cassandra/album_media_feature/media_feature_emotion-b8570470054d11e69fb88f073bab8267
2.0G    /data02/cassandra/album_media_feature/media_feature_exposure-f55449c0c41411e58f5c9b66773b60c3
21G     /data02/cassandra/album_media_feature/media_feature_group-f8de0cc0c41411e5827b995f709095c8
336M    /data02/cassandra/album_media_feature/media_feature_multi_class-cf3bb72006c511e69fb88f073bab8267
44K     /data02/cassandra/album_media_feature/media_feature_pool5-1185b200c41511e5b7d8757e25e34d67
2.0G    /data02/cassandra/album_media_feature/media_feature_poster-fcf45850c41411e597bb1507d1856305
8.0K    /data02/cassandra/album_media_feature/media_feature_quality-155d9500c41511e5974743d3a691d887
17G     /data02/cassandra/album_media_feature/media_feature_quality_rc-51babf50dba811e59fb88f073bab8267
141M    /data02/cassandra/album_media_feature/media_feature_scene-008a5050c41511e59ebcc3582d286c8d
8.0K    /data02/cassandra/album_media_feature/media_region_features_v4-29a0cd10150611e6bd3e3f41faa2612a
93G     /data02/cassandra/album_media_feature/media_region_features_v5-1b805470a3d711e68121757e9ac51b7b

root@n9-016-015:~# du -sh /data03/cassandra/album_media_feature/*
4.3G    /data03/cassandra/album_media_feature/media_feature_blur-066e5700c41511e5beacf197ae340934
19G     /data03/cassandra/album_media_feature/media_feature_c1-dbadf930c41411e5974743d3a691d887
72K     /data03/cassandra/album_media_feature/media_feature_duplicate-09d4b380c41511e58501e9aa37be91a5
2.8G    /data03/cassandra/album_media_feature/media_feature_emotion-b8570470054d11e69fb88f073bab8267
105M    /data03/cassandra/album_media_feature/media_feature_exposure-f55449c0c41411e58f5c9b66773b60c3
15G     /data03/cassandra/album_media_feature/media_feature_group-f8de0cc0c41411e5827b995f709095c8
23G     /data03/cassandra/album_media_feature/media_feature_m

Re: Huge Batches

2017-06-08 Thread Justin Cameron
I don't believe the keys within a large batch are logged by Cassandra. A
large batch could potentially contain tens of thousands of primary keys, so
this could quickly fill up the logs.

Here are a couple of suggestions:

   - Large batches should also be slow, so you could try setting up slow
   query logging in the Java driver and see what gets caught (see the sketch
   after this list):
   https://docs.datastax.com/en/developer/java-driver/3.2/manual/logging/
   - You could write your own custom QueryHandler to log those details on
   the server-side, as described here:
   
https://www.slideshare.net/planetcassandra/cassandra-summit-2014-lesser-known-features-of-cassandra-21
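
For the first suggestion, a minimal sketch of the slow-query logger with the
3.x Java driver (the contact point and the 500 ms threshold are made up):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.QueryLogger;

public class SlowQueryLogging
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

        // Log every statement slower than 500 ms. Matches are logged at
        // DEBUG level under the logger com.datastax.driver.core.QueryLogger.SLOW,
        // so enable that logger in your SLF4J backend to see them.
        QueryLogger queryLogger = QueryLogger.builder()
                .withConstantThreshold(500)
                .build();
        cluster.register(queryLogger);
    }
}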


Cheers,
Justin

-- 


Justin Cameron
Senior Software Engineer







Reg:- Data Modelling For Hierarchy Data

2017-06-08 Thread @Nandan@
Hi,

I am working on a Music database where we have multiple types of users on our
portal. Different categories of users have some common attributes but
some different attributes, based on their registration.
This forms a hierarchy pattern. I am attaching one sample hierarchy
pattern of the User module, which is part of my current data modeling.

There are a few conditions:
1) Email id should be unique, i.e. if a user has registered with one email
id then that particular user can't register as another user.
2) Some types of users have 20-30 columns in their registration, such
as company, address, email, first_name, join_date etc.

The query pattern is:
1) select user by email

Please suggest how to do the data modeling for this type of hierarchical
data. Should I create a separate table for each type of user, or should I
go with a single user table?
As we have the unique email id condition, should I go with email id as the
primary key, or would a user_id UUID be the better choice?
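
To make the email-as-primary-key option concrete, here is a sketch with the
Java driver (the keyspace, table and column names are made up, and the
per-category attributes are shown as ordinary nullable columns):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class UserByEmail
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("music");  // hypothetical keyspace

        // email is the partition key, so "select user by email" is a
        // single-partition read.
        session.execute("CREATE TABLE IF NOT EXISTS user_by_email ("
                + "email text PRIMARY KEY, user_id uuid, user_type text, "
                + "first_name text, company text, address text, join_date timestamp)");

        // IF NOT EXISTS (a lightweight transaction) enforces the
        // one-account-per-email rule at insert time.
        session.execute("INSERT INTO user_by_email (email, user_id, user_type) "
                + "VALUES (?, uuid(), 'artist') IF NOT EXISTS", "user1@example.com");

        Row row = session.execute(
                "SELECT user_id, user_type FROM user_by_email WHERE email = ?",
                "user1@example.com").one();
        if (row != null)
            System.out.println(row.getUUID("user_id") + " " + row.getString("user_type"));

        cluster.close();
    }
}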



Best regards,
Nandan Priyadarshi


Definition of QUORUM consistency level

2017-06-08 Thread Dikang Gu
Hello there,

We have some use cases doing consistent read/write requests, and we
have 4 replicas in that cluster, according to our setup.

What's interesting to me is that both read and write quorum requests
are blocked for 4/2+1 = 3 replicas, so we are accessing 3 (for the write)
+ 3 (for the read) = 6 replicas per quorum read/write pair, which is 2
replicas more than the 4 we have.

I think it's not necessary to have 2 overlapping nodes in the even
replication factor case.

I suggest changing the `quorumFor(keyspace)` code to separate the cases for
read and write requests, so that we can reduce by one the replicas requested
in the read path.

Any concerns?

Thanks!


-- 
Dikang


Re: Definition of QUORUM consistency level

2017-06-08 Thread Justin Cameron
2/4 for write and 2/4 for read would not be sufficient to achieve strong
consistency, as there is no overlap.

In your particular case you could potentially use QUORUM for write and TWO
for read (or vice-versa) and still achieve strong consistency. If you add
additional nodes in the future this would obviously no longer work. Also
the benefit of this is dubious, since 3/4 nodes still need to be accessible
to perform writes. I'd also guess that it's unlikely to provide any
significant performance increase.
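
(Concretely, with replicas {A, B, C, D}: a write acknowledged by {A, B} and a
later read served by {C, D} never intersect, so the read can miss the write
entirely. QUORUM writes plus TWO reads give 3 + 2 = 5 > 4, so any two such
sets must share at least one node.)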

Justin



Justin Cameron
Senior Software Engineer







Re: Definition of QUORUM consistency level

2017-06-08 Thread Dikang Gu
Justin, what I suggest is that for the QUORUM consistency level, the blocking
for writes should stay at (num_replica/2)+1, the same as today, but for read
requests we only need to access (num_replica/2) nodes, which should still
provide strong consistency.
Dikang.




-- 
Dikang


Re: Definition of QUORUM consistency level

2017-06-08 Thread Jonathan Haddad
It would be a little weird to change the definition of QUORUM, which means
majority, to mean something other than majority for a single use case.
Sounds like you want to introduce a new CL, HALF.


Re: Definition of QUORUM consistency level

2017-06-08 Thread Dikang Gu
So, for quorum, what we really want is one overlapping node between the
write path and the read path. It was actually my assumption for a long time
that we need (N/2 + 1) for writes and just (N/2) for reads, because that's
enough to provide strong consistency.



-- 
Dikang


Re: Definition of QUORUM consistency level

2017-06-08 Thread Nate McCall
> So, for the quorum, what we really want is that there is one overlap among
> the nodes in write path and read path. It actually was my assumption for a
> long time that we need (N/2 + 1) for write and just need (N/2) for read,
> because it's enough to provide the strong consistency.
>

You are right about strong consistency with that calculation, but if I want
to issue a QUORUM read just by itself, I would expect a majority of nodes
to reply. How it was written might be immaterial to my use case of reading
'from a majority.'

-- 
-
Nate McCall
Wellington, NZ
@zznate

CTO
Apache Cassandra Consulting
http://www.thelastpickle.com




Re: Definition of QUORUM consistency level

2017-06-08 Thread Brandon Williams
We have CL.TWO.



Re: Definition of QUORUM consistency level

2017-06-08 Thread Nate McCall
> We have CL.TWO.
>
>
>
This was actually the original motivation for CL.TWO and CL.THREE if memory
serves:
https://issues.apache.org/jira/browse/CASSANDRA-2013


Re: Definition of QUORUM consistency level

2017-06-08 Thread Dikang Gu
To me, CL.TWO and CL.THREE are more like workarounds for the problem; for
example, they do not work if the number of replicas goes to 8, which is
possible in our environment (2 replicas in each of 4 DCs).

What people want from quorum is a strong consistency guarantee. As long as
R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2), W=(n/2+1); c)
R=(n/2+1), W=(n/2). What Cassandra is doing right now is option a), which
is the most expensive one.

I cannot think of a reason that people would want a quorum read not for
the strong consistency, but just to read from (n/2+1) nodes. If they want
strong consistency, then the read only needs (n/2) nodes; we are purely
wasting the one extra request, and it hurts read latency as well.
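
A concrete check with N = 4: option a) gives R + W = 3 + 3 = 6, while options
b) and c) give R + W = 2 + 3 = 5. All three satisfy R + W > N = 4, so every
read set overlaps every write set; a) simply contacts one more replica than
the minimum required.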

Thanks
Dikang.




-- 
Dikang


Re: Definition of QUORUM consistency level

2017-06-08 Thread Brandon Williams
I don't disagree with you there and have never liked TWO/THREE.  This is
somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338

I don't think going to CL.FOUR, etc, is a good long-term solution, but I'm
also not sure what is.




Re: Definition of QUORUM consistency level

2017-06-08 Thread Jeff Jirsa
Would love to see real pluggable consistency levels. Sorta sad it got
wont-fixed - may be time to revisit that, perhaps it's more feasible now.

https://issues.apache.org/jira/browse/CASSANDRA-8119 is also semi-related,
but a different approach (CL-as-UDF)

>


Re: Definition of QUORUM consistency level

2017-06-08 Thread Justin Cameron
Firstly, this situation only occurs if you need strong consistency and are
using an even replication factor (RF4, RF6, etc).
Secondly, either the read or the write still needs to be performed at a
minimum level of QUORUM. This means there are no extra availability benefits
from your proposal (i.e. a minimum of QUORUM replicas still need to be online
and available).

So the only potential benefit I can think of is a theoretical performance
boost. If you write with QUORUM, then you'll need to read with QUORUM-1/HALF
(e.g. RF4: write with QUORUM, read with TWO; RF6: write with QUORUM, read
with THREE; RF8: write with QUORUM, read with FOUR; ...). At most you'd only
reduce the number of replicas that the client needs to block on by 1.
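
Spelling those pairs out (replicas blocked on in parentheses; each pair sums
to RF + 1, the minimum needed for read/write overlap):

RF4: write QUORUM (3) + read TWO (2)   = 5 > 4
RF6: write QUORUM (4) + read THREE (3) = 7 > 6
RF8: write QUORUM (5) + read FOUR (4)  = 9 > 8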

I'd guess that the performance benefits that you'd gain will probably be
quite small - but I'd happily be proven wrong if you feel like running some
benchmarks :)

Cheers,
Justin

> --


Justin Cameron
Senior Software Engineer







Re: Definition of QUORUM consistency level

2017-06-08 Thread Jeff Jirsa
Short of actually making ConsistencyLevel pluggable or adding/changing one
of the existing levels, an alternative approach would be to divide up the
cluster into either real or pseudo-datacenters (with RF=2 in each DC), and
then write with QUORUM (which would be 3 nodes, across any combination of
datacenters), and read with LOCAL_QUORUM (which would be 2 nodes in the
datacenter of the coordinator). You don't have to have distinct physical
DCs for this, but you'd need tooling to guarantee an even number of
replicas in each virtual datacenter.

It's an ugly workaround, but it'd work.

Pluggable CL would be nicer, though.
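
To make the read/write side of that concrete, a sketch with the 3.x Java
driver (DC names, keyspace and table are made up; note the driver's default
DC-aware policy treats the first contact point's DC as "local"):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class VirtualDcQuorum
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
        Session session = cluster.connect();

        // Two real or pseudo DCs with RF=2 each -> 4 replicas in total.
        session.execute("CREATE KEYSPACE IF NOT EXISTS demo WITH replication = "
                + "{'class': 'NetworkTopologyStrategy', 'dc1': 2, 'dc2': 2}");
        session.execute("CREATE TABLE IF NOT EXISTS demo.t (id int PRIMARY KEY, v text)");

        // Write at QUORUM: 3 of the 4 replicas, across any combination of DCs.
        Statement write = new SimpleStatement("INSERT INTO demo.t (id, v) VALUES (1, 'x')")
                .setConsistencyLevel(ConsistencyLevel.QUORUM);
        session.execute(write);

        // Read at LOCAL_QUORUM: 2 replicas in the coordinator's DC.
        // 3 (write) + 2 (read) = 5 > 4 replicas, so the sets must overlap.
        Statement read = new SimpleStatement("SELECT v FROM demo.t WHERE id = 1")
                .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
        session.execute(read);

        cluster.close();
    }
}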

