token distribution in multi-dc

2017-05-01 Thread vasu gunja
Hi, I have a question regarding token distribution in a multi-DC setup. We have a multi-DC (DC1+DC2) setup with vnodes enabled. How will token ranges be distributed in the cluster? Does the cluster as a whole have one complete token range, or does each DC have a complete token range?

Seed nodes as part of cluster

2017-05-01 Thread Roman Naumenko
Hi, I’d like to confirm that seed nodes don’t contain any data. Is that correct? Can the instances for seed nodes be smaller than those for data nodes? Thank you, Roman

Service discovery in the Cassandra cluster

2017-05-01 Thread Roman Naumenko
If I understand how Cassandra nodes work, they must contain a list of seed IP addresses in the config file. This requirement makes cluster setup unnecessarily complicated. Is it possible to use a DNS name for seed nodes? Thanks, Roman

Re: Seed nodes as part of cluster

2017-05-01 Thread vasu gunja
Seed nodes contain metadata as well as actual data. On Mon, May 1, 2017 at 3:34 PM, Roman Naumenko wrote: > Hi, > > I’d like to confirm that seed nodes doesn’t contain any data. Is it > correct? > > Can the instances for seed nodes be smaller size than for data nodes? > > Thank you > Roman > -

Re: Seed nodes as part of cluster

2017-05-01 Thread Roman Naumenko
So they are like any other “data” node… but special? I’m so freaking confused by this seed nodes design. — Roman > On May 1, 2017, at 1:37 PM, vasu gunja wrote: > > Seed will contain meta data + actual data too > > On Mon, May 1, 2017 at 3:34 PM, Roman Naumenko >

Re: Seed nodes as part of cluster

2017-05-01 Thread daemeon reiydelle
Caps below for emphasis, not shouting ;{) Seed nodes are IDENTICAL to all other hdfs nodes or you will wish otherwise. Folks get confused because of terminology. I refer to this stuff as "the seed node service of a normal hdfs node". ANY HDFS NODE IS ABLE TO ACT AS A SEED NODE BY DEFINITION.

Re: Seed nodes as part of cluster

2017-05-01 Thread Roman Naumenko
Awesome, thanks for the clarification. So why can’t new nodes connect to ANY seed node's IP that is returned by DNS? Why must the IPs be “hardcoded”? Roman > On May 1, 2017, at 2:11 PM, daemeon reiydelle wrote: > > Caps below for emphasis, not shouting ;{) > > Seed nodes are IDENTICAL to all ot

Re: Service discovery in the Cassandra cluster

2017-05-01 Thread Jon Haddad
Sure, you could use DNS. Where does it say IP addresses are a requirement? > On May 1, 2017, at 1:36 PM, Roman Naumenko wrote: > > If I understand how Cassandra nodes work, they must contain a list of seed’s > IP addressed in config file. > > This requirement makes cluster setup unnecessarily

Re: Service discovery in the Cassandra cluster

2017-05-01 Thread daemeon reiydelle
Yes, you can use host names. That merely adds another level of configuration. When using terraform, I often use node names like and just use those. They are only routable within the region/VPC but are in fact already in dns. You do have to watch out as if you change the seeds (in tf) or the cluste
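
For readers who want to check what a host-name seed list resolves to before putting it in the config, here is a minimal Python sketch. It is not Cassandra code and makes no claim about how Cassandra itself resolves seeds. The function name `resolve_seeds`, its `lookup` parameter, and the `*.example.internal` host names are hypothetical, chosen only for illustration.

```python
import socket
from typing import Callable, Iterable, List


def resolve_seeds(
    seed_names: Iterable[str],
    # Default lookup uses the system resolver; gethostbyname_ex returns
    # (hostname, aliases, ipv4_addresses), so index 2 is the address list.
    lookup: Callable[[str], List[str]] = lambda name: socket.gethostbyname_ex(name)[2],
) -> List[str]:
    """Resolve each seed host name to its IPv4 addresses.

    Order is preserved and duplicate addresses are dropped.
    """
    addresses: List[str] = []
    for name in seed_names:
        for ip in lookup(name):
            if ip not in addresses:
                addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # Hypothetical DNS data, injected so the example runs without a network.
    fake_dns = {
        "seed-a.example.internal": ["10.0.0.1", "10.0.0.2"],
        "seed-b.example.internal": ["10.0.0.2", "10.0.0.3"],
    }
    print(resolve_seeds(fake_dns.keys(), lookup=fake_dns.__getitem__))
```

Passing the resolver in as a parameter keeps the sketch testable offline; leaving `lookup` at its default uses the machine's normal DNS configuration.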

Re: Service discovery in the Cassandra cluster

2017-05-01 Thread Roman Naumenko
The docs mention IP addresses everywhere. http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/operations/ops_replace_seed_node.html Promote an existing node to a seed node by adding its

Re: Service discovery in the Cassandra cluster

2017-05-01 Thread Jon Haddad
The in-tree docs do not mention this anywhere, and even answer some of the questions you’re asking: https://cassandra.apache.org/doc/latest/faq/index.html?highlight=seed#what-are-seeds The DataStax docs are main

Re: Service discovery in the Cassandra cluster

2017-05-01 Thread Roman Naumenko
Well, I guess I have to figure out what’s up with IPs/hostnames by experiment. Information about service discovery is practically absent. Not to mention all important details about fqdns/hostnames, automatic replacing seed nodes or what not. — Roman > On May 1, 2017, at 4:14 PM, Jon Haddad wro

Re: Service discovery in the Cassandra cluster

2017-05-01 Thread Jon Haddad
Why do you have to figure out what’s up w/ them by accident? You’ve gotten all the information you need. Seeds are used to get the initial state of the cluster and as an optimization to spread gossip faster. That’s it. > On May 1, 2017, at 4:37 PM, Roman Naumenko wrote: > > Well, I gues

Re: Service discovery in the Cassandra cluster

2017-05-01 Thread Roman Naumenko
Lol yeah, why I guess I run some ec2 instances, drop some cassandra deb packages on 'em - the thing will figure out how to run... Also, how would you get "initial state of the cluster" if the cluster... is being initialized? Or that's easy, according to the docs - just hardcode some seed IPs into

Re: token distribution in multi-dc

2017-05-01 Thread Justin Cameron
Hi Vasu, Each DC has a complete token range. Cheers, Justin On Tue, 2 May 2017 at 06:32 vasu gunja wrote: > Hi , > > I have a question regarding token distribution in muti-dc setup. > > We are having multi-dc (DC1+DC2) setup with V-nodes enabled. > How token ranges will be distributed in clust
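
One way to picture the answer above is the following simplified Python sketch, which assumes replicas are chosen per DC in the general style of NetworkTopologyStrategy. It ignores racks and is not Cassandra's actual implementation. The node names, tokens, and the function `replicas_per_dc` are made up for illustration. All vnodes sit on one ring, and for any key token the walk around the ring picks replicas separately in each DC, so every DC ends up covering every token.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

# Each ring entry is (token, node name, datacenter); one node may own many vnode tokens.
Ring = List[Tuple[int, str, str]]


def replicas_per_dc(ring: Ring, key_token: int, rf: Dict[str, int]) -> Dict[str, List[str]]:
    """Walk the ring clockwise from the key's token and pick rf[dc] distinct nodes per DC."""
    ordered = sorted(ring)
    tokens = [token for token, _, _ in ordered]
    # The first vnode whose token is >= key_token owns the key; wrap around at the end.
    start = bisect_left(tokens, key_token) % len(ordered)
    chosen: Dict[str, List[str]] = {dc: [] for dc in rf}
    for step in range(len(ordered)):
        _, node, dc = ordered[(start + step) % len(ordered)]
        if dc in chosen and node not in chosen[dc] and len(chosen[dc]) < rf[dc]:
            chosen[dc].append(node)
    return chosen


if __name__ == "__main__":
    # Hypothetical two-DC cluster with a handful of vnode tokens.
    ring: Ring = [
        (0, "dc1-a", "DC1"), (10, "dc2-a", "DC2"), (20, "dc1-b", "DC1"),
        (30, "dc2-b", "DC2"), (40, "dc1-a", "DC1"), (50, "dc2-a", "DC2"),
    ]
    print(replicas_per_dc(ring, 15, {"DC1": 1, "DC2": 1}))
```

Whatever token is looked up, each DC in `rf` receives its full replica count, which is the sense in which each DC holds the complete range.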

Migrating a cluster

2017-05-01 Thread Voytek Jarnot
Have a scenario where it's necessary to migrate a cluster to a different set of hardware with minimal downtime. Setup is: Current cluster: 4 nodes, RF 3 New cluster: 6 nodes, RF 3 My initial inclination is to follow this writeup on setting up the 6 new nodes as a new DC: https://docs.datastax.com

Re: Migrating a cluster

2017-05-01 Thread Justin Cameron
Yes - this is the recommended way to migrate to another DC. Before you start the migration you'll need to ensure 1. that the replication strategy of all your keyspaces is NetworkTopologyStrategy (if not, change it to this using ALTER KEYSPACE), and 2. that each of your clients is using the DcAware

Re: what is MemtableReclaimMemory mean ??

2017-05-01 Thread Pranay akula
Hi Alain, when *MemtableReclaimMemory* pending tasks increase, reads and writes (mostly writes) slowly back up. Yes, I am seeing somewhat high GC pressure; we are currently using a 24 GB heap with G1GC. I tried changing the memtable flush threshold; it helped a little, but not much. I a

Re: Migrating a cluster

2017-05-01 Thread Bhuvan Rawal
+1 to Justin's answer! As an additional step it's always good to run a full repair before deleting data on existing nodes, as there is a possibility of ioexceptions during rebuild. (Things like https://issues.apache.org/jira/browse/CASSANDRA-12830) Also if you are on 3.8+ , you may go for CDC app

Re: what is MemtableReclaimMemory mean ??

2017-05-01 Thread Chris Lohfink
There's a read barrier to stop reclaiming a memtable when there are requests actively reading it. The *MemtableReclaimMemory* pool offloads that wait instead of blocking the caller. In itself it is not going to use any CPU or increase load. It will however block the releasing of the memtable resourc

Re: what is MemtableReclaimMemory mean ??

2017-05-01 Thread Chris Lohfink
Question though, how many tables do you have? If you have more than a few hundred, it could be bottlenecking the flushing if it is flushing very frequently. On Mon, May 1, 2017 at 9:32 PM, Chris Lohfink wrote: > Theres a read barrier to stop reclaiming a memtable when there are > requests active

Weird Bootstrapping Issue

2017-05-01 Thread Gareth Collins
Hi, We are running Cassandra 2.1.14 on an IBM AIX cluster using IBM Java 7 (1.7.1.64). I am having problems adding new nodes to the cluster. I am seeing the following exception. It appears like the new node is getting stuck trying to send the magic number on the first streaming socket...whilst the

Re: [Cassandra] nodetool compactionstats not showing pending task.

2017-05-01 Thread kurt greaves
I believe this is a bug with the estimation of tasks; however, I'm not aware of any JIRA that covers the issue. On 28 April 2017 at 06:19, Abhishek Kumar Maheshwari < abhishek.maheshw...@timesinternet.in> wrote: > Hi , > > > > I will try with JMX but I try with tpstats. In tpstats its showing pending