Re: Authentication with Java driver

2017-02-08 Thread Yuji Ito
Thanks Ben. Do you mean lots of instances of the process, or lots of instances of the cluster/session object? Lots of instances of the process are generated. I wanted to confirm that `other` doesn't authenticate. If I want to avoid that, my application has to create new cluster/session objects

Extract big data to file

2017-02-08 Thread Cogumelos Maravilha
Hi list, My database stores data from Kafka, using C* 3.0.10. In my cluster I'm using: AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} The result of extracting one day of data uncompressed is around 360G. I've found these approaches: echo "SELECT kafka fr
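For a one-off dump, cqlsh's COPY TO is the usual first attempt (keyspace, table, and column names below are hypothetical), though as the replies note it tends to struggle at the 360G scale:

```sql
-- Hypothetical table; COPY TO streams query results to a CSV file from cqlsh.
-- PAGESIZE and PAGETIMEOUT can be tuned to avoid timeouts on large exports.
COPY my_ks.kafka_data (kafka, datetimestamp)
  TO '/tmp/one_day.csv'
  WITH HEADER = true AND PAGESIZE = 1000 AND PAGETIMEOUT = 60;
```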

FINAL REMINDER: CFP for ApacheCon closes February 11th

2017-02-08 Thread Rich Bowen
Dear Apache Enthusiast, This is your FINAL reminder that the Call for Papers (CFP) for ApacheCon Miami is closing this weekend - February 11th. This is your final opportunity to submit a talk for consideration at this event. This year, we are running several mini conferences in conjunction with t

Re: Questions about TWCS

2017-02-08 Thread Alain RODRIGUEZ
Hi John, I will try to answer those questions, relying on people around to correct me if I am wrong. It says to align partition keys to your TWCS windows. Is it generally the case that calendar/date-based partitions would align nicely with TWCS windows such that we would end up with on
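For reference, aligning partitions with TWCS windows can look like this (table name, columns, and window settings are illustrative): with one partition per sensor per day and a one-day compaction window, each window's data lands in a small, predictable set of SSTables:

```sql
-- Illustrative time-series table: one partition per sensor per day,
-- matching a 1-day TWCS compaction window.
CREATE TABLE metrics.readings (
    sensor_id text,
    day date,
    ts timestamp,
    value double,
    PRIMARY KEY ((sensor_id, day), ts)
) WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': 1
};
```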

Cluster scaling

2017-02-08 Thread Branislav Janosik -T (bjanosik - AAP3 INC at Cisco)
Hi all, I have a cluster of three nodes and would like to ask some questions about the performance. I wrote a small benchmarking tool in Java that mirrors the (read, write) operations that we do in the real project. The problem is that it is not scaling as it should. The program runs two tests: o

Re: Cluster scaling

2017-02-08 Thread Jan Kesten
Hi Branislav, what is it you would expect? Some thoughts: batches are often misunderstood; they work well only if they contain a single partition key - think of a batch of different sensor readings for one key. If you group batches with many partition keys and/or do large batches, this puts high l
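Jan's single-partition-key point can be sketched in CQL (table and keyspace names are hypothetical): every statement in the batch below targets the same partition, so the whole batch is applied on one replica set instead of fanning out through the coordinator:

```sql
-- Hypothetical table with PRIMARY KEY (sensor_id, ts).
-- Good: every statement shares the partition key sensor_id = 's1'.
BEGIN UNLOGGED BATCH
  INSERT INTO my_ks.sensor_data (sensor_id, ts, value)
    VALUES ('s1', '2017-02-08 10:00:00', 1.2);
  INSERT INTO my_ks.sensor_data (sensor_id, ts, value)
    VALUES ('s1', '2017-02-08 10:00:01', 1.3);
APPLY BATCH;
-- Bad: mixing 's1' and 's2' in one batch makes the coordinator
-- route writes to multiple replica sets and hurts throughput.
```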

Re: Time series data model and tombstones

2017-02-08 Thread John Sanda
I wanted to provide a quick update. I was able to patch one of the environments that is hitting the tombstone problem. It has been running TWCS for five days now, and things are stable so far. I also had a patch to the application code to implement date partitioning ready to go, but I wanted to see

Re: Time series data model and tombstones

2017-02-08 Thread DuyHai Doan
Thanks for the update. Good to know that TWCS gives you more stability. On Wed, Feb 8, 2017 at 6:20 PM, John Sanda wrote: > I wanted to provide a quick update. I was able to patch one of the > environments that is hitting the tombstone problem. It has been running > TWCS for five days now, and thi

Re: Cluster scaling

2017-02-08 Thread Anuj Wadehra
Hi Branislav, I quickly went through the code and noticed that you are updating the RF from code and expecting that Cassandra will automatically distribute replicas as per the new RF. That is not how it works. After updating the RF, you need to run repair on all the nodes to make sure that
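The sequence Anuj describes looks roughly like this (keyspace name and replication strategy are hypothetical): change the RF, then repair each node so the new replicas actually receive the existing data:

```sql
-- 1. Update the replication factor (existing data is NOT moved automatically).
ALTER KEYSPACE my_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- 2. Then, on every node in the cluster, stream in the missing replicas:
--    $ nodetool repair my_ks
```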

Composite partition key token

2017-02-08 Thread Branislav Janosik -T (bjanosik - AAP3 INC at Cisco)
Hi, I would like to ask how to calculate the token for a composite partition key using the Java API. For a partition key made of one column I use cluster.getMetadata().newToken(newBuffer); But what if my key looks like this: PRIMARY KEY ((parentResourceId, timeRT), childName)? I read that “:” is a separator
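If building the composite byte buffer client-side proves awkward, Cassandra itself can report the token: the CQL token() function takes all partition key columns in their declared order (table and keyspace names below are hypothetical):

```sql
-- For PRIMARY KEY ((parentResourceId, timeRT), childName), token() must be
-- given both partition key columns, in the order they were declared.
SELECT token(parentResourceId, timeRT), childName
FROM my_ks.resources
WHERE parentResourceId = 'p1' AND timeRT = 1486500000;
```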

Re: Extract big data to file

2017-02-08 Thread Kiril Menshikov
Did you try to retrieve the data through code? cqlsh is probably not the right tool to fetch 360G. > On Feb 8, 2017, at 12:34, Cogumelos Maravilha wrote: > > Hi list, > > My database stores data from Kafka. Using C* 3.0.10 > > In my cluster I'm using: > AND compression = {'sstable_compression

Re: Extract big data to file

2017-02-08 Thread Justin Cameron
Ideally you would have the program/Spark job that receives the data from Kafka write it to a text file as it writes each row to Cassandra - that way you don't need to query Cassandra at all. If you need to dump this data ad-hoc, rather than on a regular schedule, your best bet is to write some cod

Re: Extract big data to file

2017-02-08 Thread Justin Cameron
Actually, using BEGINTOKEN and ENDTOKEN will only give you what you want if you're using ByteOrderedPartitioner (not with the default murmur3). It also looks like datetimestamp is a clustering column, so that suggestion probably wouldn't have applied anyway. On Wed, 8 Feb 2017 at 13:04 Justin Came
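For completeness, token-range scans do work under Murmur3Partitioner when issued as CQL rather than via cqlsh's BEGINTOKEN/ENDTOKEN - walking the ring in sub-ranges is how drivers and Spark parallelize full-table exports (table and partition key column names are hypothetical):

```sql
-- Murmur3 tokens span -2^63 .. 2^63-1; a full export walks sub-ranges:
SELECT kafka, datetimestamp
FROM my_ks.kafka_data
WHERE token(id) > -9223372036854775808
  AND token(id) <= -9000000000000000000;
-- Repeat with the next token sub-range until the whole ring is covered.
```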

Current data density limits with Open Source Cassandra

2017-02-08 Thread Hannu Kröger
Hello, Back in the day it was recommended that the max disk density per node for Cassandra 1.2 was around 3-5TB of uncompressed data. IIRC it was mostly because of heap memory limitations? Now that off-heap support is there for certain data and 3.x has a different storage format, is that 3-

Re: Cluster scaling

2017-02-08 Thread Branislav Janosik -T (bjanosik - AAP3 INC at Cisco)
Hi Jan, Yes, you are right about the batches; I am working on correcting the way we use batches, just as you mentioned. I monitored all those stats and it seems that hardware is not the bottleneck. Thank you for the response and advice! Cheers, Branislav

Re: Cluster scaling

2017-02-08 Thread Branislav Janosik -T (bjanosik - AAP3 INC at Cisco)
Hi Anuj, Thank you for the response. I modified the RF just to see its effect on performance; there is no data in the datastore when I change its value. But I see my mistake and will definitely change it as you mentioned. Reading will not be used that much; we will mostly write into the

Re: Current data density limits with Open Source Cassandra

2017-02-08 Thread Ben Slater
The major issue we’ve seen with very high density (we generally say <2TB per node is best) is manageability - if you need to replace a node or add a node, then restreaming data takes a *long* time, and there is a fairly high chance of a glitch in the universe meaning you have to start again before it’s done.

Re: Current data density limits with Open Source Cassandra

2017-02-08 Thread daemeon reiydelle
YMMV. Think of that storage limit as fairly reasonable for active data likely to tombstone. Add more for older/historic data. Then think about the time to recover a node. Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872 On Wed, Feb 8, 2017 at 2:14 PM, Ben S

Error when running nodetool cleanup after adding a new node to a cluster

2017-02-08 Thread Srinath Reddy
Hi, Trying to re-balance a Cassandra cluster after adding a new node, and I'm getting this error when running nodetool cleanup. The Cassandra cluster is running in a Kubernetes cluster; Cassandra version is 2.2.8. nodetool cleanup error: io.k8s.cassandra.KubernetesSeedProvider Fatal configuratio

Re: Error when running nodetool cleanup after adding a new node to a cluster

2017-02-08 Thread Harikrishnan Pillai
The cleanup has to run on the other nodes. Sent from my iPhone On Feb 8, 2017, at 9:14 PM, Srinath Reddy wrote: Hi, Trying to re-balance a Cassandra cluster after adding a new node and I'm getting this error when running nodetool cleanup. The Cassandra cluster is runnin

Re: Error when running nodetool cleanup after adding a new node to a cluster

2017-02-08 Thread Srinath Reddy
Yes, I ran nodetool cleanup on the other nodes and got the error. Thanks. > On 09-Feb-2017, at 11:12 AM, Harikrishnan Pillai wrote: > > The cleanup has to run on other nodes > > Sent from my iPhone > > On Feb 8, 2017, at 9:14 PM, Srinath Reddy wrote: