Is there a way to add a new node to a cluster but not sync old data?

2015-01-09 Thread Yatong Zhang
Hi there, I am using C* 2.0.10 and I was trying to add a new node to a cluster (actually replacing a dead node). But after adding the new node, some other nodes in the cluster had a very high workload, which affected the performance of the whole cluster. So I am wondering whether there is a way to add a new node

Re: User audit in Cassandra

2015-01-09 Thread DuyHai Doan
What you want is something like an audit logger, like the one provided by DSE? ( http://www.datastax.com/2014/12/enhanced-enterprise-security-in-datastax-enterprise-4-6 ) On Thu, Jan 8, 2015 at 1:34 PM, Ajay wrote: > Hi, > > Is there a way to enable user audit or trace if we have enabled > Password

Re: Cassandra primary key design to cater range query

2015-01-09 Thread Ajay
Hi, I read somewhere that the order of columns in the clustering key matters. Please correct me if I am wrong. For example, PRIMARY KEY((prodgroup), status, productid). Then the query below cannot be run: select * from product where prodgroup='xyz' and prodid > 0 But this query can be run: select *

Re: User audit in Cassandra

2015-01-09 Thread Ajay
Thanks, Tyler Hobbs. We need to capture which queries a user runs in a session and how long each takes (we don't need the query plan or similar). Is that possible? With an Authenticator we can capture only the session creation, right? Thanks Ajay On Sat, Jan 10, 2015 at 6:07 AM, Tyler Hobbs wrote: > sys

Re: User audit in Cassandra

2015-01-09 Thread Tyler Hobbs
system_traces is for query tracing, which is for diagnosing performance problems, not logging activity. Cassandra is designed to allow you to write your own Authenticator pretty easily. You can just subclass PasswordAuthenticator and add logging where desired. Compile that into a jar, put it in

Re: Cassandra primary key design to cater range query

2015-01-09 Thread Tyler Hobbs
Your proposed model for the table to handle the last query looks good, so I would stick with that. On Mon, Jan 5, 2015 at 5:45 AM, Nagesh wrote: > Hi All, > > I have designed a column family > > prodgroup text, prodid int, status int, , PRIMARY KEY ((prodgroup), > prodid, status) > > The data mo

Re: sstable structure

2015-01-09 Thread Tyler Hobbs
sstable2json can give you a pretty good idea of the format. Otherwise, your best option is to read the code, starting with org.apache.cassandra.io.sstable.SSTableWriter. On Fri, Jan 2, 2015 at 12:27 PM, Nikolay Mihaylov wrote: > Hi > > from some time I try to find the structure of sstable is it

Re: Updated JMX metrics overview

2015-01-09 Thread Tyler Hobbs
On Thu, Jan 8, 2015 at 9:57 AM, Reik Schatz wrote: > > > org.apache.cassandra.db type=StorageProxy TotalHints - is this the > number of hints since the node was started or a lifetime value > Since the node was started. > > org.apache.cassandra.db type=StorageProxy ReadRepairRepairedBackground

Re: How to bulkload into a specific data center?

2015-01-09 Thread Robert Coli
On Fri, Jan 9, 2015 at 11:38 AM, Benyi Wang wrote: > >- Is it possible to modify SSTableLoader to allow it access one data >center? > > Even if you only write to nodes in DC A, if you replicate that data to DC B, it will have to travel over the WAN anyway? What are you trying to avoid?

Re: How to bulkload into a specific data center?

2015-01-09 Thread Benyi Wang
Hi Ryan, Thanks for your reply. Now I understand how SSTableLoader works. - If I understand correctly, the current o.a.c.io.sstable.SSTableLoader doesn't use LOCAL_ONE or LOCAL_QUORUM. Is that right? - Is it possible to modify SSTableLoader to allow it to access one data center? Because I

Re: nodetool repair

2015-01-09 Thread Robert Coli
On Fri, Jan 9, 2015 at 8:01 AM, Adil wrote: > We have two DC, we are planning to schedule running nodetool repair > weekly, my question is : nodetool repair is cross cluster or not? it's > sufficient to run it without options on a node or should be scheduled on > every node with the host option.

nodetool repair

2015-01-09 Thread Adil
Hi guys, We have two DCs and are planning to schedule nodetool repair weekly. My question is: is nodetool repair cross-cluster or not? Is it sufficient to run it without options on one node, or should it be scheduled on every node with the host option? Thanks

Re: High read latency after data volume increased

2015-01-09 Thread Roni Balthazar
Hi there, Compaction keeps running under our workload. We are using RAID arrays of SATA HDDs. When trying to run cfhistograms on our user_data table, we are getting this message: nodetool: Unable to compute when histogram overflowed Please see what happens when running some queries on this cf: http:

Re: High read latency after data volume increased

2015-01-09 Thread datastax
Hello You may not be experiencing versioning issues. Do you know if compaction is keeping up with your workload? The behavior described in the subject is typically associated with compaction falling behind or having a suboptimal compaction strategy configured. What does the output of nod

RE: High read latency after data volume increased

2015-01-09 Thread Jason Kushmaul | WDA
I was about to say I thought 2.1 was a development version, but when I went to prove that to myself: http://cassandra.apache.org/download/ “ The latest stable release of Apache Cassandra is 2.1.2 (released on 2014-11-10). If you're just starting out, download this one.” But then, after visiting

Re: High read latency after data volume increased

2015-01-09 Thread Brian Tarbox
C* seems to have more than its share of "version x doesn't work, use version y" type issues. On Thu, Jan 8, 2015 at 2:23 PM, Robert Coli wrote: > On Thu, Jan 8, 2015 at 11:14 AM, Roni Balthazar > wrote: > >> We are using C* 2.1.2 with 2 DCs. 30 nodes DC1 and 10 nodes DC2. >> > > https://eng

Re: C* throws OOM error despite use of automatic paging

2015-01-09 Thread Jens-U. Mozdzen
Hi Mohammed, Quoting Mohammed Guller: Hi - We have an ETL application that reads all rows from Cassandra (2.1.2), filters them and stores a small subset in an RDBMS. Our application is using Datastax's Java driver (2.1.4) to fetch data from the C* nodes. Since the Java driver supports

Re: C* throws OOM error despite use of automatic paging

2015-01-09 Thread DuyHai Doan
What is the data size of the column family you're trying to fetch with paging? Are you storing big blobs or just primitive values? On Fri, Jan 9, 2015 at 8:33 AM, Mohammed Guller wrote: > Hi – > > > > We have an ETL application that reads all rows from Cassandra (2.1.2), > filters them and sto