difference between AntiEntropySessions and AntiEntropyStage ?

2014-06-09 Thread DE VITO Dominique
Hi, nodetool tpstats gives 2 lines for anti-entropy: one for AntiEntropySessions and one for AntiEntropyStage. What is the difference? a) Is "AntiEntropySessions" for counting repairs on a node acting as a primary node (the target node for repair)? And is "AntiEntropyStage" for countin

Re: Advice on how to handle corruption in system/hints

2014-06-09 Thread Colin Kuo
Hi Francois, We're facing the same issue as you. Our approach was to 1. scrub the corrupted data file 2. repair that column family. Deleting the corrupted files immediately is not recommended while the C* instance is running. This kind of corruption can happen after a bad disk or a power outage. Thanks, Colin

Re: high pending compactions

2014-06-09 Thread Colin Kuo
As Jake suggested, you could first increase "compaction_throughput_mb_per_sec" and "concurrent_compactors" to suitable values if system resources allow. From my understanding, major compaction will internally acquire a lock before running compaction. In your case, there might be a major compac

Re: VPC AWS

2014-06-09 Thread Alain RODRIGUEZ
Hi guys, there are a lot of answers, and it looks like this subject interests a lot of people, so I will let you know how it went for us. For now, we are still doing some tests. Yet I would like to know how we are supposed to configure Cassandra in this environment: - VPC - Multiple d

RE: high pending compactions

2014-06-09 Thread S C
Thank you all for the valuable suggestions. A couple more questions: how do I check the compaction queue? MBean/C* system log? What happens if the queue is full? From: colinkuo...@gmail.com Date: Mon, 9 Jun 2014 18:53:41 +0800 Subject: Re: high pending compactions To: user@cassandra.apache.org As Jake s

Re: VPC AWS

2014-06-09 Thread Peter Sanford
Your general assessments of the limitations of the Ec2 snitches seem to match what we've found. We're currently using the GossipingPropertyFileSnitch in our VPCs. This is also the snitch to use if you ever want to have a DC in EC2 and a DC with another hosting provider. -Peter On Mon, Jun 9, 201

RE: VPC AWS

2014-06-09 Thread Ackerman, Mitchell
Peter, I too am working on setting up a multi-region VPC Cassandra cluster. Each region is connected to each other via an OpenVPN tunnel, so we can use internal IP addresses for both the seeds and broadcast address. This allows us to use the EC2Snitch (my interpretation of the caveat that th

Re: Cassandra 2.0 unbalanced ring with vnodes after adding new node

2014-06-09 Thread Jeremiah D Jordan
That looks like you started the initial nodes with num_tokens=1, then later switched to vnodes by setting num_tokens to 256, then added that new node with 256 vnodes to start. Am I right? Since you don't have very much data, the easiest way out of this will be to decommission the original nod

Re: CQLSSTableWriter memory leak

2014-06-09 Thread Jeremiah D Jordan
You probably want to re-think your data model here. 50 million rows per partition is not going to be optimal. You will be much better off keeping that down to hundreds of thousands per partition in a worst case. -Jeremiah On Jun 5, 2014, at 8:29 PM, Xu Zhongxing wrote: > Is writing too man

Re: high pending compactions

2014-06-09 Thread Chris Lohfink
Bean: org.apache.cassandra.db.CompactionManager. Also, nodetool compactionstats gives you how many are in the queue + an estimate of how many will be needed. In 1.1 you will OOM far before you hit the limit. In theory though, the compaction executor is a little special cased and will actually thro

Re: Object mapper for CQL

2014-06-09 Thread Kevin Burton
Wow… this was an interesting thread! There are tons of options here.. interesting that I wasn't able to find them. What I ended up doing was just banging out a code generator that uses Velocity templates that generates POJOs and uses Jackson and standard naming conventions to mirror objects. It'

Re: difference between AntiEntropySessions and AntiEntropyStage ?

2014-06-09 Thread Yuki Morishita
AntiEntropySessions is where all repair sessions are executed. You can use this to count how many repair sessions are on going. AntiEntropyStage is used to handle repair messages. Usually you use AntiEntropySessions to check if repair is running. On Mon, Jun 9, 2014 at 2:59 AM, DE VITO Dominique
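The check described above (look at AntiEntropySessions to see whether a repair is running) can be sketched as a small parser over `nodetool tpstats`-style text. This is a minimal sketch, not part of the original reply: the function names are hypothetical, the sample text and its numbers are made up for illustration, and it assumes the usual column order of pool name, active, pending, completed.

```python
def parse_tpstats(text):
    """Parse tpstats-style text into {pool_name: {"active", "pending", "completed"}}.

    Assumes each pool row is: name, active, pending, completed, ... (illustrative layout).
    """
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only rows whose first three columns after the name are integers;
        # this skips the header row and any blank or unrelated lines.
        if len(parts) >= 4 and all(p.isdigit() for p in parts[1:4]):
            pools[parts[0]] = {
                "active": int(parts[1]),
                "pending": int(parts[2]),
                "completed": int(parts[3]),
            }
    return pools


def repair_running(pools):
    """Treat any active or pending AntiEntropySessions task as a repair in progress."""
    sessions = pools.get("AntiEntropySessions", {})
    return sessions.get("active", 0) > 0 or sessions.get("pending", 0) > 0


# Made-up sample; real output has more pools and different numbers.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
AntiEntropyStage                  0         0            120         0                 0
AntiEntropySessions               1         0              4         0                 0
"""

pools = parse_tpstats(SAMPLE)
print(repair_running(pools))
```

In practice the text would come from running `nodetool tpstats` on the node; AntiEntropyStage is parsed too, but only AntiEntropySessions is used for the repair check, as the reply suggests.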

Re: Cassandra 2.0 unbalanced ring with vnodes after adding new node

2014-06-09 Thread Владимир Рудев
Hmm, maybe; actually the cluster was not created by me. Another interesting thing happened yesterday: for some reason one old node lost one sstable file (no matter how, that's another problem), so we shut down this node, cleaned up all data, and started it again. After this result of nodetool status K was this

RE: high pending compactions

2014-06-09 Thread S C
Thank you all for quick responses. From: clohf...@blackbirdit.com Subject: Re: high pending compactions Date: Mon, 9 Jun 2014 14:11:36 -0500 To: user@cassandra.apache.org Bean: org.apache.cassandra.db.CompactionManager also nodetool compactionstats gives you how many are in the queue + estimate of

Cannot query secondary index

2014-06-09 Thread Redmumba
I have a table with a timestamp column on it; however, when I try to query based on it, it fails saying that I must use ALLOW FILTERING--which to me, means it's not using the secondary index. Table definition is (snipping out irrelevant parts)... CREATE TABLE audit ( > id bigint, > date ti

Re: Cannot query secondary index

2014-06-09 Thread Jonathan Lacefield
Hello, You are receiving this item because you are not passing in the Partition Key as part of your query. Cassandra is telling you it doesn't know which node to find the data and you haven't explicitly told it to search across all your nodes for the data. The ALLOW FILTERING clause bypasses t

Re: Cannot query secondary index

2014-06-09 Thread Michal Michalski
Secondary indexes internally are just CFs that map the indexed value to a row key which that value belongs to, so you can only query these indexes using "=", not ">", ">=" etc. However, your query does not require index *IF* you provide a row key - you can use "<" or ">" like you did for the date
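The point about "=" versus ">" can be illustrated with a toy model. This is a sketch only, not Cassandra's implementation: the class and method names are hypothetical, and the index is modelled as a plain hash map from indexed value to the row keys holding that value, as described above.

```python
from collections import defaultdict


class ToySecondaryIndex:
    """Toy model: indexed value -> set of row keys that hold that value."""

    def __init__(self):
        self._by_value = defaultdict(set)

    def put(self, row_key, value):
        self._by_value[value].add(row_key)

    def eq(self, value):
        # Equality is a single lookup on the indexed value.
        return set(self._by_value.get(value, ()))

    def range_scan(self, low, high):
        # The map has no useful ordering on values, so a range predicate
        # has to visit every indexed value and filter.
        hits = set()
        for value, row_keys in self._by_value.items():
            if low <= value < high:
                hits |= row_keys
        return hits


index = ToySecondaryIndex()
index.put("row-a", 100)
index.put("row-b", 100)
index.put("row-c", 250)

print(sorted(index.eq(100)))               # one lookup
print(sorted(index.range_scan(200, 300)))  # full scan over all indexed values
```

In the toy, the equality query touches one entry while the range query touches all of them, which is the kind of cost that restricting index queries to "=" avoids.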

Re: Cannot query secondary index

2014-06-09 Thread Redmumba
Ah, so the secondary indices are really secondary against the primary key. That makes sense. I'm beginning to see why the whole "date-based table" approach is the only one I've been able to find... thanks for the quick responses, guys! On Mon, Jun 9, 2014 at 2:45 PM, Michal Michalski < michal.mi

Re: Cannot query secondary index

2014-06-09 Thread Redmumba
I've been trying to work around using "date-based tables" because I'd like to avoid the overhead. It seems, however, that this is just not going to work. So here's a question--for these date-based tables (i.e., a table per day/week/month/whatever), how are they queried? If I keep 60 days worth o
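One way a client could read across such date-based tables is to fan out one query per table covering the requested range and merge the results. The sketch below only builds the table names and query strings; the `audit_YYYYMMDD` naming scheme, the function names, and the `%s` placeholder style are assumptions for illustration, and only the `id` column comes from the table definition quoted earlier in the thread.

```python
from datetime import date, timedelta


def tables_for_range(start, end, prefix="audit_"):
    """Return one hypothetical per-day table name for each day in [start, end]."""
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days
    return [f"{prefix}{(start + timedelta(days=i)):%Y%m%d}" for i in range(days + 1)]


def queries_for_range(start, end, partition_id):
    """Build one (query, params) pair per daily table; the caller merges the results."""
    return [
        (f"SELECT * FROM {table} WHERE id = %s", (partition_id,))
        for table in tables_for_range(start, end)
    ]


for query, params in queries_for_range(date(2014, 6, 8), date(2014, 6, 10), 42):
    print(query, params)
```

With 60 days of retention this means up to 60 small queries for a full-range read, while expiring old data reduces to dropping the oldest table.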

Re: Cannot query secondary index

2014-06-09 Thread Jonathan Lacefield
Hello, Will you please describe the use case and what you are trying to model. What are some questions/queries that you would like to serve via Cassandra. This will help the community help you a little better. Jonathan Lacefield Solutions Architect, DataStax (404) 822 3487

Re: Cannot query secondary index

2014-06-09 Thread Redmumba
Of course, Jonathan, I'll do my best! It's an auditing table that, right now, uses a primary key consisting of a combination of a combined partition id of the region and the object id, the date, and the process ID. Each event in our system will create anywhere from 1-20 rows, for example, and mul

How to restart bootstrap after a failed streaming due to Broken Pipe (1.2.16)

2014-06-09 Thread Mike Heffner
Hi, During an attempt to bootstrap a new node into a 1.2.16 ring the new node saw one of the streaming nodes periodically disappear: INFO [GossipTasks:1] 2014-06-10 00:28:52,572 Gossiper.java (line 823) InetAddress /10.156.1.2 is now DOWN ERROR [GossipTasks:1] 2014-06-10 00:28:52,574 AbstractStr

Re: How to restart bootstrap after a failed streaming due to Broken Pipe (1.2.16)

2014-06-09 Thread Colin Kuo
You can use "nodetool repair" instead. Repair is able to re-transmit the data that belongs to the new node. On Tue, Jun 10, 2014 at 10:40 AM, Mike Heffner wrote: > Hi, > > During an attempt to bootstrap a new node into a 1.2.16 ring the new node > saw one of the streaming nodes periodically disap