Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-26 Thread Bryan Talbot
I believe that "nodetool rebuild" is used to add a new datacenter, not just a new host to an existing cluster. Is that what you ran to add the node? -Bryan On Fri, Apr 26, 2013 at 1:27 PM, John Watson wrote: > Small relief we're not the only ones that had this issue. > > We're going to try r

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-26 Thread John Watson
Small relief we're not the only ones that had this issue. We're going to try running a shuffle before adding a new node again... maybe that will help - John On Fri, Apr 26, 2013 at 5:07 AM, Francisco Nogueira Calmon Sobral < fsob...@igcorp.com.br> wrote: > I am using the same version and obser

Re: vnodes and load balancing - 1.2.4

2013-04-26 Thread Robert Coli
On Fri, Apr 26, 2013 at 3:48 AM, Sam Overton wrote: > If that is the case then it means you accidentally started those three nodes > with the default configuration (single-token) and then subsequently changed > (num_tokens) and then joined them into the cluster. This would seem to be another reas

Re: Is Cassandra oversized for this kind of use case?

2013-04-26 Thread Hiller, Dean
I would at least start with 3 cheap nodes with RF=3 and start with CL=TWO on writes and reads, most likely while getting your feet wet. Don't buy very expensive computers like a lot of people do when getting into the game for the first time… Every time I walk into a new gig, they seem to think they need to spend 6/10
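The RF=3 with CL=TWO suggestion above can be checked against the usual replica-overlap rule: a read is guaranteed to touch at least one replica that acknowledged the latest write when read replicas + write replicas > RF. A minimal sketch in Python; the function names are hypothetical helpers for illustration and are not part of any Cassandra API.

```python
def reads_see_latest_write(rf: int, write_cl: int, read_cl: int) -> bool:
    """Return True when every read replica set must overlap every write replica set."""
    if not (1 <= write_cl <= rf and 1 <= read_cl <= rf):
        raise ValueError("consistency levels must be between 1 and RF")
    # Overlap is guaranteed only when the two replica counts sum to more than RF.
    return write_cl + read_cl > rf


def tolerated_down_replicas(rf: int, cl: int) -> int:
    """Number of replicas that can be down while a request at this CL still succeeds."""
    return rf - cl


# RF=3 with CL=TWO on both reads and writes, as suggested in the message.
print(reads_see_latest_write(3, 2, 2))
print(tolerated_down_replicas(3, 2))
```

With these numbers the overlap condition holds (2 + 2 > 3) and one replica can be down without failing requests, which is why this is a common starting point for a 3-node cluster.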

Re: Is Cassandra oversized for this kind of use case?

2013-04-26 Thread Marc Teufel
Okay, one billion rows of data is a lot; compared to that I am far, far away. Does that mean I can stay with Oracle? Maybe. But you're right when you say it's not only about big data but also about your needs. So storing the data is one part, doing analysis is the second. I do a lot of calculations

cost estimate about some Cassandra patches

2013-04-26 Thread DE VITO Dominique
Hi, We have created a new partitioner that groups some rows with **different** row keys on the same replicas. But neither batch_mutate nor multiget_slice is able to take advantage of this partitioner-defined placement to vectorize/batch communications between the coordinator and the

Re: CQL update and TTL

2013-04-26 Thread Sylvain Lebresne
> is there a way to either make TTL dynamic (using ?) > Not at this time. There is https://issues.apache.org/jira/browse/CASSANDRA-4450 open for that, but that's not done yet. > tell the engine not to cache the Prepared statement. I am using the new > CQL Java Driver. > In that case, just don'

Re: CQL update and TTL

2013-04-26 Thread Shahryar Sedghi
Thanks Sylvain. So how can I avoid the prepared statement cache exhaustion? Is there a way to either make TTL dynamic (using ?) or tell the engine not to cache the prepared statement? I am using the new CQL Java Driver. Shahryar On Fri, Apr 26, 2013 at 11:42 AM, Sylvain Lebresne wrote: > This is

Re: CQL update and TTL

2013-04-26 Thread Alain RODRIGUEZ
That is more or less what I was guessing, thanks for the clarification. 2013/4/26 Sylvain Lebresne > This is indeed intended. That behavior is largely dictated by how the > storage engine works, and the fact that an update does no read internally > in particular. > > Yet, what I do not know is wh

Re: CQL update and TTL

2013-04-26 Thread Sylvain Lebresne
This is indeed intended. That behavior is largely dictated by how the storage engine works, and the fact that an update does no read internally in particular. Yet, what I do not know is whether this behavior can be changed somehow to > let the initial TTL, > There's nothing like that supported, n

Re: CQL update and TTL

2013-04-26 Thread Shahryar Sedghi
The issue is, I can get the original TTL using the select and use it for the update, however since TTL can not be dynamic (using ?) it will exhaust the prepared statement cache, because I have tons of updates like this and every one will have a different signature due to changing TTL. I am using
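The cache pressure described above comes from inlining the TTL into the statement text, so each distinct TTL value produces a distinct prepared statement. The sketch below only counts distinct statement strings and does not call any driver. The table and column names (`events`, `payload`, `id`) are hypothetical, and rounding the TTL up to a bucket is an illustrative assumption, not something proposed in the thread; it trades a slightly later expiry for fewer distinct statements.

```python
import math


def update_cql(ttl_seconds: int) -> str:
    # The TTL is inlined because, per the thread, it cannot be a bind marker ("?") here.
    # Table and column names are hypothetical.
    return f"UPDATE events USING TTL {ttl_seconds} SET payload = ? WHERE id = ?"


def bucket_ttl(ttl_seconds: int, bucket: int = 60) -> int:
    # Round the TTL up to the next multiple of `bucket` seconds (assumed workaround).
    return math.ceil(ttl_seconds / bucket) * bucket


remaining_ttls = range(1, 3601)  # every possible remaining TTL within one hour

exact = {update_cql(t) for t in remaining_ttls}
bucketed = {update_cql(bucket_ttl(t)) for t in remaining_ttls}

print(len(exact))     # one statement per distinct TTL value
print(len(bucketed))  # one statement per 60-second bucket
```

Whether the extra lifetime of up to one bucket is acceptable depends on the application; if expiry must be exact, this approach does not apply.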

Re: CQL update and TTL

2013-04-26 Thread Alain RODRIGUEZ
This seems to be the correct behavior. An update refreshes the TTL, as it does in memcache for example. Yet, what I do not know is whether this behavior can be changed somehow to keep the initial TTL; this might be useful in some use cases. Alain 2013/4/26 Shahryar Sedghi > Apparently when I up

CQL update and TTL

2013-04-26 Thread Shahryar Sedghi
Apparently when I update a column using CQL that already has a TTL, it resets the TTL to null, so if there was already a TTL for all columns that I inserted as part of a composite column set, this specific column that I updated will not expire while the others are getting expired. Is it how it is
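The behavior described above can be pictured with a toy model in which every column cell carries its own expiry and a write replaces the whole cell without reading the previous one. This is a simplified sketch in Python, not Cassandra's storage engine; the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Cell:
    value: str
    expires_at: Optional[float]  # None means the cell never expires


class ToyRow:
    """Toy row: each column is an independent cell with its own expiry."""

    def __init__(self) -> None:
        self.cells: Dict[str, Cell] = {}

    def write(self, column: str, value: str, now: float, ttl: Optional[float] = None) -> None:
        # Blind write: the previous cell (and its TTL) is replaced, never read.
        expires_at = None if ttl is None else now + ttl
        self.cells[column] = Cell(value, expires_at)

    def live_columns(self, now: float) -> List[str]:
        return sorted(
            name for name, cell in self.cells.items()
            if cell.expires_at is None or cell.expires_at > now
        )


row = ToyRow()
row.write("a", "1", now=0, ttl=100)
row.write("b", "2", now=0, ttl=100)
row.write("b", "3", now=50)          # update without a TTL: the old expiry is lost
print(row.live_columns(now=200))     # "a" has expired, "b" has not
```

In this model, keeping the original expiry requires passing the remaining TTL again on the update, for example `row.write("b", "3", now=50, ttl=50)`.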

Re: Performance / limitations of WHERE ... IN queries

2013-04-26 Thread Thierry Templier
Thanks very much, Aaron, for your answer! Thierry You are effectively doing a multi get. Getting more than one row at a time is normally faster, but there will be a drop off point where the improvements slow down. Run some tests. Also consider that each row you request creates RF number of c

Re: vnodes and load balancing - 1.2.4

2013-04-26 Thread David McNelis
Decommissioning those nodes isn't a problem. When you say remove all the data, I assume you mean rm -rf my data directory (the default /var/lib/cassandra/data). I'd done this prior to starting up the nodes, because they were installed from the apt-get repo, which automatically starts cassandra

Re: Is Cassandra oversized for this kind of use case?

2013-04-26 Thread Hiller, Dean
Well, it depends more on what you will do with the data. I know I was on a sybase(RDBMS) with 1 billion rows but it was getting close to not being able to handle more (constraints had to be turned off and all sorts of optimizations done and expert consultants brought in and everything). BUT th

Is Cassandra oversized for this kind of use case?

2013-04-26 Thread Marc Teufel
I hope the Cassandra community can help me reach a decision. The project I am currently working on is located in an industrial plant; machines are connected to a server and every 5 minutes I get data from the machines about their status. We are talking about a production with 100+ machines, so the data
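A rough sizing of the workload described above, as a back-of-the-envelope sketch in Python. It assumes exactly 100 machines (the message says "100+") and one stored row per machine per 5-minute report; both figures are assumptions, since the message does not say how many rows each report produces.

```python
REPORT_INTERVAL_MIN = 5   # from the message: data arrives every 5 minutes
MACHINES = 100            # assumption: the message says "100+"
ROWS_PER_REPORT = 1       # assumption: one row per machine per report


def rows_per_day(machines: int = MACHINES,
                 interval_min: int = REPORT_INTERVAL_MIN,
                 rows_per_report: int = ROWS_PER_REPORT) -> int:
    reports_per_day = (24 * 60) // interval_min  # 288 reports per machine per day
    return machines * reports_per_day * rows_per_report


def rows_per_year(**kwargs: int) -> int:
    return rows_per_day(**kwargs) * 365


print(rows_per_day())   # 28800
print(rows_per_year())  # 10512000
```

Under these assumptions the plant produces about 10.5 million rows per year; the figure scales linearly with the number of machines and with the number of rows each report actually contains.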

Re: Really odd issue (AWS related?)

2013-04-26 Thread Michael Theroux
Thanks. We weren't monitoring this value when the issue occurred, and this particular issue has not appeared for a couple of days (knock on wood). Will keep an eye out though, -Mike On Apr 26, 2013, at 5:32 AM, Jason Wee wrote: > top command? st : time stolen from this vm by the hypervisor >

latest PlayOrm released for cassandra and mongodb

2013-04-26 Thread Hiller, Dean
PlayOrm now supports mongodb and cassandra with a query language that is portable across both systems as well. https://github.com/deanhiller/playorm Later, Dean

Re: Deletes, null values

2013-04-26 Thread Sorin Manolache
On 2013-04-26 11:55, Alain RODRIGUEZ wrote: Of course: From CQL 2 (cqlsh -2): delete '183#16684','183#16714','183#16717' from myCF where key = 'all'; And selecting this data as follows gives me the result above: select '1228#16857','1228#16866','1228#16875','1237#16544','1237#16553' from myCF

Slow retrieval using secondary indexes

2013-04-26 Thread Francisco Nogueira Calmon Sobral
Hi all! We are using Cassandra 1.2.1 with an 8-node cluster running at Amazon. We started with 6 nodes and added the other 2 later. When performing some reads in Cassandra, we observed a large difference between gets using the primary key and gets using secondary indexes: [default@Sessions] get Users

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-26 Thread Francisco Nogueira Calmon Sobral
I am using the same version and observed something similar. I've added a new node, but the instructions from Datastax did not work for me. Then I ran "nodetool rebuild" on the new node. After this command finished, it contained twice the load of the other nodes. Even when I ran "nodetool cl

Many creation/inserts in parallel

2013-04-26 Thread Sasha Yanushkevich
Hi All, We are testing Cassandra 1.2.3 (3 nodes with RF:2) with the FluentCassandra driver. At first many CFs are created in parallel (about 1000 CFs). After creation is done, many insertions of small amounts of data into the DB follow. During tests we're receiving some exceptions from the driver, e.g.

Re: vnodes and load balancing - 1.2.4

2013-04-26 Thread Sam Overton
Some extra information you could provide which will help debug this: the logs from those 3 nodes which have no data and the output of "nodetool ring" Before seeing those I can only guess, but my guess would be that in the logs on those 3 nodes you will see this: "Calculating new tokens" and this:

Re: Deletes, null values

2013-04-26 Thread Alain RODRIGUEZ
I copied the wrong query: In CQL 2 it was: delete '1228#16857','1228#16866','1228#16875' from myCF where key = 'all'; Sorry about the mistake. 2013/4/26 Alain RODRIGUEZ > Of course: > > From CQL 2 (cqlsh -2): > > delete '183#16684','183#16714','183#16717' from myCF where key = 'all'; > > And

Re: How to change existing cluster to multi-center

2013-04-26 Thread Alain RODRIGUEZ
I just asked this exact same question, but maybe after reading a bit more documentation than you did. You may want to read this thread: http://grokbase.com/t/cassandra/user/134j85av4x/ec2snitch-to-ec2multiregionsnitch You also may want to read some documentation. Datastax explain things quite well and update the

Re: Deletes, null values

2013-04-26 Thread Alain RODRIGUEZ
Of course: From CQL 2 (cqlsh -2): delete '183#16684','183#16714','183#16717' from myCF where key = 'all'; And selecting this data as follows gives me the result above: select '1228#16857','1228#16866','1228#16875','1237#16544','1237#16553' from myCF where key = 'all'; From thrift (phpCassa cl

Re: Really odd issue (AWS related?)

2013-04-26 Thread Jason Wee
top command? st : time stolen from this vm by the hypervisor jason On Fri, Apr 26, 2013 at 9:54 AM, Michael Theroux wrote: > Sorry, Not sure what CPU steal is :) > > I have AWS console with detailed monitoring enabled... things seem to > track close to the minute, so I can see the CPU load go t

Re: Unable to drop secondary index

2013-04-26 Thread Michal Michalski
On 26.04.2013 03:45, aaron morton wrote: You can drop the hints via JMX and stopping the node and deleting the SSTables. Thanks for the advice :-) It's +/- what I did. I paused hints delivery first and then I upgraded the whole cluster to C* with the CASSANDRA-5179 patch applied, removing the SSTa

CQL indexing

2013-04-26 Thread Sri Ramya
Hi, In CQL, to perform a query based on a column you have to create an index on that column. What exactly happens when we create an index on a column? What might the index column family contain?