Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread Sylvain Lebresne
On Wed, Feb 19, 2014 at 9:38 PM, Rüdiger Klaehn wrote: > > I have cloned the cassandra repo, applied the patch, and built it. But > when I want to run the benchmark I get an exception. See below. I tried with > a non-managed dependency to > cassandra-driver-core-2.0.0-rc3-SNAPSHOT-jar-with-depende

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread Yogi Nerella
Rüdiger, I have tried CQL only, and it failed after 127 records were added. I have to check what is wrong. I have the keyspace and table definition exactly as yours. I am new to Scala, so I do not know how to do this. I may try this in the evening. Yogi On Wed, Feb 19, 2014 at 2:50 PM, Rüdiger Kl

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread Rüdiger Klaehn
This must be something to do with server side validation. If I define the table like this it does not happen: cqlsh> CREATE KEYSPACE IF NOT EXISTS test1 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }; cqlsh> use test1; cqlsh:test1> create TABLE employees2 (time blob,

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread Yogi Nerella
I have a two node cluster. Tried with both 2.0.4 and 2.0.5. I have tried your code, and exactly after inserting 127 rows, the next insert fails. 10.566482102276002 123 2.7760618708015863 124 8.936212688296054 125 9.532923906962095 126 7.5081516753554505 127 java.lang.RuntimeException: failed to w

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread Rüdiger Klaehn
On Wed, Feb 19, 2014 at 7:49 PM, Sylvain Lebresne wrote: > On Wed, Feb 19, 2014 at 11:27 AM, Rüdiger Klaehn wrote: > >> >> Am I doing something wrong, or is this a fundamental limitation of CQL. >> > > Neither. I believe you are running into > https://issues.apache.org/jira/browse/CASSANDRA-6737,

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread Rüdiger Klaehn
Hi Yogi, Both benchmarks go to different tables. I originally wanted to just write a lot of data into an empty table and then evaluate what compression ratio I can expect when I ran into the performance problem. I am sorry, I forgot to mention this: I did not figure out how to create a table usin

Re: Cequel is a full-featured Ruby ORM for Cassandra

2014-02-19 Thread Matthew A. Brown
R is for “row” : ) (And generally I think the term “ORM” is well-understood even when applied to non-relational databases) On Wed, Feb 19, 2014 at 1:37 PM, Jeffrey Kesselman wrote: > why do you call it an ORM when Cassandra is not relational? > > > On Wed, Feb 19, 2014 at 11:39 AM, Matthew A.

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread Sylvain Lebresne
On Wed, Feb 19, 2014 at 11:27 AM, Rüdiger Klaehn wrote: > > Am I doing something wrong, or is this a fundamental limitation of CQL. > Neither. I believe you are running into https://issues.apache.org/jira/browse/CASSANDRA-6737, which is a bug, a performance bug, which we should and will fix. So

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread Yogi Nerella
Rüdiger, I am trying this on 2.0.5 to see, but both Scala code and AST code are going to different tables? Can you give the exact AST code you are trying? Yogi On Wed, Feb 19, 2014 at 10:49 AM, Sylvain Lebresne wrote: > On Wed, Feb 19, 2014 at 11:27 AM, Rüdiger Klaehn wrote: > >> >> Am I doin

Re: Cequel is a full-featured Ruby ORM for Cassandra

2014-02-19 Thread Jeffrey Kesselman
why do you call it an ORM when Cassandra is not relational? On Wed, Feb 19, 2014 at 11:39 AM, Matthew A. Brown wrote: > Hi all, > > I recently released version 1.0 of Cequel, > a high-level Ruby library for Cassandra using CQL3. Version 1.0 provides an > object-

Re: High CPU load on one node in the cluster

2014-02-19 Thread Yogi Nerella
You should start your Cassandra daemon with -verbose:gc (please check syntax) and then run it in the foreground (as Cassandra closes standard out). Please see other emails on the Cassandra user mailing list about getting garbage collection statistics, or look at any Java-specific site. Ex: http:

Re: High CPU load on one node in the cluster

2014-02-19 Thread Sourabh Agrawal
How do I get that statistic? On Wed, Feb 19, 2014 at 10:34 PM, Yogi Nerella wrote: > Could be your -Xmn800M is too low, that is why it is trying garbage > collecting very frequently. > Do you have any statistics on how much memory it is collecting on every > cycle? > > > > On Wed, Feb 19, 2014 a

Re: High CPU load on one node in the cluster

2014-02-19 Thread Yogi Nerella
Your -Xmn800M could be too low, which would explain why it is garbage collecting so frequently. Do you have any statistics on how much memory it is collecting on every cycle? On Wed, Feb 19, 2014 at 8:47 AM, Sourabh Agrawal wrote: > Below is CPU usage from top. I don't see any steal. Idle time

Re: High CPU load on one node in the cluster

2014-02-19 Thread Sourabh Agrawal
Below is CPU usage from top. I don't see any steal. Idle time is pretty low. Cpu(s): 83.3%us, 14.5%sy, 0.0%ni, 0.5%id, 0.0%wa, 0.0%hi, 1.7%si, 0.0%st Any other pointers? On Wed, Feb 19, 2014 at 8:34 PM, Nate McCall wrote: > You may be seeing steal from another tenant on the VM. This art
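
The figures in the top summary line quoted in this message can be checked mechanically. The following is a minimal Python sketch, assuming the `Cpu(s):` line format shown above; the helper name `parse_top_cpu` is hypothetical and not part of any tool mentioned in the thread. It extracts each percentage and sums user, system, and softirq time to show how little of the CPU is idle or stolen.

```python
import re


def parse_top_cpu(line):
    """Return a dict such as {'us': 83.3, 'st': 0.0} from a top 'Cpu(s):' line."""
    # Each field looks like "83.3%us": a number, a percent sign, a short label.
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%(\w+)", line)}


# The line reported in the message above.
line = "Cpu(s): 83.3%us, 14.5%sy, 0.0%ni, 0.5%id, 0.0%wa, 0.0%hi, 1.7%si, 0.0%st"
cpu = parse_top_cpu(line)

# Time spent doing work: user + system + software interrupts.
busy = cpu["us"] + cpu["sy"] + cpu["si"]
print("steal=%.1f%% idle=%.1f%% busy=%.1f%%" % (cpu["st"], cpu["id"], busy))
```

With steal at zero and nearly all time in user and system mode, the load appears to come from the node's own processes rather than from a neighbouring tenant.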

Cequel is a full-featured Ruby ORM for Cassandra

2014-02-19 Thread Matthew A. Brown
Hi all, I recently released version 1.0 of Cequel, a high-level Ruby library for Cassandra using CQL3. Version 1.0 provides an object-oriented abstraction for CQL3 data modeling, including parent-child relationships using compound primary keys, and collection col

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread DuyHai Doan
Agree with John. Preparing a statement follows this process: 1) send the statement to the server; 2) statement validation on the server side; 3) if validation is OK, the C* node assigns a UUID to this prepared statement; 4) the UUID is sent back to the Java driver core. Now, you can re-use this s
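
The cost of preparing a statement repeatedly can be illustrated with a toy model. The sketch below is not the Cassandra Java driver API: `FakeNode`, `prepare`, `execute`, and the CQL text are hypothetical stand-ins that only count simulated round trips, under the assumption that each prepare and each execute costs one round trip to the node.

```python
import uuid


class FakeNode:
    """Toy stand-in for a Cassandra node; it only counts simulated round trips."""

    def __init__(self):
        self.prepared = {}    # statement id -> CQL text
        self.round_trips = 0

    def prepare(self, cql):
        # Steps 1-4: receive the statement, validate it, assign an id, return it.
        self.round_trips += 1
        if not cql.strip():
            raise ValueError("empty statement")
        stmt_id = uuid.uuid4()
        self.prepared[stmt_id] = cql
        return stmt_id

    def execute(self, stmt_id, values):
        # Only the id and the bound values travel to the node.
        self.round_trips += 1
        if stmt_id not in self.prepared:
            raise KeyError("unknown prepared statement")
        return len(values)


def insert_preparing_each_time(node, cql, rows):
    # Anti-pattern: a fresh prepare before every single insert.
    for row in rows:
        node.execute(node.prepare(cql), row)


def insert_preparing_once(node, cql, rows):
    # Recommended pattern: prepare once, then re-use the id.
    stmt_id = node.prepare(cql)
    for row in rows:
        node.execute(stmt_id, row)


CQL = "INSERT INTO employees (time, name) VALUES (?, ?)"  # hypothetical statement
ROWS = [(i, "name%d" % i) for i in range(1000)]

slow, fast = FakeNode(), FakeNode()
insert_preparing_each_time(slow, CQL, ROWS)
insert_preparing_once(fast, CQL, ROWS)
print(slow.round_trips, fast.round_trips)
```

In this model, preparing once roughly halves the number of round trips for a large batch of inserts; the real saving depends on the driver and server and is not measured here.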

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread Nate McCall
Hi Rüdiger, I just saw this after I answered on the SO thread: http://stackoverflow.com/questions/21778671/cassandra-how-to-insert-a-new-wide-row-with-good-performance-using-cql/21884943#21884943 On Wed, Feb 19, 2014 at 8:57 AM, John Sanda wrote: > From a quick glance at your code, it looks lik

Re: High CPU load on one node in the cluster

2014-02-19 Thread Nate McCall
You may be seeing steal from another tenant on the VM. This article has a good explanation: http://blog.scoutapp.com/articles/2013/07/25/understanding-cpu-steal-time-when-should-you-be-worried In short, kill the instance and launch a new one. Depending on your latency requirements and operational

Re: Performance problem with large wide row inserts using CQL

2014-02-19 Thread John Sanda
From a quick glance at your code, it looks like you are preparing your insert statement multiple times. You only need to prepare it once. I would expect to see some improvement with that change. On Wed, Feb 19, 2014 at 5:27 AM, Rüdiger Klaehn wrote: > Hi all, > > I am evaluating Cassandra for

Performance problem with large wide row inserts using CQL

2014-02-19 Thread Rüdiger Klaehn
Hi all, I am evaluating Cassandra for satellite telemetry storage and analysis. I set up a little three node cluster on my local development machine and wrote a few simple test programs. My use case requires storing incoming telemetry updates in the database at the same rate as they are coming in

High CPU load on one node in the cluster

2014-02-19 Thread Sourabh Agrawal
Hi, I am running a Cassandra 2.0.3 cluster on 4 AWS nodes. The memory arguments for each node are: -Xms8G -Xmx8G -Xmn800M. I am experiencing consistently high load on one of the nodes. Each node is getting an approximately equal number of writes. I tried to have a look at the logs and seems l