Re: Cassandra driver performance question...

Tony Anecito Sun, 23 Jun 2013 16:32:40 -0700

Hi Jabbar,
 
 I was able to get the performance issue resolved by reusing the connection 
object. It will be interesting to see what happens when I use a connection pool 
from an app server.
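
A minimal sketch of the reuse pattern, written in Python for brevity rather
than against the JDBC driver itself. `ConnectionHolder` and `fake_connect` are
hypothetical names introduced for illustration, and the 20 ms setup cost is an
assumed stand-in, not a measurement from this thread. The idea is only that
the expensive open happens once and every later caller gets the same object.

```python
import time


class ConnectionHolder:
    """Opens a connection lazily, once, and returns the same object afterwards."""

    def __init__(self, connect):
        self._connect = connect  # hypothetical factory wrapping the driver's connect call
        self._conn = None
        self.opens = 0           # how many times the factory was actually invoked

    def get(self):
        if self._conn is None:
            self._conn = self._connect()
            self.opens += 1
        return self._conn


def fake_connect():
    # Stand-in for a real driver: pretend connection setup costs about 20 ms.
    time.sleep(0.02)
    return object()


holder = ConnectionHolder(fake_connect)
first = holder.get()   # pays the setup cost
second = holder.get()  # reuses the already-open connection
print("opens:", holder.opens, "same object:", first is second)
```

A connection pool in an app server applies the same idea to several
connections at once, so the first request on each pooled connection may still
pay the setup cost.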
 
I still think it would be a good idea to have a minimal mode for metadata. I 
rarely use metadata.
 
Regards,
-Tony

From: Tony Anecito <adanec...@yahoo.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Tony Anecito 
<adanec...@yahoo.com> 
Sent: Friday, June 21, 2013 9:33 PM
Subject: Re: Cassandra driver performance question...

Hi Jabbar,

I think I know what is going on. I happened across a change mentioned by the 
JDBC driver developers regarding metadata caching. It seems the metadata 
caching was moved from the connection object to the PreparedStatement object. 
So I am wondering if the time difference I am seeing on the second 
PreparedStatement object is because the metadata is cached by then.

So my question is how to test this theory? Is there a way to stop the metadata 
from coming across from Cassandra? A 20x performance improvement would be nice 
to have.
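
One way to probe the theory is to time each execution of the same statement
separately and see whether only the first call is slow. The sketch below is in
Python and uses a hypothetical `FakeStatement` that simulates a one-time
metadata fetch with an assumed 20 ms delay. It does not call the Cassandra
JDBC driver, so it shows the measurement approach, not the driver's behavior.

```python
import time


def time_calls(fn, repeats=5):
    """Call fn repeatedly and return each call's wall-clock time in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


class FakeStatement:
    """Hypothetical stand-in for a prepared statement with lazily cached metadata."""

    def __init__(self):
        self._metadata = None

    def execute(self):
        if self._metadata is None:
            time.sleep(0.02)  # assumed one-time metadata cost, for illustration only
            self._metadata = {"columns": []}
        return []


stmt = FakeStatement()
timings = time_calls(stmt.execute, repeats=5)
for i, ms in enumerate(timings, start=1):
    print(f"call {i}: {ms:.2f} ms")
```

Running the same harness against a freshly created statement and against a
reused one would show whether the slow first call follows the statement
object or the connection.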

Thanks,
-Tony

From: Tony Anecito <adanec...@yahoo.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org> 
Sent: Friday, June 21, 2013 8:56 PM
Subject: Re: Cassandra driver performance question...

Thanks Jabbar,

I ran nodetool as suggested and it showed 0 latency for the row count I have.

I also ran the cli list command for the table hit by my JDBC PreparedStatement 
and it was slow: about 121 msecs the first time I ran it and 40 msecs the 
second time. The JDBC call takes 38 msecs to start with, unless I run it twice 
as well, in which case executeQuery takes 1.5-2.5 msecs the second time the 
PreparedStatement is called.

I ran describe from cli for the table and it said caching is "ALL" which is 
correct.

A real mystery and I need to understand better what is going on.

Regards,
-Tony

From: Jabbar Azam <aja...@gmail.com>
To: user@cassandra.apache.org; Tony Anecito <adanec...@yahoo.com> 
Sent: Friday, June 21, 2013 3:32 PM
Subject: Re: Cassandra driver performance question...

Hello Tony, 

I would guess that the first query's data is put into the row cache and the 
filesystem cache. The second query gets the data from the row cache and/or the 
filesystem cache, so it'll be faster.

If you want to make it consistently faster, having a key cache will definitely 
help. The following advice from Aaron Morton will also help: 
"You can also see what it looks like from the server side. 

nodetool proxyhistograms will show you full request latency recorded by the 
coordinator. 
nodetool cfhistograms will show you the local read latency, this is just the 
time it takes to read data on a replica and does not include network or wait 
times. 

If the proxyhistograms is showing most requests running faster than your app 
says it's your app."

http://mail-archives.apache.org/mod_mbox/cassandra-user/201301.mbox/%3ce3741956-c47c-4b43-ad99-dad8afc3a...@thelastpickle.com%3E
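
The comparison described in the quote comes down to simple subtraction, which
the Python sketch below spells out. The function names are hypothetical. The
28 ms application figure is the first-query time reported in this thread, while
the coordinator and replica values are assumed placeholders that would in
practice be read off the two histograms (for example, a typical bucket).

```python
def split_latency(app_ms, coordinator_ms, replica_ms):
    """Attribute an end-to-end latency to three stages by subtraction."""
    return {
        # Seen by the app but not by the coordinator: driver plus client network.
        "client_and_network": app_ms - coordinator_ms,
        # Seen by the coordinator but not by the replica read: network and waiting.
        "coordinator_and_wait": coordinator_ms - replica_ms,
        # Local read time on the replica.
        "replica_read": replica_ms,
    }


def dominant_stage(parts):
    """Return the name of the stage that accounts for the most time."""
    return max(parts, key=parts.get)


# app_ms is from the thread; coordinator_ms and replica_ms are assumed values.
parts = split_latency(app_ms=28.0, coordinator_ms=2.0, replica_ms=0.5)
print(parts)
print("dominant stage:", dominant_stage(parts))
```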

Thanks

Jabbar Azam

On 21 June 2013 21:29, Tony Anecito <adanec...@yahoo.com> wrote:

>Hi All,
>I am using the JDBC driver and noticed that if I run the same query twice, 
>the second time it is much faster.
>I set up the row cache and column family cache and it did not seem to make a 
>difference.
>
>
>I am wondering how to set up Cassandra such that the first query is always as 
>fast as the second one. The second one was 1.8msec and the first 28msec for 
>the exact same parameters. I am using a PreparedStatement.
>
>
>Thanks!
