Hi Jabbar,

I was able to get the performance issue resolved by reusing the connection object. It will be interesting to see what happens when I use a connection pool from an app server.

I still think it would be a good idea to have a minimal mode for metadata. I rarely use metadata.

Regards,
-Tony
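For readers following along, the fix described above (open the connection once and reuse it for every query) can be sketched roughly as follows. This is a minimal sketch, not the code used in this thread: `ConnectionHolder` and its `Supplier` factory are hypothetical names, and in real use the supplier would presumably wrap a call such as `DriverManager.getConnection(...)` with the driver's URL.

```java
import java.util.function.Supplier;

/** Hypothetical helper: creates a resource (e.g. a JDBC Connection) once and reuses it. */
final class ConnectionHolder<T> {
    private final Supplier<T> factory; // assumed to wrap e.g. DriverManager.getConnection(...)
    private T cached;                  // the single shared instance, created lazily
    private int opens;                 // how many times the factory actually ran

    ConnectionHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the cached instance, invoking the factory on first use only. */
    synchronized T get() {
        if (cached == null) {
            cached = factory.get();
            opens++;
        }
        return cached;
    }

    /** Number of times the underlying resource was actually created. */
    synchronized int timesOpened() {
        return opens;
    }
}
```

Every caller that goes through `get()` shares the same instance, so any per-connection setup cost is paid once rather than on each query. A connection pool from an app server serves the same purpose with more than one connection.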
From: Tony Anecito <adanec...@yahoo.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Tony Anecito <adanec...@yahoo.com>
Sent: Friday, June 21, 2013 9:33 PM
Subject: Re: Cassandra driver performance question...

Hi Jabbar,

I think I know what is going on. I happened across a change mentioned by the JDBC driver developers regarding metadata caching. It seems the metadata caching was moved from the connection object to the PreparedStatement object. So I am wondering if the time difference I am seeing on the second PreparedStatement call is because the metadata is cached by then.

So my question is how to test this theory. Is there a way to stop the metadata from coming across from Cassandra? A 20x performance improvement would be nice to have.

Thanks,
-Tony

From: Tony Anecito <adanec...@yahoo.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Sent: Friday, June 21, 2013 8:56 PM
Subject: Re: Cassandra driver performance question...

Thanks Jabbar,

I ran nodetool as suggested and it showed 0 latency for the row count I have. I also ran the CLI list command for the table hit by my JDBC PreparedStatement. It was slow: about 121 msecs the first time I ran it and 40 msecs the second time. The JDBC call takes 38 msecs to start with, but if I run it twice, executeQuery takes 1.5-2.5 msecs the second time the PreparedStatement is called.

I ran describe from the CLI for the table and it said caching is "ALL", which is correct. A real mystery, and I need to understand better what is going on.

Regards,
-Tony

From: Jabbar Azam <aja...@gmail.com>
To: user@cassandra.apache.org; Tony Anecito <adanec...@yahoo.com>
Sent: Friday, June 21, 2013 3:32 PM
Subject: Re: Cassandra driver performance question...

Hello Tony,

I would guess that the first query's data is put into the row cache and the filesystem cache. The second query gets the data from the row cache and/or the filesystem cache, so it'll be faster.
If you want to make it consistently faster, having a key cache will definitely help.

The following advice from Aaron Morton will also help:

"You can also see what it looks like from the server side. nodetool proxyhistograms will show you full request latency recorded by the coordinator. nodetool cfhistograms will show you the local read latency, this is just the time it takes to read data on a replica and does not include network or wait times. If the proxyhistograms is showing most requests running faster than your app says it's your app."

http://mail-archives.apache.org/mod_mbox/cassandra-user/201301.mbox/%3ce3741956-c47c-4b43-ad99-dad8afc3a...@thelastpickle.com%3E

Thanks
Jabbar Azam

On 21 June 2013 21:29, Tony Anecito <adanec...@yahoo.com> wrote:

> Hi All,
> I am using the JDBC driver and noticed that if I run the same query twice,
> the second time it is much faster.
> I set up the row cache and column family cache and it does not seem to make
> a difference.
>
> I am wondering how to set up Cassandra such that the first query is always
> as fast as the second one. The second one was 1.8 msec and the first 28 msec
> for the exact same parameters. I am using PreparedStatement.
>
> Thanks!
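The first-call versus second-call gap discussed in this thread can be measured on the client side with a small harness like the one below, and the numbers compared against what nodetool proxyhistograms reports. This is only a sketch under stated assumptions: `ColdWarmTimer` is a hypothetical helper, not part of any driver, and the task passed in is assumed to wrap something like `stmt.executeQuery()` on an existing PreparedStatement.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: times repeated calls so the first (cold) run can be compared with later ones. */
final class ColdWarmTimer {

    /** Runs the task `runs` times and returns each run's elapsed time in milliseconds. */
    static double[] time(Callable<?> task, int runs) {
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            try {
                task.call();
            } catch (Exception e) {
                // Surface checked exceptions (e.g. SQLException) as unchecked.
                throw new RuntimeException(e);
            }
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return millis;
    }

    /** Mean of every run except the first, i.e. the "warm" latency. */
    static double warmMean(double[] millis) {
        if (millis.length < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        double sum = 0;
        for (int i = 1; i < millis.length; i++) {
            sum += millis[i];
        }
        return sum / (millis.length - 1);
    }
}
```

With a real statement this would be called as `double[] t = ColdWarmTimer.time(() -> stmt.executeQuery(), 10);`, then `t[0]` compared with `ColdWarmTimer.warmMean(t)`. If the coordinator-side latency is low for all runs while `t[0]` is high, the extra time is being spent on the client or driver side rather than in Cassandra.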