Well, the problem still persists. After giving Cassandra 12G of heap and having a look at the table, I saw that caching='KEYS_ONLY'. I did not find how to disable caching for rows (I'm not sure if setting it to 0 will disable it).
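If I read the pre-2.1 string form of the caching option right, 'KEYS_ONLY' should already mean only the key cache is used and the row cache is off for this table, and setting row_cache_size_in_mb to 0 in cassandra.yaml should disable the row cache globally. A minimal cqlsh sketch (the keyspace/table name below is just a placeholder):

-- placeholder keyspace/table name; adjust to the real one
ALTER TABLE mykeyspace.mytable WITH caching = 'KEYS_ONLY';  -- key cache only, row cache off
ALTER TABLE mykeyspace.mytable WITH caching = 'NONE';       -- neither key cache nor row cache

-- check the current setting from cqlsh
DESCRIBE TABLE mykeyspace.mytable;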
On Friday, March 14, 2014 3:14 PM, Andras Szerdahelyi <andras.szerdahe...@ignitionone.com> wrote:

Is row cache enabled on this CF? Try disabling it. Seems like you might have a very wide row there. Can you grep for GCInspector in your Cassandra log?

24G might be a bit too much for the Cassandra JVM, bogging down GC and not leaving much to the page cache (32G - 24G - Tomcat). I don't quite understand your reasoning here:

> (I know that there is a lot of heap but I also have write heavy tasks and I want them to get into mem fast)

So I would try with the default cassandra-env.sh JVM params too.

From: Batranut Bogdan <batra...@yahoo.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Batranut Bogdan <batra...@yahoo.com>
Date: Friday 14 March 2014 13:50
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Cassandra slow on some reads

Hello all,

Here is the environment:

I have a 6 node Cassandra cluster. On each node I have:
- 32 G RAM
- 24 G RAM for Cassandra
- ~150 - 200 MB/s disk speed
- Tomcat 6 with an Axis2 web service that uses the DataStax Java driver to make async reads / writes
- replication factor for the keyspace is 3

(I know that there is a lot of heap, but I also have write-heavy tasks and I want them to get into mem fast.)

All nodes are in the same data center. The clients that read / write are in the same data center, so the network is Gigabit.

The table structure is like this (see the CQL sketch at the end of this message): PK(key string, timestamp int, column1 string, col2 string), list1, list2, list3. There are about 300 million individual keys. There are about 100 timestamps for each key now, so the rows will get wider as time passes.

I am using the DataStax Java driver to query the cluster. I have ~450 queries that are like this:

SELECT * FROM table where key = 'some string' and ts = some value;

where some value is close to present time.

The problem: about 10 - 20% of these queries take more than 5 seconds to execute; in fact, the majority of those take around 10 seconds. When investigating, I saw that if I have a slow response and I redo the query, it finishes in 8 - 10 MILLIseconds like the rest of the queries that I have. Using JConsole, I could not see any spikes in CPU / memory when executing the queries. The rise in resource consumption is very small on all nodes in the cluster. I expect such delays to be generated by a BIG increase in resource consumption.

Any comments will be appreciated.

Thank you.
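For reference, a rough CQL sketch of the schema and query described above. The keyspace, table, and column names and the list element types are assumptions, not the real definitions; ts is used for the timestamp column to match the query.

-- assumed names and list element types; only the shape matches the description above
CREATE TABLE mykeyspace.mytable (
    key text,
    ts int,
    column1 text,
    col2 text,
    list1 list<text>,
    list2 list<text>,
    list3 list<text>,
    PRIMARY KEY ((key), ts, column1, col2)
);

-- in cqlsh, tracing one of the slow reads can show where the time is spent
TRACING ON;
SELECT * FROM mykeyspace.mytable WHERE key = 'some string' AND ts = 12345;  -- 12345 is a placeholder timestamp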