Hi Bryan,
Thanks for your quick response. We have already tuned our memory and GC settings based on our hardware specification, and everything was working fine until yesterday, i.e. before we ran the delete requests described below. As you suggested, we will look into our GC and memory configuration once again. FYI: we are using memtable_allocation_type as offheap_objects.

Consider the following table:

CREATE TABLE EmployeeDetails (
    branch_id text,
    department_id text,
    emp_id bigint,
    emp_details text,
    PRIMARY KEY (branch_id, department_id, emp_id)
) WITH CLUSTERING ORDER BY (department_id ASC, emp_id ASC);

In this table I have 10 million records for a particular branch_id and department_id. The following are the operations I performed in C*, in chronological order:

1. Delete 5 million records, from the start, in batches of 500 records per request, for a particular branch_id (say 'xxx') and department_id (say 'yyy').

2. Read the next 500 records as soon as the above delete operation has completed:

   SELECT * FROM EmployeeDetails
    WHERE branch_id = 'xxx'
      AND department_id = 'yyy'
      AND emp_id > 50000000
    LIMIT 500;

It was only after executing the above read request that there was a spike in memory, and within a few minutes the node was marked down. So my question is: will the above read request load all 5 million deleted records into memory before it starts fetching, or will it jump directly to record 50000001 (since we have specified a greater-than condition)? If it is the former, then the read request will surely keep the data in main memory and perform a merge before delivering the results, as described in this wiki (https://wiki.apache.org/cassandra/ReadPathForUsers). If not, please let me know how the above read request will fetch the data.

Note: while analyzing my heap dump, it is clear that the majority of the memory is held by Tombstone objects.
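If it is useful for debugging, cqlsh request tracing should show how many tombstones this read actually scans. A minimal sketch against the table above, reusing the 'xxx'/'yyy' placeholders:

-- enable request tracing in cqlsh; for each read, the trace
-- reports how many live and tombstoned cells were scanned
TRACING ON;

SELECT * FROM EmployeeDetails
 WHERE branch_id = 'xxx'
   AND department_id = 'yyy'
   AND emp_id > 50000000
 LIMIT 500;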
Thanks in advance
-- karthick

---- On Mon, 03 Jul 2017 20:40:10 +0530 Bryan Cheng <br...@blockcypher.com> wrote ----

This is a very antagonistic use case for Cassandra :P I assume you're familiar with Cassandra and deletes? (e.g. http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html, http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_about_deletes_c.html)

That being said, are you giving your tables enough time to flush to disk? Deletes generate markers which can and will consume memory until they have a chance to be flushed, after which they will impact query time and performance (but should relieve memory pressure). If you're saturating the capability of your nodes, your tables will have difficulty flushing. See http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_memtable_thruput_c.html.

This could also be a heap/memory configuration issue or a GC tuning issue (although that's unlikely if you've left those at their defaults).

--Bryan

On Mon, Jul 3, 2017 at 7:51 AM, Karthick V <karthick...@zohocorp.com> wrote:

Hi,

Recently, in my test cluster, I faced outrageous GC activity which made the node unreachable inside the cluster itself.

Scenario: in a partition of 5 million rows, we read the first 500 (by giving the starting range) and then delete those same 500. The same has been done recursively by changing the start range alone. Initially I didn't see any difference in query performance (up to about 50,000 rows), but later I observed a significant increase in latency; when we reached about 3.3 million, the read request failed and the node went unreachable.

After analysing my GC logs, it is clear that 99% of my old-generation space is occupied and there is no more room for allocation, which caused the machine to stall. My doubt here is: will all 3.3 million deleted rows be loaded into my on-heap memory? If not, what are the objects occupying that memory?

PS: I am using C* 2.1.13 in the cluster.
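To make the loop above concrete, one iteration would look roughly like the following sketch (the emp_id values are hypothetical, and whether the batch is logged or unlogged is an assumption; each iteration reads a batch and then deletes the same rows):

-- read one batch of 500, starting after the last emp_id seen
SELECT emp_id FROM EmployeeDetails
 WHERE branch_id = 'xxx'
   AND department_id = 'yyy'
   AND emp_id > 0
 LIMIT 500;

-- then delete the same rows; each DELETE writes a tombstone
-- rather than reclaiming space immediately
BEGIN UNLOGGED BATCH
  DELETE FROM EmployeeDetails
   WHERE branch_id = 'xxx' AND department_id = 'yyy' AND emp_id = 1;
  DELETE FROM EmployeeDetails
   WHERE branch_id = 'xxx' AND department_id = 'yyy' AND emp_id = 2;
  -- ... repeated for the remaining 498 emp_ids from the read ...
APPLY BATCH;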