Hi Aaron,
Thanks for the reply. I did some more tests and it looks like the problem is
not in deletes/writes but rather in reads (I do a read before deleting).
It turns out that the problem was in another CF, which had a wide row of 1.2GB
and the row cache enabled. Cassandra tries to read this row into the cache and
becomes unresponsive. Disabling the row cache on this CF allowed me to read
through this row and perform the cleanup. It seems that Cassandra reads all
columns into the cache, even those which were deleted (w/ tombstones) but not
yet GCed.
It seems that CASSANDRA-2864
<https://issues.apache.org/jira/browse/CASSANDRA-2864> and
CASSANDRA-1956 <https://issues.apache.org/jira/browse/CASSANDRA-1956>
were opened to address this problem.
Best,
Rustam.
On 04/06/2012 19:41, aaron morton wrote:
A delete is a no-look write operation, like normal writes, so it should not be
directly causing a lot of memory allocation.
It may be causing a lot of compaction activity, which, due to the wide row, may
be generating a lot of GC.
Try the following to get through the deletions:
* disable compaction by setting min_compaction_threshold and max_compaction_threshold
to 0 (via nodetool on current versions)
Once you have finished the deletions:
* lower in_memory_compaction_limit_in_mb in the yaml
* set concurrent_compactors to 2 in the yaml
* enable compaction again
Once everything has settled down, restore in_memory_compaction_limit_in_mb and
concurrent_compactors.
Hope that helps.
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 2/06/2012, at 7:53 AM, Rustam Aliyev wrote:
Hi all,
I have an SCF with ~250K rows. One of these rows is relatively large: it's a wide
row (according to the compaction logs) containing ~100,000 super columns with an
overall size of 1GB. Each super column has an average size of 10K and ~10 sub
columns.
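As a quick sanity check on those figures, here is a rough back-of-the-envelope calculation. It assumes that "10K" means 10 KiB per super column and that the averages hold across the whole row; the variable names are only illustrative.

```python
# Approximate figures quoted above (assumption: "10K" = 10 KiB).
super_columns = 100_000
avg_super_column_bytes = 10 * 1024
sub_columns_per_super = 10

# Estimated total row size in bytes and in GiB.
row_bytes = super_columns * avg_super_column_bytes
row_gib = row_bytes / 1024**3

# Estimated number of sub columns in the row, and how many a ~90% delete touches.
total_sub_columns = super_columns * sub_columns_per_super
deleted_sub_columns = int(total_sub_columns * 0.9)

print(f"row size: {row_gib:.2f} GiB")
print(f"sub columns: {total_sub_columns}, deleted at 90%: {deleted_sub_columns}")
```

So the averages are consistent with the ~1GB reported in the compaction logs, and deleting ~90% of the row means on the order of 900,000 sub columns.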
When I try to delete ~90% of the columns in this particular row, the Cassandra
nodes which own this wide row (3 of 5, RF=3) quickly run out of heap space.
See logs from one of the hosts here:
http://pastebin.com/raw.php?i=kwn7b3rP
After that, all 3 nodes start flapping up/down and GC messages (like the one at the
bottom of the pastebin above) appear in the logs. Cassandra never recovers from this
mode and the only way out is to "kill -9" and start again. On IRC it was
suggested that it enters a GC death spiral.
I tried to throttle delete requests on the client side, sending a batch of 100
delete requests every 500ms, so no more than 200 deletes/sec. But it didn't
help. I can reduce it further to 100/sec, but I don't think it will help much.
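For reference, the client-side throttling described above looks roughly like the sketch below. This is a minimal illustration, not the actual client code: `send_batch` is a hypothetical callable standing in for whatever the Cassandra client uses to submit a batch of deletes, and `throttled_delete`, `chunked` and `max_rate` are names made up for this example.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def throttled_delete(
    columns: Sequence[str],
    send_batch: Callable[[List[str]], None],  # hypothetical: submits one batch of deletes
    batch_size: int = 100,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send deletes in fixed-size batches, pausing after each batch.

    Returns the number of batches sent.
    """
    batches = 0
    for batch in chunked(columns, batch_size):
        send_batch(batch)
        batches += 1
        sleep(interval_s)  # caps the rate at batch_size / interval_s
    return batches


def max_rate(batch_size: int, interval_s: float) -> float:
    """Upper bound on deletes per second for the given batch settings."""
    return batch_size / interval_s
```

With the defaults (100 deletes every 0.5 s) the upper bound is 200 deletes/sec; halving `batch_size` or doubling `interval_s` gives the 100/sec variant.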
I delete millions of columns from other rows in this SCF at the same rate and
have never hit this problem. It only happens when I try to delete from this
particular wide row.
So right now I don't know how I can delete these columns. Any ideas?
Many thanks,
Rustam.