On 19.3.2012 20:28, i...@4friends.od.ua wrote:
Hello
Data size should decrease during minor compactions. Check the logs for
compaction results.
They do, but not as much as I expect. Look at the sizes and file dates:
-rw-r--r-- 1 root wheel 5.4G Feb 23 17:03 resultcache-hc-27045-Data.db
-rw-r--r-- 1 root wheel 6.4G Feb 23 17:11 resultcache-hc-27047-Data.db
-rw-r--r-- 1 root wheel 5.5G Feb 25 06:40 resultcache-hc-27167-Data.db
-rw-r--r-- 1 root wheel 2.2G Mar 2 05:03 resultcache-hc-27323-Data.db
-rw-r--r-- 1 root wheel 2.0G Mar 5 09:15 resultcache-hc-27542-Data.db
-rw-r--r-- 1 root wheel 2.2G Mar 12 23:24 resultcache-hc-27791-Data.db
-rw-r--r-- 1 root wheel 468M Mar 15 03:27 resultcache-hc-27822-Data.db
-rw-r--r-- 1 root wheel 483M Mar 16 05:23 resultcache-hc-27853-Data.db
-rw-r--r-- 1 root wheel 53M Mar 17 05:33 resultcache-hc-27901-Data.db
-rw-r--r-- 1 root wheel 485M Mar 17 09:37 resultcache-hc-27930-Data.db
-rw-r--r-- 1 root wheel 480M Mar 19 00:45 resultcache-hc-27961-Data.db
-rw-r--r-- 1 root wheel 95M Mar 19 09:35 resultcache-hc-27967-Data.db
-rw-r--r-- 1 root wheel 98M Mar 19 17:04 resultcache-hc-27973-Data.db
-rw-r--r-- 1 root wheel 19M Mar 19 18:23 resultcache-hc-27974-Data.db
-rw-r--r-- 1 root wheel 19M Mar 19 19:50 resultcache-hc-27975-Data.db
-rw-r--r-- 1 root wheel 19M Mar 19 21:17 resultcache-hc-27976-Data.db
-rw-r--r-- 1 root wheel 19M Mar 19 22:05 resultcache-hc-27977-Data.db
I insert everything with a 7-day TTL plus 10 days of tombstone expiration.
This means that, in the ideal case, there should be nothing older than Mar 2.
The three ~5 GB files are still waiting to be compacted. Because they contain
only tombstones, Cassandra could optimize this case: mark the sstable as
tombstone-only, remember the time of its latest tombstone, and delete the
entire sstable without needing to merge it first.
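
To make the date arithmetic explicit, here is a minimal Python sketch. It assumes the listing was taken on Mar 19, 2012 (the year is not shown by ls) and that a file's modification date is an upper bound on the write time of any data in it. The function fully_droppable is a hypothetical illustration of the optimization proposed above, not an existing Cassandra feature.

```python
from datetime import date, timedelta

TTL = timedelta(days=7)         # TTL used for every insert (from the message)
GC_GRACE = timedelta(days=10)   # "tombstone expiration" (from the message)
NOW = date(2012, 3, 19)         # date of the listing; the year is assumed

# (modification date, file name) for the five oldest files in the listing
sstables = [
    (date(2012, 2, 23), "resultcache-hc-27045-Data.db"),
    (date(2012, 2, 23), "resultcache-hc-27047-Data.db"),
    (date(2012, 2, 25), "resultcache-hc-27167-Data.db"),
    (date(2012, 3, 2), "resultcache-hc-27323-Data.db"),
    (date(2012, 3, 5), "resultcache-hc-27542-Data.db"),
]

# Anything written before this date has expired AND outlived its tombstone.
cutoff = NOW - (TTL + GC_GRACE)


def fully_droppable(written_on, cutoff=cutoff):
    """Hypothetical check: the whole file could be dropped without a merge
    if even its newest data is older than TTL + tombstone expiration."""
    return written_on < cutoff


droppable = [name for written, name in sstables if fully_droppable(written)]
print(cutoff)
print(droppable)
```

Under these assumptions the cutoff is Mar 2, and only the three Feb files qualify, which are exactly the three ~5 GB files.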
1. The question is why a tombstone is created after a row expires at all,
since the row will expire on all cluster nodes at the same time without
needing to be deleted.
2. It is a super column family. When I dump the oldest sstable, I wonder why
it looks like this:
{
"7777772c61727469636c65736f61702e636f6d": {},
"7175616b652d34": {"1": {"deletedAt": -9223372036854775808,
"subColumns": [["crc32","4f34455c",1328220892597002,"d"],
["id","4f34455c",1328220892597000,"d"],
["name","4f34455c",1328220892597001,"d"],
["size","4f34455c",1328220892597003,"d"]]}, "2": {"deletedAt":
-9223372036854775808, "subColumns":
[["crc32","4f34455c",1328220892597007,"d"],
["id","4f34455c",1328220892597005,"d"],
["name","4f34455c",1328220892597006,"d"],
["size","4f34455c",1328220892597008,"d"]]}, "3": {"deletedAt":
-9223372036854775808, "subColumns":
* All subcolumns are deleted. Why keep their names in the table? Isn't
marking the column as deleted enough, i.e. just
"1": {"deletedAt": -9223372036854775808}?
* Another question is why the entire row was not tombstoned, since all of
its members had expired.
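
For anyone reading the dump, a small Python sketch that decodes its fields. It assumes that row keys are hex-encoded ASCII, that the third field of each subcolumn is a write timestamp in microseconds, and that the second field of a deleted ("d") subcolumn is a hex-encoded deletion time in seconds since the epoch. These are assumptions about the dump format, and the helper names are made up for illustration.

```python
from datetime import datetime, timezone


def decode_key(hex_key):
    """Decode a hex-encoded row key, assuming it holds ASCII text."""
    return bytes.fromhex(hex_key).decode("ascii")


def micros_to_utc(ts_micros):
    """Convert a microsecond timestamp to a UTC datetime (whole seconds)."""
    return datetime.fromtimestamp(ts_micros // 1_000_000, tz=timezone.utc)


def hex_seconds_to_utc(hex_value):
    """Interpret a hex string as seconds since the epoch (assumed format)."""
    return datetime.fromtimestamp(int(hex_value, 16), tz=timezone.utc)


# Values copied from the dump above
row_key = decode_key("7175616b652d34")
written = micros_to_utc(1328220892597002)
deleted = hex_seconds_to_utc("4f34455c")

# The supercolumn-level deletedAt is the minimum signed 64-bit value,
# which presumably means no supercolumn-level deletion was recorded.
NO_DELETION = -(2 ** 63)

print(row_key)
print(written, deleted, deleted - written)
print(NO_DELETION == -9223372036854775808)
```

With these values the row key decodes to quake-4, and the gap between the write time and the assumed deletion time is exactly 7 days, which matches the TTL: the tombstones appear to be the expired columns themselves.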