I wonder why the memtable estimations are so bad.
1. Is it not possible to run them more often? There should be some lower
limit - run the live/serialized calculation at least once per hour; it
takes only a few seconds.
2. Why not use the data from the FlushWriter to update the estimations?
The flusher knows the number of ops and the serialized size after the
sstable is written to disk, so these values could be fed back to update
the memtable's live/serialized ratio.
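The feedback loop suggested in (2) could look roughly like the sketch below. This is only an illustration of the idea: the class name `FlushFeedback`, its methods, and the initial ratio are made up here, not Cassandra's actual API.

```java
// Hypothetical sketch: after each flush, fold the measured serialized
// size back into the live/serialized ratio estimate instead of waiting
// for the next periodic recalculation.
public class FlushFeedback {
    // Exponentially-weighted moving average of the live/serialized ratio.
    private volatile double liveToSerializedRatio = 10.0; // initial guess
    private static final double ALPHA = 0.25; // weight of the newest sample

    /** Called by the flusher once the sstable is on disk. */
    public void onFlushCompleted(long measuredSerializedBytes, long liveBytesAtFlush) {
        if (measuredSerializedBytes <= 0)
            return;
        double observed = (double) liveBytesAtFlush / measuredSerializedBytes;
        liveToSerializedRatio = ALPHA * observed + (1 - ALPHA) * liveToSerializedRatio;
    }

    /** Estimate live size for a memtable from its serialized-size counter. */
    public long estimateLiveBytes(long serializedBytes) {
        return (long) (serializedBytes * liveToSerializedRatio);
    }
}
```

With a scheme like this, every completed flush would pull the estimate toward what was actually observed on disk, instead of the ratio going stale between the expensive full recalculations.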
INFO [OptionalTasks:1] 2012-03-23 09:33:51,765 MeteredFlusher.java
(line 62) flushing high-traffic column family CFS(Keyspace='whois',
ColumnFamily='ipbans') (estimated 105363280 bytes)
INFO [OptionalTasks:1] 2012-03-23 09:33:51,796 ColumnFamilyStore.java
(line 704) Enqueuing flush of
Memtable-ipbans@481336682(1317041/105363280 serialized/live bytes, 16755
ops)
** Note here that the serialized/live sizes are ESTIMATES!! **
INFO [FlushWriter:314] 2012-03-23 09:33:51,796 Memtable.java (line
246) Writing Memtable-ipbans@481336682(1317041/105363280 serialized/live
bytes, 16755 ops)
INFO [FlushWriter:314] 2012-03-23 09:33:51,799 Memtable.java (line
283) Completed flushing
/var/lib/cassandra/data/whois/ipbans-hc-16775-Data.db (1355 bytes)
Note the discrepancy: the memtable was estimated at 1317041 serialized bytes, but the sstable actually written is only 1355 bytes.