Tyler, thanks for the explanation!

So a commit log segment can contain data from both the flushed table A and the non-flushed table B. How is it replayed on startup? Does C* skip the portions belonging to table A that were already written to an SSTable?

Regards, Vlad
On Tuesday, March 1, 2016 11:37 PM, Tyler Hobbs <ty...@datastax.com> wrote:

> On Tue, Mar 1, 2016 at 6:13 AM, Vlad <qa23d-...@yahoo.com> wrote:
>
>> So the commit log can't hold more than the memtable size; why is there a difference between the commit log and memtable sizes?
>
> In order to purge a commitlog segment, all memtables that contain data from that segment must be flushed to disk.
>
> Suppose you have two tables:
>
> - table A has extremely high throughput
> - table B has low throughput
>
> Every commitlog segment will have a mixture of writes for table A and table B. The memtable for table A will fill up rapidly and will be flushed frequently. The memtable for table B will fill up slowly and will not be flushed often. Since table B's memtable isn't flushed, none of the commit log segments can be purged/recycled. Once the commitlog hits its size limit, it will force a flush of table B.
>
> This behavior is good, because it allows table B to be flushed in large chunks instead of hundreds of tiny sstables. If the commitlog space were equal to the memtable space, Cassandra would have to force a flush of table B's memtable approximately every time table A is flushed, even though table B's memtable is much smaller.
>
> To summarize: if you use more than one table, it makes sense to have a larger space for commitlog segments.
>
> --
> Tyler Hobbs
> DataStax
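
The purge rule Tyler describes can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not Cassandra's actual implementation: the names `Segment`, `ToyCommitLog`, and `simulate` are hypothetical, the byte sizes are arbitrary, and a flush is modeled simply as marking the table clean in every segment. Each round writes 90 bytes to table A and 10 bytes to table B, then flushes A's memtable.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One commitlog segment and the tables that still have unflushed data in it."""
    segment_id: int
    size: int = 0
    dirty_tables: set = field(default_factory=set)


class ToyCommitLog:
    """Hypothetical toy model of segment purging; not Cassandra's real code."""

    def __init__(self, segment_size, total_limit):
        self.segment_size = segment_size
        self.total_limit = total_limit
        self.segments = [Segment(0)]
        self.forced_flushes = []  # tables flushed only because the log was full

    def total_size(self):
        return sum(s.size for s in self.segments)

    def _purge_clean(self):
        # A segment can be dropped once no table has unflushed data in it.
        # The active (last) segment is always kept.
        active = self.segments[-1]
        self.segments = [s for s in self.segments if s.dirty_tables or s is active]

    def write(self, table, nbytes):
        active = self.segments[-1]
        if active.size + nbytes > self.segment_size:
            # Roll over to a new active segment.
            active = Segment(active.segment_id + 1)
            self.segments.append(active)
            self._purge_clean()
        active.size += nbytes
        active.dirty_tables.add(table)
        # Over the limit: force-flush whichever tables pin the oldest segment.
        while self.total_size() > self.total_limit and len(self.segments) > 1:
            for pinned in sorted(self.segments[0].dirty_tables):
                self.forced_flushes.append(pinned)
                self.flush(pinned)

    def flush(self, table):
        # Assumption: a memtable flush makes all of the table's data durable,
        # so the table no longer pins any segment.
        for s in self.segments:
            s.dirty_tables.discard(table)
        self._purge_clean()


def simulate(total_limit, rounds, segment_size=100):
    log = ToyCommitLog(segment_size=segment_size, total_limit=total_limit)
    for _ in range(rounds):
        log.write("A", 90)  # high-throughput table
        log.write("B", 10)  # low-throughput table
        log.flush("A")      # A's memtable fills up and is flushed every round
    return log
```

In this model, `simulate(total_limit=400, rounds=5)` forces a single flush of table B, after which the four old segments are purged at once. With `simulate(total_limit=100, rounds=5)`, where the log holds only about one segment, table B is force-flushed in four of the five rounds, which mirrors the "hundreds of tiny sstables" case described above.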