Tyler, thanks for the explanation!
So a commitlog segment can contain data from both the flushed table A and the
non-flushed table B. How is it replayed on startup? Does C* skip the portions
belonging to table A that were already written to an SSTable?
Regards, Vlad
On Tuesday, March 1, 2016 11:37 PM, Tyler Hobbs <[email protected]> wrote:
On Tue, Mar 1, 2016 at 6:13 AM, Vlad <[email protected]> wrote:
So the commit log can't hold more data than the memtables do. Why, then, is
there a difference between the commit log and memtable sizes?
In order to purge a commitlog segment, all memtables that contain data from
that segment must be flushed to disk.
Suppose you have two tables:
- table A has extremely high throughput
- table B has low throughput
Every commitlog segment will have a mixture of writes for table A and table B.
The memtable for table A will fill up rapidly and will be flushed frequently.
The memtable for table B will slowly fill up, and will not be flushed often.
Since table B's memtable isn't flushed, none of the commitlog segments can be
purged/recycled. Once the commitlog hits its size limit, it will force a flush
of table B.
This behavior is good, because it allows table B to be flushed in large chunks
instead of hundreds of tiny sstables. If the commitlog space were equal to the
memtable space, Cassandra would have to force a flush of table B's memtable
approximately every time table A is flushed, even though table B's memtable
would hold much less data.
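
To make the effect concrete, here is a toy Python model of the scenario above.
It is only a sketch under simplifying assumptions, not Cassandra's actual
logic: sizes are counted in writes rather than bytes, a flush is instantaneous,
and the names (simulate, segment_size, max_segments, memtable_limit, b_every)
are made up for illustration. It tracks which tables still have unflushed data
in each segment and records how large table B's memtable is each time it is
flushed.

```python
def simulate(total_writes, segment_size, max_segments, memtable_limit, b_every=100):
    """Toy model of the scenario above; returns the size of each table B flush.

    Sizes are counted in writes. Every b_every-th write goes to table B,
    the rest go to table A.
    """
    segments = [{"used": 0, "dirty": set()}]  # oldest first, last one is active
    memtable = {"A": 0, "B": 0}
    b_flush_sizes = []

    def drop_clean():
        # A non-active segment can go once no unflushed table has data in it.
        segments[:] = [s for s in segments[:-1] if s["dirty"]] + segments[-1:]

    def flush(table):
        if table == "B":
            b_flush_sizes.append(memtable["B"])
        memtable[table] = 0
        for seg in segments:
            seg["dirty"].discard(table)
        drop_clean()

    for i in range(total_writes):
        table = "B" if i % b_every == 0 else "A"
        if segments[-1]["used"] >= segment_size:
            segments.append({"used": 0, "dirty": set()})  # roll to a new segment
        segments[-1]["used"] += 1
        segments[-1]["dirty"].add(table)
        memtable[table] += 1
        if memtable[table] >= memtable_limit:
            flush(table)  # normal flush: the memtable is full
        while len(segments) > max_segments:
            # Commitlog over its limit: flush whatever pins the oldest segment.
            for t in sorted(segments[0]["dirty"]):
                flush(t)
            drop_clean()
    return b_flush_sizes


if __name__ == "__main__":
    # 10 segments: commitlog space roughly equal to one memtable.
    # 100 segments: commitlog space ten times larger.
    for max_segments in (10, 100):
        sizes = simulate(500_000, segment_size=1000,
                         max_segments=max_segments, memtable_limit=10_000)
        print(max_segments, "segments:", len(sizes),
              "flushes of table B, largest:", max(sizes))
```

In this model, the run with the smaller commitlog flushes table B many more
times, and in much smaller pieces, than the run with the larger commitlog.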
To summarize: if you use more than one table, it makes sense to have a larger
space for commitlog segments.
--
Tyler Hobbs
DataStax