google ;-)

On Aug 8, 2014, at 7:33 PM, Kevin Burton <bur...@spinn3r.com> wrote:

> hm.. as a side note, it's amazing how much Cassandra information is locked up 
> in JIRAs… I wonder if there's a way to automatically identify the JIRAs with 
> important information.
> 
> 
> On Fri, Aug 8, 2014 at 5:14 PM, graham sanderson <gra...@vast.com> wrote:
> See https://issues.apache.org/jira/browse/CASSANDRA-5935
> 
> 2.1 has a radically different implementation that sidesteps this (with 
> off-heap memtables), but if you really want lots of tables now you can do so 
> as a trade-off against GC behavior.
> 
> The problem is not SSTables per se, but more potentially one memtable per CF 
> (and with the slab allocator that can/does cost 1MB); I am not familiar 
> enough with the code to know when you would have 1 memtable vs 0 memtables 
> for a CF that isn’t currently actively used.
> 
> Note also https://issues.apache.org/jira/browse/CASSANDRA-6602 and friends; 
> there is definitely a need for efficient discarding of old data in event 
> streams.
> 
> 
> On Aug 8, 2014, at 2:29 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> 
>> The "conventional wisdom" says that it's ideal to keep the number of tables 
>> in Cassandra "in the low hundreds", as each table can use 1MB or so of heap. 
>> So if you have 1000 tables you'd have 1GB of heap used (which is no fun).
>> 
>> But is this an issue with the tables themselves or the SSTables?
>> 
>> I think the root of this is the SSTables as all the arena overhead will be 
>> for the SSTables too and more SSTables means more overhead.
>> 
>> So by adding more tables, you end up with more SSTables which means more 
>> heap memory.
>> 
>> If I'm correct then this means that Cassandra could benefit from table 
>> partitioning, whereby you put all values in a specific region into a 
>> specific set of tables.
>> 
>> So if you were storing log data, you could store it in hourly, or daily 
>> partitions, but view the table as one logical unit.
>> 
>> The benefit here is that you could easily just drop the oldest data.  So if 
>> you need to clean up data, you wouldn't have to drop the whole table, just a 
>> day's worth of the data. 
>> 
>> And since that day is just one SSTable on disk, the drop would be easy.. no 
>> tombstones, just delete the whole SSTable.
>> 
>> 
>> 
>> -- 
>> 
>> Founder/CEO Spinn3r.com
>> Location: San Francisco, CA
>> blog: http://burtonator.wordpress.com
>> … or check out my Google+ profile
>> 
>> 
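For anyone wanting to try the daily-partition idea from the original question with today's tooling, here is a minimal sketch of the bookkeeping only. It assumes one physical table per day named like logs_20140808; that naming scheme, the helper names, and the retention value are all hypothetical, and nothing here talks to a cluster. Keep in mind the point made earlier in the thread: each extra table can carry its own memtable overhead (roughly 1MB), so long retention windows work against the "low hundreds" guideline.

```python
from datetime import date, datetime, timedelta

# Hypothetical naming scheme: <base>_<YYYYMMDD>, one table per day.
def partition_name(base, day):
    """Return the per-day table name for a logical table."""
    return f"{base}_{day:%Y%m%d}"

def expired_partitions(existing, base, today, retention_days):
    """Return the per-day tables older than the retention window, oldest first."""
    cutoff = today - timedelta(days=retention_days)
    prefix = base + "_"
    expired = []
    for name in existing:
        if not name.startswith(prefix):
            continue  # belongs to a different logical table
        try:
            day = datetime.strptime(name[len(prefix):], "%Y%m%d").date()
        except ValueError:
            continue  # suffix is not a date, so not one of our partitions
        if day < cutoff:
            expired.append(name)
    return sorted(expired)

def drop_statements(tables):
    """Build CQL DROP TABLE statements; executing them is left to the caller."""
    return [f"DROP TABLE {name};" for name in tables]
```

Dropping a whole table this way avoids tombstones, which is the benefit described above; the cost is that reads spanning several days have to query several tables and merge the results on the client side.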
