I'm confused: there is no flush_largest_memtables_at property in C* 2.0?
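Johan's "3x" expectation in the quoted thread below can be checked with quick back-of-the-envelope arithmetic. A minimal sketch; the constants come from the figures quoted in the thread (10GB heap, C* 2.0's 1/4 default, flush_largest_memtables_at = 0.75), not from Cassandra's actual accounting:

```python
# Figures from the thread: 10GB heap, C* 2.0 defaults.
HEAP_GB = 10.0
DEFAULT_MEMTABLE_FRACTION = 0.25   # C* 2.0 default: 1/4 of heap for memtables
FLUSH_LARGEST_AT = 0.75            # flush_largest_memtables_at (a 1.2.x setting)

default_limit_gb = HEAP_GB * DEFAULT_MEMTABLE_FRACTION  # default memtable limit
emergency_flush_gb = HEAP_GB * FLUSH_LARGEST_AT         # expected emergency flush point

# The "3x": the emergency flush point is three times the default limit.
print(default_limit_gb, emergency_flush_gb, emergency_flush_gb / default_limit_gb)
# 2.5 7.5 3.0
```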
On 4 June 2014 12:55, Idrén, Johan <johan.id...@dice.se> wrote:

> Ok, so the overhead is a constant modifier, right.
>
> The 3x I arrived at with the following assumptions:
>
> heap is 10GB
> default memory for memtable usage is 1/4 of the heap in C* 2.0
> max memory used for memtables is 2.5GB (10/4)
>
> flush_largest_memtables_at is 0.75
> flush the largest memtables when memtables use 7.5GB (3/4 of the heap, 3x the default)
>
> With an overhead of 10x, it makes sense that my memtable is flushed when
> the JMX data says it is at ~250MB, i.e. 2.5GB, i.e. 1/4 of the heap.
>
> After I've set memtable_total_space_in_mb to a value larger than 7.5GB,
> it should still not go over 7.5GB on account of flush_largest_memtables_at
> (3/4 of the heap).
>
> So I would expect to see memtables flushed to disk when they're reportedly
> at around 750MB.
>
> With memtable_total_space_in_mb set to 20480, memtables are flushed at a
> reported value of ~2GB.
>
> With a constant overhead, this would mean that it used 20GB, which is 2x
> the size of the heap, instead of 3/4 of the heap as it should be if
> flush_largest_memtables_at were being respected.
>
> This shouldn't be possible.
>
> ------------------------------
> *From:* Benedict Elliott Smith <belliottsm...@datastax.com>
> *Sent:* Wednesday, June 4, 2014 1:19 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: memtable mem usage off by 10?
>
> Unfortunately it looks like the heap utilisation of memtables was not
> exposed in earlier versions, because they only maintained an estimate.
>
> The overhead scales linearly with the amount of data in your memtables
> (assuming the size of each cell is approx. constant).
>
> flush_largest_memtables_at is a setting independent of
> memtable_total_space_in_mb, and generally has little effect. Ordinarily
> sstable flushes are triggered by hitting the memtable_total_space_in_mb
> limit. I'm afraid I don't follow where your 3x comes from?
> On 4 June 2014 12:04, Idrén, Johan <johan.id...@dice.se> wrote:
>
>> Aha, ok. Thanks.
>>
>> Trying to understand what my cluster is doing:
>>
>> cassandra.db.memtable_data_size only gets me the actual data, not the
>> memtable heap memory usage. Is there a way to check for heap memory usage?
>>
>> I would expect to hit the flush_largest_memtables_at value, and this
>> would be what causes the memtable flush to sstable then? By default 0.75?
>>
>> Then I would expect the maximum amount of memory used to be ~3x what I
>> was seeing when I hadn't set memtable_total_space_in_mb (1/4 by default,
>> max 3/4 before a flush), instead of close to 10x (250MB vs 2GB).
>>
>> This is of course assuming that the overhead scales linearly with the
>> amount of data in my table; we're using one table with three cells in
>> this case. If it hardly increases at all, then I'll give up, I guess :)
>> At least until 2.1.0 comes out and I can compare.
>>
>> BR
>> Johan
>>
>> ------------------------------
>> *From:* Benedict Elliott Smith <belliottsm...@datastax.com>
>> *Sent:* Wednesday, June 4, 2014 12:33 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: memtable mem usage off by 10?
>>
>> These measurements tell you the amount of user data stored in the
>> memtables, not the amount of heap used to store it, so the same applies.
>>
>> On 4 June 2014 11:04, Idrén, Johan <johan.id...@dice.se> wrote:
>>
>>> I'm not measuring memtable size by looking at the sstables on disk,
>>> no. I'm looking through the JMX data, so I would believe (or hope) that
>>> I'm getting relevant data.
>>>
>>> If I have a heap of 10GB and set the memtable usage to 20GB, I would
>>> expect to hit other problems, but I'm not seeing memory usage over 10GB
>>> for the heap, and the machine (which has ~30GB of memory) is showing
>>> ~10GB free, with ~12GB used by Cassandra and the rest in caches.
>>> Reading 8k rows/s, writing 2k rows/s on a 3-node cluster, so it's not
>>> idling.
>>>
>>> BR
>>> Johan
>>>
>>> ------------------------------
>>> *From:* Benedict Elliott Smith <belliottsm...@datastax.com>
>>> *Sent:* Wednesday, June 4, 2014 11:56 AM
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: memtable mem usage off by 10?
>>>
>>> If you are storing small values in your columns, the object overhead
>>> is very substantial: what is 400MB on disk may well be 4GB in memtables,
>>> so if you are measuring the memtable size by the resulting sstable size,
>>> you are not getting an accurate picture. This overhead has been reduced
>>> by about 90% in the upcoming 2.1 release, through tickets CASSANDRA-6271
>>> <https://issues.apache.org/jira/browse/CASSANDRA-6271>, CASSANDRA-6689
>>> <https://issues.apache.org/jira/browse/CASSANDRA-6689> and CASSANDRA-6694
>>> <https://issues.apache.org/jira/browse/CASSANDRA-6694>.
>>>
>>> On 4 June 2014 10:49, Idrén, Johan <johan.id...@dice.se> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm seeing some strange behaviour of the memtables, in both 1.2.13 and
>>>> 2.0.7: basically it looks like they're using 10x less memory than they
>>>> should, based on the documentation and options.
>>>>
>>>> 10GB heap for both clusters.
>>>> 1.2.x should use 1/3 of the heap for memtables, but it uses max ~300MB
>>>> before flushing.
>>>> 2.0.7: same, but 1/4 and ~250MB.
>>>>
>>>> In the 2.0.7 cluster I set memtable_total_space_in_mb to 4096, which
>>>> then allowed Cassandra to use up to ~400MB for memtables...
>>>>
>>>> I'm now running with 20480 for memtable_total_space_in_mb, and
>>>> Cassandra is using ~2GB for memtables.
>>>>
>>>> So, off by 10 somewhere? Has anyone else seen this? I can't find a
>>>> JIRA for any bug connected to this.
>>>>
>>>> java 1.7.0_55, JNA 4.1.0 (for the 2.0 cluster)
>>>>
>>>> BR
>>>> Johan
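One consistent reading of the numbers in this thread: flushes trigger when data plus object overhead hits the memtable_total_space_in_mb limit, so with a roughly 10x overhead for small cells, the JMX-reported data size at flush time is about a tenth of the configured limit. A minimal sketch of that reading; the 10x factor is an estimate taken from this thread (not a documented constant), and the helper name is hypothetical:

```python
# Assumed ~10x on-heap object overhead for small cells (pre-2.1 estimate
# taken from this thread, not a documented Cassandra constant).
OVERHEAD_FACTOR = 10.0

def reported_data_mb_at_flush(memtable_total_space_mb: float) -> float:
    """Hypothetical helper: user data (as reported over JMX) at the moment
    the data-plus-overhead footprint hits memtable_total_space_in_mb."""
    return memtable_total_space_mb / OVERHEAD_FACTOR

# Matches the observations in the thread:
print(reported_data_mb_at_flush(2560))    # 256.0  -> ~250MB with the 2.5GB default
print(reported_data_mb_at_flush(20480))   # 2048.0 -> ~2GB with the 20480MB setting
```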