So if all your patches succeed, the per-key memory overhead will decrease by 31 (= 16 + 11 + 4) bytes, am I right?
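Here's the arithmetic I'm doing, as a rough sketch (the byte counts come from your notes quoted below; the variable names are just illustrative):

    #include <stdio.h>

    /* Per-key overheads from the notes below, in bytes; the baseline
       sums to the 91 B/key figure mentioned earlier in the thread. */
    int main(void) {
        int malloc_khash = 32;  /* OS/malloc + khash, amortized @ 50M keys */
        int enif_alloc   = 16;  /* erlang allocator overhead */
        int nif_struct   = 22;  /* NIF C structure */
        int khash_ptr    = 8;   /* entry pointer stored in the khash */
        int kv_overhead  = 13;  /* naive bucket/key serialization */

        int baseline = malloc_khash + enif_alloc + nif_struct
                     + khash_ptr + kv_overhead;          /* 91 */

        /* Hoped-for savings: drop enif_alloc (16), trim the
           serialization overhead by 11, pack the int flag (4). */
        int savings = 16 + 11 + 4;                       /* 31 */

        printf("baseline %d B/key, after patches %d B/key\n",
               baseline, baseline - savings);            /* 91 -> 60 */
        return 0;
    }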
On 5 August 2013 16:38, Evan Vigil-McClanahan <emcclana...@basho.com> wrote:
> Before I'd done the research, I too thought that the overheads were
> much lower, near what the calculator said, or at least not too far off.
>
> There are a few things that I plan on addressing this release cycle:
> - 16b per-allocation overhead from using enif_alloc. This allows us
> a lot of flexibility about which allocator to use, but I suspect that
> since allocation speed isn't a big bitcask bottleneck, this overhead
> simply isn't worth it.
> - 13b per-value overhead from naive serialization of the bucket/key
> value. I have a branch that reduced this by 11 bytes.
> - 4b per-value overhead from a single-bit flag that is stored in an
> int. No patch for this thus far.
>
> Additionally, I've found that running with tcmalloc using LD_PRELOAD
> reduces the cost of bitcask's many allocations, but a) I've never
> done so in production and b) they say that it never releases memory,
> which is worrying, although the paging system theoretically should
> take care of it fairly easily as long as its page usage isn't insane.
>
> My original notes looked like this:
>
> 1) ~32 bytes for the OS/malloc + khash overhead @ 50M keys
> (amortized, so bigger for fewer keys, smaller for more keys)
> 2) + 16 bytes of erlang allocator overhead
> 3) + 22 bytes for the NIF C structure
> 4) + 8 bytes for the entry pointer stored in the khash
> 5) + 13 bytes of kv overhead
>
> tcmalloc does what it can for line 1.
> My patches do what I can for lines 2, 3, and 5.
>
> Line 4 isn't amenable to anything other than a change in the way the
> keydir is stored, which could also potentially help with line 1
> (fewer allocations, etc.). That, unfortunately, is not very likely
> to happen soon.
>
> So things will get better relatively soon, but there are some
> architectural limits that will be harder to address.
>
> On Mon, Aug 5, 2013 at 1:49 AM, Alexander Ilyin <alexan...@rutarget.ru> wrote:
> > Evan,
> >
> > The news about a per-key overhead of 91 bytes is quite frustrating.
> > When we were choosing a key-value store, per-key metadata size was
> > a crucial point for us. We have a simple use case but a lot of data
> > (hundreds of millions of items), so we were looking for ways to
> > reduce memory consumption. Here and here a value of 40 bytes is
> > stated. The 22 bytes in the RAM calculator seemed like a mistake,
> > because the following example obviously uses a value of 40.
> >
> > Anyway, thanks for your response.
> >
> >
> > On 4 August 2013 04:39, Evan Vigil-McClanahan <emcclana...@basho.com> wrote:
> >>
> >> Some responses inline.
> >>
> >> On Fri, Aug 2, 2013 at 3:11 AM, Alexander Ilyin <alexan...@rutarget.ru> wrote:
> >> > Hi,
> >> >
> >> > I have a few questions about Riak memory usage.
> >> > We're using Riak 1.3.1 on a 3-node cluster. According to the
> >> > bitcask capacity calculator
> >> > (http://docs.basho.com/riak/1.3.1/references/appendices/Bitcask-Capacity-Planning/)
> >> > Riak should use about 30GB of RAM for our data. Actually, it uses
> >> > about 45GB and I can't figure out why. I'm looking at the %MEM
> >> > column in top on each node for the beam.smp process.
> >>
> >> I've recently done some research on this and have filed bugs
> >> against the calculator; it's a bit wrong and has been that way for
> >> a while:
> >>
> >> https://github.com/basho/basho_docs/issues/467
> >>
> >> The numbers there look a bit closer to what you're seeing.
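(As an aside, for anyone re-running the math against the corrected numbers: here's the back-of-the-envelope estimate I used. It's a sketch only; the inputs are made-up, and the per-key formula is my own assumption, not an official one:)

    #include <stdio.h>

    /* Rough keydir RAM estimate using the ~91 B/key overhead discussed
       in this thread instead of the calculator's figure. All inputs
       below are made-up examples, not anyone's real workload. */
    int main(void) {
        double keys     = 500e6; /* unique keys across the cluster */
        double n_val    = 2;     /* replicas per key */
        double bucket   = 10;    /* avg bucket name length, bytes */
        double key      = 36;    /* avg key length, bytes */
        double overhead = 91;    /* per-key keydir overhead, bytes */

        double bytes = keys * n_val * (overhead + bucket + key);
        printf("keydir RAM: ~%.1f GiB cluster-wide\n",
               bytes / (1024.0 * 1024.0 * 1024.0));
        return 0;
    }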
> >>
> >> The good news is that I am looking into reducing memory
> >> consumption this development cycle, and our next release should
> >> see some improvements on that front. The bad news is that it may
> >> be a while. If you want to watch the bitcask repo on github to see
> >> when these changes go in, it's usually pretty easy to build a new
> >> bitcask and replace the one that you're running.
> >>
> >> > Disk usage is also about 1.5 times more than I expected (270GB
> >> > instead of 180GB). I rechecked that I have n_val=2 (not 3); it
> >> > seems alright. Why could this happen?
> >>
> >> There is definitely some overhead on the stored values, especially
> >> when you're using bitcask. How big are your values? Overheads, if
> >> I recall correctly, run to a few hundred bytes, but I'll have to
> >> ask some people to refresh my memory.
> >>
> >> > The second question is about performance degradation when Riak
> >> > uses almost all available memory on the node. We see that the
> >> > 95th/99th put percentiles are twice as large on nodes which
> >> > don't have much free RAM. How much free memory should I have to
> >> > keep performance high?
> >>
> >> I don't have a good answer for this; when I was working as a CSE
> >> we generally urged people to start adding nodes when their most
> >> limited resource (memory, disk, cpu, etc.) was 70-80% utilized (as
> >> a grossly oversimplified rule of thumb).
> >>
> >> > And the last question is about the memory_total metric.
> >> > riak-admin status returns a value which is less than the actual
> >> > memory consumption seen in top. According to the memory_total
> >> > description
> >> > (http://docs.basho.com/riak/1.3.1/references/appendices/Inspecting-a-Node/)
> >> > they should be equal. Why are they not?
> >>
> >> Top's figure factors in OS/libc overheads that memory_total cannot
> >> see. I'll check out the docs and get them amended if they're
> >> wrong.
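(That last point, overheads that top can see but memory_total can't, is easy to demonstrate on glibc, for what it's worth. A minimal sketch; note that malloc_usable_size() is a glibc extension, not portable C:)

    #include <stdio.h>
    #include <stdlib.h>
    #include <malloc.h> /* malloc_usable_size() is glibc-specific */

    /* Shows allocator slack that VM-level counters such as
       memory_total never see: the allocator may hand back more than
       requested, and its headers/alignment are invisible to the caller. */
    int main(void) {
        for (size_t req = 1; req <= 64; req *= 2) {
            void *p = malloc(req);
            if (p == NULL) return 1;  /* out of memory; bail */
            printf("requested %2zu B, usable %2zu B\n",
                   req, malloc_usable_size(p));
            free(p);
        }
        return 0;
    }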
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com