Ok, thank you! Looking forward to the next release.

On 6 August 2013 17:32, Evan Vigil-McClanahan <emcclana...@basho.com> wrote:

> 11 + 4 + 16, so 31.
>
> 18 bytes there are the actual data, so that can't go away.  Since the
> allocation sizes are going to be word aligned, the least overhead
> comes from word aligning the entire structure, i.e. where
> (key_len + bucket_len + 2 + 18) % 8 == 0, but that sort of
> optimization only works with fixed-length keys.
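>
> To make that concrete, here is a rough sketch of the arithmetic (purely
> illustrative; the 18 fixed bytes and the 2-byte length prefix are the
> figures from this thread, not pulled from the keydir source):
>
>     #include <stdio.h>
>     #include <stddef.h>
>
>     /* total allocation for one entry, rounded up to the next 8-byte word */
>     static size_t entry_alloc_size(size_t key_len, size_t bucket_len)
>     {
>         size_t raw = key_len + bucket_len + 2 + 18;
>         return (raw + 7) & ~(size_t)7;
>     }
>
>     int main(void)
>     {
>         /* padding is zero only when raw is already a multiple of 8 */
>         printf("%zu\n", entry_alloc_size(10, 2)); /* raw 32 -> 32, no padding  */
>         printf("%zu\n", entry_alloc_size(11, 2)); /* raw 33 -> 40, 7 bytes pad */
>         return 0;
>     }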
>
> Khash overheads are expected to be small-ish, but are
> under-researched, at least by me.  I suspect most of the overhead is
> coming from the allocator.  So moving to tcmalloc is a possible win
> there, because it does a better job than libc's malloc of keeping
> amortized per-allocation overheads low for small allocations, but of
> course with the caveats mentioned in my last email (tl;dr test
> *exhaustively*, because we don't and likely won't).
>
> Another possible improvement would be to move to a fixed-length
> structure (one that points to a separately allocated location for
> oversized key-bucket binaries), but that has a very bad pathological
> case: if someone's keys are all larger than the fixed size, the
> fixed length minus 8 bytes becomes additional overhead on every entry.
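>
> Sketched out (hypothetical field names, with an arbitrary 32-byte
> inline buffer), such an entry might look like:
>
>     #include <stdint.h>
>
>     #define INLINE_KEY_BYTES 32
>
>     struct fixed_entry {
>         uint32_t file_id;      /* illustrative fixed fields */
>         uint64_t offset;
>         uint32_t tstamp;
>         uint16_t key_len;
>         union {
>             unsigned char  inline_key[INLINE_KEY_BYTES]; /* small keys live here */
>             unsigned char *oversize_key;  /* big keys spill to a separate
>                                              allocation, wasting the other
>                                              INLINE_KEY_BYTES - 8 inline bytes */
>         } k;
>     };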
>
> On Tue, Aug 6, 2013 at 2:56 AM, Alexander Ilyin <alexan...@rutarget.ru>
> wrote:
> > So if you succeed with all your patches, the memory overhead will
> > decrease by 22 (=16+4+2) bytes, am I right?
> >
> >
> > On 5 August 2013 16:38, Evan Vigil-McClanahan <emcclana...@basho.com> wrote:
> >>
> >> Before I'd done the research, I too thought that the overheads were
> >> much lower, near what the calculator said, or at least not too far off.
> >>
> >> There are a few things that I plan on addressing this release cycle:
> >>   - 16b per-allocation overhead from using enif_alloc.  This gives us
> >> a lot of flexibility about which allocator to use, but I suspect that,
> >> since allocation speed isn't a big bitcask bottleneck, this overhead
> >> simply isn't worth it.
> >>   - 13b per value overhead from naive serialization of the bucket/key
> >> value.  I have a branch that reduced this by 11 bytes.
> >>   - 4b per-value overhead from a single-bit flag that is stored in an
> >> int (a toy sketch of this follows the list).  No patch for this thus far.
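> >>
> >> For that flag item, a toy illustration (hypothetical fields, not the
> >> real keydir entry layout) of the 4 bytes it costs and how a bitfield
> >> could reclaim them:
> >>
> >>     #include <stdint.h>
> >>
> >>     struct entry_v1 {
> >>         uint32_t total_size;
> >>         int      is_pending;          /* one bit of information in 4 bytes */
> >>     };                                /* sizeof == 8 */
> >>
> >>     struct entry_v2 {
> >>         unsigned int total_size : 31;
> >>         unsigned int is_pending : 1;  /* steal a bit instead */
> >>     };                                /* sizeof == 4 on common ABIs */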
> >>
> >> Additionally, I've found that running with tcmalloc via LD_PRELOAD
> >> reduces the cost of bitcask's many allocations, but a) I've never
> >> done so in production and b) it reportedly never releases memory,
> >> which is worrying, although the paging system should theoretically
> >> take care of that fairly easily as long as the page usage isn't
> >> insane.
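> >>
> >> (For what it's worth, the LD_PRELOAD route is just a matter of starting
> >> the node with something like LD_PRELOAD=/path/to/libtcmalloc.so set in
> >> beam.smp's environment; the exact library path depends on your
> >> gperftools install, and the caveat about testing exhaustively applies.)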
> >>
> >> My original notes looked like this:
> >>
> >> 1)   ~32 bytes for the OS/malloc + khash overhead @ 50M keys
> >> (amortized, so bigger for fewer keys, smaller for more keys).
> >> 2) + 16 bytes of erlang allocator overhead
> >> 3) + 22 bytes for the NIF C structure
> >> 4) +  8 bytes for the entry pointer stored in the khash
> >> 5) + 13 bytes of kv overhead
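> >>
> >> (Those sum to 32 + 16 + 22 + 8 + 13 = 91 bytes of per-key overhead,
> >> amortized.)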
> >>
> >> tcmalloc does what it can for line 1.
> >> My patches do what they can for lines 2, 3, and 5.
> >>
> >> Line 4 isn't amenable to anything other than a change in the way the
> >> keydir is stored, which could also potentially help with line 1 (fewer
> >> allocations, etc.).  That, unfortunately, is not very likely to happen
> >> soon.
> >>
> >> So things will get better relatively soon, but there are some
> >> architectural limits that will be harder to address.
> >>
> >> On Mon, Aug 5, 2013 at 1:49 AM, Alexander Ilyin <alexan...@rutarget.ru>
> >> wrote:
> >> > Evan,
> >> >
> >> > The news about a per-key overhead of 91 bytes is quite frustrating.
> >> > When we were choosing a key-value store, per-key metadata size was a
> >> > crucial point for us. We have a simple use case but a lot of data
> >> > (hundreds of millions of items), so we were looking for ways to
> >> > reduce memory consumption.
> >> > Here and here a value of 40 bytes is stated. The 22 bytes in the RAM
> >> > calculator seemed like a mistake because the following example
> >> > obviously uses a value of 40.
> >> >
> >> > Anyway, thanks for your response.
> >> >
> >> >
> >> > On 4 August 2013 04:39, Evan Vigil-McClanahan <emcclana...@basho.com>
> >> > wrote:
> >> >>
> >> >> Some responses inline.
> >> >>
> >> >> On Fri, Aug 2, 2013 at 3:11 AM, Alexander Ilyin <alexan...@rutarget.ru> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > I have a few questions about Riak memory usage.
> >> >> > We're using Riak 1.3.1 on a 3 node cluster. According to the bitcask
> >> >> > capacity calculator
> >> >> > (http://docs.basho.com/riak/1.3.1/references/appendices/Bitcask-Capacity-Planning/)
> >> >> > Riak should use about 30Gb of RAM for our data. Actually, it uses
> >> >> > about 45Gb and I can't figure out why. I'm looking at the %MEM
> >> >> > column in top on each node for the beam.smp process.
> >> >>
> >> >> I've recently done some research on this and have filed bugs against
> >> >> the calculator; it's a bit wrong and has been that way for a while:
> >> >>
> >> >> https://github.com/basho/basho_docs/issues/467
> >> >>
> >> >> The numbers there look a bit closer to what you're seeing.
> >> >>
> >> >> The good news is that I am looking into reducing memory consumption
> >> >> this development cycle and our next release should see some
> >> >> improvements on that front.  The bad news is that it may be a while.
> >> >> If you want to watch the bitcask repo on github to see when these
> >> >> changes go in, it's usually pretty easy to build a new bitcask and
> >> >> replace the one that you're running.
> >> >>
> >> >> > Disk usage is also about 1.5 times more than I expected (270Gb
> >> >> > instead of 180Gb). I rechecked that I have n_val=2 (not 3), and it
> >> >> > seems alright. Why could this happen?
> >> >>
> >> >> There is definitely some overhead on the stored values, especially
> >> >> when you're using bitcask. How big are your values?  Overheads, if I
> >> >> recall correctly, run to a few hundred bytes, but I'll have to ask
> >> >> some people to refresh my memory.
> >> >>
> >> >> > The second question is about performance degradation when Riak uses
> >> >> > almost all available memory on the node. We see that the 95th/99th
> >> >> > put percentiles are twice as large for nodes which don't have much
> >> >> > free RAM. How much free memory should I have to keep performance
> >> >> > high?
> >> >>
> >> >> I don't have a good answer for this; when I was working as a CSE we
> >> >> generally urged people to start adding nodes when their most limited
> >> >> resource (memory, disk, cpu, etc) was 70-80% utilized (as a grossly
> >> >> oversimplified rule of thumb).
> >> >>
> >> >> > And the last question is about the memory_total metric. riak-admin
> >> >> > status returns a value which is less than the actual memory
> >> >> > consumption as seen in top. According to the memory_total description
> >> >> > (http://docs.basho.com/riak/1.3.1/references/appendices/Inspecting-a-Node/)
> >> >> > they should be equal. Why are they not?
> >> >>
> >> >> The top figure includes OS/libc overheads that memory_total cannot
> >> >> see.  I'll check out the docs and get them amended if they're wrong.
> >> >
> >> >
> >
> >
>
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
