Just curious whether anyone has any ideas. For the moment, I'm taking
the RAM calculation and multiplying it by 2, and the disk calculation and
multiplying it by 8, based on what I see on my current cluster.  But
I would like to know why my actual values are so much higher than the
ones the formulas say I should be getting.
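
For reference, here is roughly where the factor of 8 for disk comes from.
This is only a sketch in Python. It assumes the 341 GB from my earlier
mail (quoted below) is the actual on-disk usage, that GB means 2^30
bytes, and it uses the 14-bytes-per-entry formula from that mail. The
function name is made up for illustration.

```python
GIB = 2**30  # assumption: "GB" in this thread means 2^30 bytes


def fudge_factor(observed_bytes, estimated_bytes):
    """Ratio of what the cluster actually uses to what the formula predicts."""
    return observed_bytes / estimated_bytes


# (header + key + value) * entries * n_val, numbers from the quoted mail
estimated = (14 + 36 + 36) * 183915891 * 3
# assumption: 341 GB is the observed disk usage across the cluster
observed = 341 * GIB

print(round(fudge_factor(observed, estimated), 1))
```

That ratio rounds up to the multiplier of 8 I'm using, so the multiplier
is purely empirical and says nothing about where the extra space goes.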

Also, I'd still like to know how the forms calculate things, as the disk
calculation there matches neither reality nor the formula.

Also, I'm still waiting to hear whether there is any way to force a merge
to run, so I can more accurately gauge whether multiple copies are
affecting disk usage.

Thanks,

-Anthony

On Mon, May 23, 2011 at 11:06:31PM -0700, Anthony Molinaro wrote:
> 
> On Mon, May 23, 2011 at 10:53:29PM -0700, Anthony Molinaro wrote:
> > 
> > On Mon, May 23, 2011 at 09:57:25PM -0600, David Smith wrote:
> > > On Mon, May 23, 2011 at 9:39 PM, Anthony Molinaro
> > > Thus, depending on
> > > your merge triggers, more space can be used than is strictly necessary
> > > to store the data.
> > 
> > So the lack of any overhead in the calculation is expected?  I mean
> > according to http://wiki.basho.com/Cluster-Capacity-Planning.html
> > 
> > Disk = Estimated Total Objects * Average Object Size * n_val
> > 
> > Which just seems wrong, doesn't it?  I don't quite understand the
> > bitcask code well enough yet to see what the actual data it stores is,
> > but the whitepaper suggested several things were involved in the on
> > disk representation.
> 
> Okay, finally found the code for this part.  I kept looking in the NIF,
> but that's only the keydir, not the data files.  It looks like
> 
>    %% Setup io_list for writing -- avoid merging binaries if we can help it
>    Bytes0 = [<<Tstamp:?TSTAMPFIELD>>, <<KeySz:?KEYSIZEFIELD>>,
>              <<ValueSz:?VALSIZEFIELD>>, Key, Value],
>    Bytes  = [<<(erlang:crc32(Bytes0)):?CRCSIZEFIELD>> | Bytes0],
> 
> And looking at the header, it seems that there are 14 bytes of overhead
> per entry (4 for CRC, 4 for timestamp, 2 for keysize, 4 for valsize).
> 
> So disk calculation should be
> 
> ( 14 + Key + Value ) * Num Entries * N_Val
> 
> So using my numbers from before that gives
> 
> ( 14 + 36 + 36 ) * 183915891 * 3 = 47450299878 bytes = 44.2 GB
> 
> which actually isn't much closer to 341 GB than the previous calculation :(
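
The same arithmetic as a small Python sketch, in case anyone wants to
plug in their own numbers. The function and constant names are mine, not
anything from bitcask, and the estimate deliberately ignores hint files
and dead entries that have not been merged away yet, so it is probably
best read as a lower bound rather than a prediction.

```python
# Per-entry header overhead read off the bitcask snippet above:
# 4 (CRC) + 4 (timestamp) + 2 (key size) + 4 (value size) bytes.
HEADER_BYTES = 4 + 4 + 2 + 4


def bitcask_disk_bytes(key_size, value_size, num_entries, n_val):
    """Estimated bytes on disk for live entries only, across all replicas."""
    return (HEADER_BYTES + key_size + value_size) * num_entries * n_val


def to_gib(num_bytes):
    """Convert bytes to units of 2^30 bytes (assumed meaning of "GB" here)."""
    return num_bytes / 2**30


estimate = bitcask_disk_bytes(36, 36, 183915891, 3)
print(estimate, round(to_gib(estimate), 1))
```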
> 
> So all my questions from the previous email still apply.
> 
> -Anthony
> 
> -- 
> ------------------------------------------------------------------------
> Anthony Molinaro                           <antho...@alumni.caltech.edu>

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <antho...@alumni.caltech.edu>

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
