Hi,

I'm about to dramatically increase our Riak investment by putting a lot more data into it, so I figured I would run through the capacity planning on the wiki.
Since my current setup is fairly small and manageable, I decided to see how accurately the capacity planning matches what I see. So first, reality:

  A: Number of Machines    : 8
  B: Memory per Machine    : 24 GB
  C: Length of Bucket Name : 10 bytes
  D: Length of Keys        : 36 bytes
  E: Length of Values      : 36 bytes
  F: Replication Factor    : 3
  G: Number of Keys        : 183915891
  H: Disk Space used       : 341898018816 bytes (341 GB)
  I: RAM                   : 70536691712 bytes (70 GB)

G was calculated using riak_kv_bitcask_backend:key_counts/0 for each bitcask on a node, summing, then dividing by 3.
H was calculated with 'du -sk /var/lib/riak/bitcask/ | cut -f1', summing and multiplying by 1024.
I was calculated with 'ps -U riak -o vsz h', summing and multiplying by 1024.

Now, from entering A-G on the Bitcask-Capacity-Planning page, I get

  Total Key Space : 34.9 GB
  Node Count      : 3 (7 GB Storage per Node)

in the first section, and

  Key Overhead    : 73 Bytes (22 Byte Overhead)
  Total Documents : 1,010,580,541
  Total Disk Used : 102 GB of Disk Space

Also, when using the Cluster Capacity Planning page, I get

  (static bitcask per key overhead + estimated average bucket+key length in bytes) * estimated total number of keys * n_val = Approximate RAM Needed for Bitcask

So plugging in values:

  ( 22 + 10 + 36 ) * 183915891 * 3 = 37518841764 = 34.9 GB

and

  Disk = Estimated Total Objects * Average Object Size * n_val
  Disk = 183915891 * 36 * 3 = 19862916228 = 18.49 GB

So either the equations are drastically wrong or my calculations are.
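For anyone who wants to check the arithmetic, the two Cluster Capacity Planning formulas can be reproduced with a short Python sketch. The constant and function names are made up for illustration and are not part of Riak or the wiki. The inputs are the measured values A-I above plus the 22-byte per-key overhead quoted on the wiki.

```python
# Sketch of the two wiki formulas, using the measured values above.
# All names here are illustrative assumptions, not Riak APIs.

BUCKET_LEN = 10                 # C: bytes
KEY_LEN = 36                    # D: bytes
VALUE_LEN = 36                  # E: bytes
N_VAL = 3                       # F: replication factor
NUM_KEYS = 183_915_891          # G
DISK_USED = 341_898_018_816     # H: bytes, from du
RAM_USED = 70_536_691_712       # I: bytes, summed vsz from ps

KEY_OVERHEAD = 22               # static bitcask per-key overhead from the wiki
GIB = 1024 ** 3                 # divisor that yields the 34.9 "GB" figure


def estimated_ram_bytes(overhead, bucket_len, key_len, num_keys, n_val):
    """(per-key overhead + bucket + key length) * keys * n_val."""
    return (overhead + bucket_len + key_len) * num_keys * n_val


def estimated_disk_bytes(num_keys, value_len, n_val):
    """objects * average object size * n_val (no per-entry overhead)."""
    return num_keys * value_len * n_val


ram_est = estimated_ram_bytes(KEY_OVERHEAD, BUCKET_LEN, KEY_LEN, NUM_KEYS, N_VAL)
disk_est = estimated_disk_bytes(NUM_KEYS, VALUE_LEN, N_VAL)

# Compare on raw bytes so the unit convention does not matter.
ram_ratio = RAM_USED / ram_est
disk_ratio = DISK_USED / disk_est

print(f"RAM estimate : {ram_est} bytes ({ram_est / GIB:.2f} GiB)")
print(f"Disk estimate: {disk_est} bytes ({disk_est / GIB:.2f} GiB)")
print(f"Measured RAM  / estimate: {ram_ratio:.2f}x")
print(f"Measured disk / estimate: {disk_ratio:.2f}x")
```

One thing the sketch makes visible: the 34.9 GB and 18.49 GB estimates come from dividing by 1024^3, while the measured 341 GB and 70 GB figures are decimal (bytes / 10^9). Comparing raw byte counts, as the ratios do, avoids mixing the two conventions.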
I find it very suspect that the equation for the amount of disk includes zero overhead, when from reading the bitcask paper it seems like each entry consists of

  CRC, timestamp, keysz, valsz, key, value

Well anyway, there's obviously something off, as I end up with the following:

        Bitcask-Capacity-Planning   Cluster-Capacity-Planning   Reality
  RAM   34.9 GB                     34.9 GB                     70 GB
  Disk  102 GB                      18.49 GB                    341 GB

So it looks to me like the numbers for RAM are about 1/2 of actual, and the numbers for disk are completely off: they differ depending on which wiki page you look at, and both vastly underestimate reality.

I'm hoping someone from Basho can clarify so I can really determine capacity.

Thanks,

-Anthony

--
------------------------------------------------------------------------
Anthony Molinaro                   <antho...@alumni.caltech.edu>