Istvan,

"block_size" is not a "size", it is a threshold.  Data is never split across
blocks.  A single block contains one or more key/value pairs.  leveldb starts a
new block only when the total size of all key/values in the current block
exceeds the threshold.

You must set block_size to a multiple of your typical key/value size if you
want multiple key/value pairs per block.
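
A rough sketch of the threshold rule in Python may help here.  It is a
simplified model, not leveldb's actual code: the function name
pack_into_blocks is made up for illustration, and it ignores per-entry
overhead, key prefix compression and restart points.

```python
def pack_into_blocks(item_sizes, block_size):
    """Group key/value sizes into blocks using a threshold rule.

    Simplified model (assumption): only raw key/value bytes are counted.
    """
    blocks, current, total = [], [], 0
    for size in item_sizes:
        current.append(size)      # an item is never split across blocks
        total += size
        if total >= block_size:   # threshold reached: close this block
            blocks.append(current)
            current, total = [], 0
    if current:                   # keep the last, partly filled block
        blocks.append(current)
    return blocks


# 10 KB values, 4 KB threshold: every value ends up in its own block
small = pack_into_blocks([10_240] * 4, 4_096)

# 10 KB values, 64 KB threshold: several values share one block
large = pack_into_blocks([10_240] * 8, 65_536)

print([len(b) for b in small])
print([len(b) for b in large])
```

Note that in this model a block can end up larger than block_size, because
the item that crosses the threshold still stays in the block.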

Plus side:  block_size is computed before compression.  So, you might get a
nice reduction in total disk size by having multiple, mutually compressible
items in a block.  leveldb iterators / Riak 2i might give you slightly better
performance with bigger blocks because there are fewer reads if the keys needed
are in the same block (or in fewer blocks).

Negative side:  the entire block, not a single key/value pair, goes into the
block cache uncompressed (cache_size).  You can quickly overwhelm the block
cache with lots of large blocks.  Also, random reads / Gets have to read,
decompress, and CRC check the entire block.  Therefore it costs you more disk
transfer and decompression/CRC CPU time to read random values from bigger
blocks.
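
To put rough numbers on that trade-off, here is a back-of-the-envelope
calculation in Python.  Everything in it is an assumption for illustration:
the helper name random_read_cost, the 10 KB value size and the 8 MB cache
are hypothetical, and the model counts only raw value bytes.

```python
def random_read_cost(value_size, block_size, cache_size):
    """Rough per-Get cost under a simplified model (no per-entry overhead)."""
    # A block closes once its contents reach the threshold, so it holds
    # at least one value and roughly ceil(block_size / value_size) values.
    values_per_block = max(1, -(-block_size // value_size))
    uncompressed_block = values_per_block * value_size
    return {
        "values_per_block": values_per_block,
        # a random Get decompresses and CRC checks the whole block
        "bytes_decompressed_per_get": uncompressed_block,
        # the cache holds whole, uncompressed blocks
        "blocks_fitting_in_cache": cache_size // uncompressed_block,
    }


CACHE = 8 * 1024 * 1024  # hypothetical cache_size of 8 MB

at_4k = random_read_cost(10_240, 4_096, CACHE)
at_64k = random_read_cost(10_240, 65_536, CACHE)

print(at_4k)
print(at_64k)
```

In this model the 64K setting makes each random Get handle about seven times
as many bytes, and the cache holds about one seventh as many blocks.  Whether
that matters depends on how often neighbouring keys are read together.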


I suggest you experiment with your dataset and usage patterns.  Be sure to 
build big sample datasets before starting to measure and/or restart Riak 
between building and measuring.  These are ways to make sure you see the impact 
of random reads.

Matthew


On Aug 13, 2013, at 2:51 PM, István <lecc...@gmail.com> wrote:

> Hi guys,
> 
> I am setting up a new Riak cluster and I was wondering if there is any 
> drawback of increasing the LevelDB blocksize from 4K to 64K. The reason is 
> that we have all of the values way bigger than 4K and I guess from the 
> performance point of view it would make sense to increase the block size. The 
> tests are still running to confirm this theory but I wanted to clarify that 
> there is no big red flag of doing that from the Riak side. I found the 
> following discussion about changing block size:
> 
> https://groups.google.com/forum/#!msg/leveldb/2JJ4smpSC6Q/1Z7aDSeHiRkJ
> 
> Is that a good idea to experiment with this in Riak to achieve better 
> performance?
> 
> Thank you in advance,
> Istvan
> 
> 
> -- 
> the sun shines for all
> 
> 
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

