Hi Michael,

Greg's advice is probably the best if you really always want to read back or update predefined groups of 1000 keys at once. It will increase the rate at which you can write and read by a factor of 1000 ;-).

But if that's not what you want to do (and we really don't know what your design goals are), I honestly think you are trying to drive in a screw with a hammer here. Maybe you should look for alternatives to Riak, since you are hitting all of its weaknesses and not using most of its strengths.

Namely, storing very small values is a weak spot, as Greg mentioned. At the moment there is an overhead of at least around 400 bytes per entry. Even if there are plans to reduce this overhead, I would estimate it will never drop below around 100 bytes if the result is still Riak. This overhead also exists for all storage backends, so with the ets backend you will only be able to store about 2-3 million entries per GB of RAM right now.
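
To put that figure in perspective, here is a quick back-of-the-envelope calculation (the 400-byte overhead is my rough estimate from above, not an exact number):

    # Rough estimate of how many 12-byte entries fit per GB of RAM,
    # assuming ~400 bytes of per-object overhead as estimated above.
    GIB = 1024 ** 3
    value_size = 12                  # bytes of actual payload per key
    overhead = 400                   # assumed per-entry overhead in Riak
    entries_per_gib = GIB // (value_size + overhead)
    print(f"~{entries_per_gib:,} entries per GiB")   # roughly 2.6 million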

Which brings me to the part where you don't care about, or use, most of Riak's strengths. You don't seem to care about persistence of data, otherwise you wouldn't use a memory-only backend. (By the way, as Mike pointed out, with enough RAM Bitcask is essentially a memory store, especially where write performance is concerned.) You also don't seem to care about eventual consistency, evidenced by the fact that you do bulk inserts (only?) and that 12 bytes wouldn't leave room for enough information to resolve conflicts. So you probably want last-write-wins behaviour (which can be set as a bucket property in Riak, but kind of defeats the purpose in my opinion).
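
If you do go down that road, a rough sketch of setting the property through the HTTP interface follows; the host, port and bucket name "stats" are placeholders, and the exact property handling may differ between Riak versions:

    # Hedged sketch: enable last_write_wins on a bucket via Riak's HTTP API.
    # Host, port and the bucket name "stats" are placeholders.
    import json
    import urllib.request

    props = json.dumps({"props": {"last_write_wins": True}}).encode()
    req = urllib.request.Request(
        "http://localhost:8098/riak/stats",
        data=props,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)      # expect 204 No Content on success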

But let's assume for a moment that Riak is the right tool for your job.
The limiting factor for writing your data is almost certainly not the disk. Writing 100,000 keys of 12 bytes each per second requires only about 1.2 MB/s, so even the crappiest disk should have no problem with that. But as I said, there is quite a large overhead for storing values in Riak, so in reality the required rate will be on the order of 50 MB/s per node (3 nodes, n=3 presumably). Still not a big deal, and this only becomes a limiting factor once the filesystem cache has used up all available RAM.
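
The same estimate as a small sketch (the ~400-byte overhead is my assumption from above, so treat the result as an order-of-magnitude figure):

    # Disk bandwidth needed for 100,000 writes/s of 12-byte values,
    # assuming ~400 bytes of per-entry overhead, n_val = 3 and 3 nodes.
    ops_per_sec = 100_000
    payload, overhead = 12, 400
    n_val, nodes = 3, 3

    raw_mb = ops_per_sec * payload / 1e6                                  # ~1.2 MB/s of payload
    per_node_mb = ops_per_sec * (payload + overhead) * n_val / nodes / 1e6
    print(f"{raw_mb:.1f} MB/s raw, ~{per_node_mb:.0f} MB/s per node")     # ~40-50 MB/s per node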

On the other hand, network latency is a problem at such high rates, even on a LAN. As far as my experience and a short Google search tell me, the lowest round-trip time you can expect on standard Gigabit Ethernet is on the order of 0.1 ms, or 1/10,000 of a second. Each operation needs at least one round trip (one request packet, one response packet), so with one connection you can never go beyond 10,000 writes per second. That assumes no processing time whatsoever, so a more realistic number is 2000-5000 ops/s per connection. Therefore you need at least 20-50 parallel connections or clients to achieve your target write rate. If you use the REST API, these numbers roughly double, since an additional round trip is already needed to set up the TCP connection. In general, without a lot of tuning and maybe specialized hardware (multiple NICs or special low-latency NICs), any server will have a hard time handling 100,000 ops/s, regardless of the software used.
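
As a quick sanity check with the rough figures above:

    # How many parallel connections you need for 100,000 ops/s when every
    # synchronous request costs one network round trip.  All figures are
    # the rough assumptions from the text, not measurements.
    rtt = 0.0001                     # ~0.1 ms round trip on Gigabit Ethernet
    hard_ceiling = 1 / rtt           # 10,000 ops/s per connection with zero processing time
    realistic = 3_000                # assume 2000-5000 ops/s per connection in practice
    target = 100_000

    print(f"theoretical max per connection: {hard_ceiling:.0f} ops/s")
    print(f"connections needed: ~{target / realistic:.0f}")              # roughly 20-50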


Cheers,
Nico


On 28.05.2011 20:36, Greg Nelson wrote:
Depending on the n_val you have set for that bucket, Riak will store the
objects n times on n different nodes. There are two other parameters you
should know about, r and w. When writing, Riak will wait for w of the n
nodes to finish the write before returning. When reading, Riak will wait
for r of the n nodes to respond before returning. This is the basis of
how Riak handles fault and partition tolerance, i.e. if one node is down
your cluster still functions, and the r and w values define a sort of
"majority vote" threshold to handle split-brain problems.

Anyway, for your purposes you could set w=1 and r=3 for faster writes at
the expense of potentially slower reads. I've never tried this (or any
of the backends besides bitcask) so I don't know what you should expect.
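
Per-request values can also be passed to the HTTP API directly; here is a rough sketch (bucket, key and host are placeholders, and query-parameter support may vary by Riak version):

    # Hedged sketch: per-request w and r values via Riak's HTTP API.
    # Bucket, key and host are placeholders.
    import urllib.request

    url = "http://localhost:8098/riak/stats/sensor-42"

    # Write, waiting for only one replica to acknowledge (w=1).
    put = urllib.request.Request(
        url + "?w=1",
        data=b"twelve bytes",                     # a 12-byte value
        headers={"Content-Type": "application/octet-stream"},
        method="PUT",
    )
    urllib.request.urlopen(put)

    # Read, waiting for all three replicas to respond (r=3).
    value = urllib.request.urlopen(url + "?r=3").read()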

As for bulk insert and preserving locality, I don't know of a way to do
that with Riak except to batch your 1000 keys into a single object,
identified by one key. As far as Riak is concerned, it's just a 12KB
opaque object, which your application would need to always write and
read all at once.
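
A minimal sketch of what that client-side batching could look like (the fixed 12-byte records and the packing format are purely illustrative):

    # Pack 1000 12-byte records into one ~12 KB opaque value stored under
    # a single key, and slice it back apart after reading it.
    records = [i.to_bytes(4, "big") + b"\x00" * 8 for i in range(1000)]   # 1000 x 12 bytes
    blob = b"".join(records)          # ~12 KB, PUT under one key like any other object

    def unpack(blob, record_size=12):
        return [blob[i:i + record_size] for i in range(0, len(blob), record_size)]

    assert unpack(blob) == records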

If you don't batch like that, you should look for a discussion on this
mailing list from last week regarding capacity planning and very small
objects. There's a bit of overhead associated with each object that will
be significant for objects as small as 12 bytes. You could skip over the
parts about Bitcask overhead...

On Saturday, May 28, 2011 at 9:59 AM, Michael McClain wrote:

Thank you, Mike and Greg, for the response.
I've just replied to the list.
In my use case, I need to be able to write 100,000 keys per second, where each key is very small (12 bytes), and I always insert 1000 keys at once, in a bulk insert. I would also like to preserve the locality of the keys inserted together (so that they always stay on the same node). Do you know if that is possible?

Thank you

2011/5/28 Mike Oxford <moxf...@gmail.com>
With enough RAM you could just have it keep the whole thing in
disk-cache...

-mox


On Fri, May 27, 2011 at 11:11 PM, Greg Nelson <gro...@dropcam.com> wrote:
Michael,

You might want to check out riak_kv_ets_backend,
riak_kv_gb_trees_backend, and riak_kv_cache_backend.

http://wiki.basho.com/Configuration-Files.html

-Greg

On Friday, May 27, 2011 at 10:35 PM, Michael McClain wrote:

Hi,

Is it possible to store the whole database in memory, in a similar way as Redis does?

I'm really interested in the distributed map/reduce done by Riak ("bring processing to the data, instead of data to processors"), but I need the faster writes/reads that a memory-only database could provide.
In case you don't support memory-only storage (no disk touched, all keys and data fitting in memory across all nodes) yet, do you plan on implementing it?

Thank you,
Michael