Date: Tue, 21 May 2013 07:28:46 -0400
From: Greg Burd <g...@basho.com>

Matthew's right, and I should have been more clear on a few points:

* My "tests" were smoke-tests at best.  Far too short and far too basic to
really determine anything.  That's clear from the results and Matthew's
points are all valid.  When bitcask and LevelDB look similar in a test, the
test is wrong.

* I should have made it clear that in my weekend of toying with this code
from OpenLDAP I only did enough work to a) make it async, b) build a simple
basho_bench driver, and c) clean up the code a bit.  What I didn't do was
build a full Riak integration layer.  This was just me testing something
out over the weekend because I was curious, nothing more.

* I think there are a lot of interesting features in LMDB, but it would take
some time before it could even be considered as a viable backend for
Riak/KV storage, and then someone would need to write a backend layer for
it.  Our experiences with LevelDB and Matthew's hard work (as he
brilliantly outlined in his RICON/East talk last week) show just how far
it is from the *experimental* testing stage to a production quality stage.

* I'm not working on building a Riak/KV integration or working on LMDB for
use in Riak, and in no way do I (or Basho) endorse it as a viable option for
use in our product.  This is just too much work for uncertain payoff at
this point; the experiments I ran were just for fun.

The engineers on OpenLDAP write good, solid code for their very specific
use case, which has a few fundamental differences from Riak/KV.  One is
that LDAP server access patterns are generally 90/10 read/write, whereas
Riak/KV is at least 50/50 but more commonly heavily weighted toward writes.

So, apologies if I've confused anyone.  Brian, I think you should push on
LMDB a bit more and see what you think.  Maybe it could be awesome at some
point in the future, but there is a lot of hard work between here and there!

best,

-greg

Greg, thanks for taking the time to test the code and explain your work. I should note that OpenLDAP's initial interest in Riak is because we're looking for a clustered data store to replace MySQL NDBCluster which we currently support. But stores like Bitcask and LevelDB are not suitable for an LDAP workload, which is why we were interested in plugging our backing store into Riak.


@gregburd | Basho Technologies | Riak | http://basho.com | @basho


On Tue, May 21, 2013 at 4:54 AM, Matthew Von-Maszewski
<matth...@basho.com> wrote:

Greg,

Wait.  Couple serious environmental issues here:

- "results are close enough to LevelDB, Bitcask, ...":  Bitcask is always
1.5x to 2x the performance of LevelDB.  Bitcask has a constant throughput
until its first merge.  Your comment states all the databases are close.
Bitcask and LevelDB are never close in real life.  Sounds like your
testing was either CPU bound or hit some other environmental bottleneck.
This suggests to me that you might have achieved a similar throughput with
3x5 notecards too.

- "levels out at 2500 ops ..." within 10 minutes per your attached graphs.
Stock LevelDB from Google does not start to seriously degrade until 90
minutes of constant load.  Yet the performance graphs offered have tanked
within 10 minutes.

Bad data does not motivate me to spend time with yet another offering in
the key/value storage space.  Comparison of 20 hour runs, maybe.  But those
benchmarks need to validate known performance differences of Bitcask and
LevelDB to establish credibility.  10-minute runs have barely flushed
memory caches.

Matthew

On May 19, 2013, at 12:38 AM, Greg Burd <g...@basho.com> wrote:

Hello Brian,

I want to echo all the things Jon, Kresten and others have said, but I
know Howard well (from my time back at Sleepycat) and I know he doesn't
mess around when it comes to storage code.  So I took the day today to make
a more up-to-date NIF (https://github.com/basho-labs/lmdb) and run some
tests.  These are on my laptop (SSD, Linux, btrfs, 2-core/4-thread Core i7
with 8GB RAM -- aka, not very server-like).  The one that levels out at 2500 ops
is a put only workload.  The other is a mix of get/put/del operations.  The
results are close enough to LevelDB, Bitcask, and WiredTiger on my laptop
to warrant more work at some point.

This doesn't suggest that this will become supported, just that someone
cared enough to take the code for a spin.

best,

@gregburd | Basho Technologies | Riak | http://basho.com | @basho


On Fri, May 3, 2013 at 2:05 PM, Jon Meredith <jmered...@basho.com> wrote:

Hi Brian,

Experimental backends for Riak are always exciting.  I haven't played
with it personally, and Basho has no current plans to support it as a
leveldb alternative.

It's worth adding two notes of caution.  First, stores that use mmap for
persistence can suffer from problems around dirty pages.  If you have a
very low update volume and a nice hot set whose pages the operating system
can keep in memory, they work nicely.

On some operating systems (specifically Linux), if you have a high update
load, and consequently a large volume of dirty pages (more than
dirty_ratio), I believe all OS level threads for the process are suspended
until the condition is resolved by writing out the pages when the process
is scheduled.

This is bad for latencies in endurance tests.  For something like LDAP
this tradeoff is probably a good one; for Riak it concerns me.  Other
platforms may be better behaved, which would make the option interesting,
and the Linux kernel may have begun

Second, I worry about crash resilience: can the internal memory
structures tolerate a kernel panic where the dirty pages are not written
or, potentially worse, are torn by a partial write?

Good luck with your experiments,

Jon



On Wed, May 1, 2013 at 11:05 AM, Brian Hong <seri...@serialx.net> wrote:

OpenLDAP Lightning Memory-Mapped Database seems to be getting traction
for its high performance and for query (iteration) functionality similar
to leveldb's:
http://symas.com/mdb/

There seems to be an experimental backend for Riak:

https://github.com/alepharchives/emdb

Does anybody know of its usefulness? Are there any benchmarks on it?

I've heard that leveldb suffers write performance issues:
https://news.ycombinator.com/item?id=5621884

Any chance of the Basho guys supporting lmdb as a leveldb alternative in
Riak? It would be awesome! :D

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
