thanks!!
-Matthew

On Tue, Aug 21, 2012 at 7:29 PM, David Yu <david.yu....@gmail.com> wrote:
>
> On Wed, Aug 22, 2012 at 5:33 AM, Alexander Sicular <sicul...@gmail.com> wrote:
>
>> I was in the Riak 1.2 webinar earlier today and asked a leveldb question
>> about insertion order and durability vs. bitcask's WOL architecture. Joe
>> was not able to get to my question then but took the time to write me a
>> detailed answer. Great engineers at Basho taking time to answer questions
>> is a great thing. Thanks Joe!
>>
>> -Alexander Sicular
>>
>> @siculars
>>
>> Begin forwarded message:
>>
>> From: Joseph Blomstedt <j...@basho.com>
>> Subject: LevelDB
>> Date: August 21, 2012 3:45:45 PM EDT
>> To: sicul...@gmail.com
>>
>> Alexander,
>>
>> I noticed your LevelDB question in the webinar as Reem was closing
>> things out, so I figured I'd follow up via email.
>>
>> As you know, Bitcask maintains a strict set of write-logs and an
>> in-memory hash table that maps keys to (file, offset). Pretty
>> straightforward. Compaction is a separate thing that happens based on
>> independent triggers.
>>
>> LevelDB is rather different. LevelDB does maintain a WAL, but it's
>> short-lived and only for crash recovery. LevelDB writes to the WAL,
>> but also keeps the object in an in-memory write buffer (configurable
>> size, increased in Riak 1.2 by 10x from Riak 1.1). After the buffer
>> becomes full, LevelDB writes the data to disk as a Level-0 SST (data
>> in sorted order + sorted index at the end of the file).
>>
>> There can be multiple Level-0 SSTs. To read a key, LevelDB looks at
>> the index in each SST starting from newest file to oldest. For
>> performance, there's an LRU cache of indexes so you're not always
>> hitting disk. LevelDB now also includes bloom filters (used in Riak
>> 1.2) to make it easier to skip non-interesting SSTs.
>>
>> To make things more efficient, LevelDB does compaction/merging in a
>> background thread. A set of Level-0 files will be selected and merged
>> together into a larger Level-1 file. The format is the same, but the
>> file is now larger and includes the data from multiple Level-0 files.
>> The original Level-0 files are then removed. Likewise, Level-1 files
>> are merged into Level-2 files, and Level-2 into Level-3, etc. Each
>> Level having larger files with a greater chunk of adjacent, sorted
>> data.
>>
>> To read, you check newest to oldest on Level 0, then Level 1, then Level
>> 2, etc.
>>
>> While compaction is a background thing, LevelDB limits the number of
>> Level-0 files you can have. If you hit the limit, LevelDB will block
>> writes until files have been merged into Level-1. With a single
>> compaction thread, it was easy to max out LevelDB in Riak 1.1, and
>> these stalls were fairly frequent and hurt 95% and up latencies, as
>> well as greatly hurt throughput. Our change to use multiple compaction
>> threads has greatly improved how quickly compaction occurs, and
>> writes rarely (if ever) end up stalling. To further improve things,
>> there's the adaptive write throttling that I mentioned that will slow
>> down writes (increased latency) in order to ensure compaction isn't
>> heavily affected and remains ahead of write traffic -- thus, further
>> preventing stalls. Net effect is somewhat higher latency and lower
>> throughput that is more consistent (i.e. 95%+ are tighter around
>> average latency).
>>
>> I hope this answers your question.
>>
>> -Joe
>
> Thanks for sharing!
>
>> _______________________________________________
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
> --
> When the cat is away, the mouse is alone.
> - David Yu
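
For anyone reading this in the archive, the write and read paths Joe describes can be sketched roughly as follows. This is a toy in-memory model in Python, not LevelDB's or Riak's actual code: the class name `ToyLSM`, the parameters `buffer_limit` and `l0_limit`, and the use of plain dicts as stand-ins for SST files are all assumptions made for illustration. Compaction runs inline here, so a write that hits the Level-0 limit simply waits for the merge, which is a crude analogue of the write stall described above.

```python
class ToyLSM:
    """Toy model of the write/read path described above (not LevelDB's real code)."""

    def __init__(self, buffer_limit=2, l0_limit=2):
        self.buffer_limit = buffer_limit  # hypothetical: keys held before a flush
        self.l0_limit = l0_limit          # hypothetical: Level-0 tables allowed
        self.wal = []                     # stand-in for the short-lived WAL
        self.buffer = {}                  # in-memory write buffer
        self.level0 = []                  # Level-0 tables, oldest first
        self.level1 = {}                  # one merged Level-1 table

    def put(self, key, value):
        # Log first (for crash recovery), then keep the object in the buffer.
        self.wal.append((key, value))
        self.buffer[key] = value
        if len(self.buffer) >= self.buffer_limit:
            self._flush()

    def _flush(self):
        # A full buffer becomes a new Level-0 table with keys in sorted order.
        self.level0.append(dict(sorted(self.buffer.items())))
        self.buffer = {}
        self.wal = []  # the log is only needed until the data is flushed
        if len(self.level0) >= self.l0_limit:
            # Real LevelDB compacts in the background and blocks writers
            # only at the limit; this toy just merges synchronously.
            self._compact()

    def _compact(self):
        # Merge oldest to newest so that newer values overwrite older ones.
        merged = dict(self.level1)
        for table in self.level0:
            merged.update(table)
        self.level1 = dict(sorted(merged.items()))
        self.level0 = []

    def get(self, key):
        # Check the buffer, then Level 0 newest to oldest, then Level 1.
        if key in self.buffer:
            return self.buffer[key]
        for table in reversed(self.level0):
            if key in table:
                return table[key]
        return self.level1.get(key)
```

A short usage example: with `db = ToyLSM()`, calling `db.put("a", 1)` and `db.put("b", 2)` fills the buffer and flushes one Level-0 table; a later `db.put("a", 3)` shadows the older value on reads because the buffer and newer tables are checked first. The sketch leaves out the index cache, bloom filters, multiple compaction threads, and write throttling.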