On 2/23/2011 10:39 AM, Ryan Zezeski wrote:
Thanks! A couple more somewhat related questions: is that atomic update nature hard to duplicate outside of Luwak (say, by a client that needs to keep several items in sync), and if the Luwak blocks are immutable, how do you ever clean up the space used by data that has been deleted or modified and is no longer referenced?

For your first question, I believe the onus to make multiple object updates atomic is on you, the application developer. One of the perhaps easier ways to achieve this would be to wrap all the data in one object.
And luwak accomplishes this by putting the list of keys comprising the whole stream in one object that is updated last, so a reader will get one or the other?
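A minimal Python sketch of that "update the key list last" idea, not Luwak's actual code. The in-memory `store` dict, the `block/` and `file/` key prefixes, and the `put_file`/`get_file` helpers are all hypothetical, and a flat key list stands in for whatever structure Luwak really keeps:

```python
import hashlib

# Hypothetical stand-in for a key/value store; not the Riak client API.
store = {}


def put_file(name, data, block_size=4):
    """Write immutable blocks first, then publish the key list in one final put."""
    keys = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        key = "block/" + hashlib.sha1(block).hexdigest()
        store[key] = block        # content-addressed: same key always means same bytes
        keys.append(key)
    store["file/" + name] = keys  # the single write that makes the new version visible


def get_file(name):
    """Read the key list once, then fetch exactly the blocks it names."""
    keys = store["file/" + name]
    return b"".join(store[k] for k in keys)


put_file("report", b"hello world")
old_keys = list(store["file/report"])   # what a reader of the first version holds
put_file("report", b"hello there")
```

A reader that fetched the key list before the second `put_file` still resolves every key in `old_keys`, because no block was modified or removed. That is also why the old blocks linger after the file changes.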
Second, you don't, at least not at this time. Luwak allows you to delete the file reference, but not the data itself. That follows from it being an immutable, persistent data structure: if two files share a block, you can't simply delete the blocks under a file, but must instead perform something more like garbage collection.
If I understand what is going on correctly, you'd have to maintain a reference count atomically with the keys since files with duplicate sections would reuse some data blocks. Hmmm, that makes it sound like a really good place to throw backups for a dedup effect...
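To make the reference-count idea concrete, here is a rough single-process sketch in Python. The `blocks`, `refcount`, and `files` tables and both helpers are hypothetical names, not anything from Luwak. In one process the count and the key list trivially change together; in a distributed store, keeping those two updates atomic is exactly the hard part:

```python
import hashlib
from collections import Counter

blocks = {}            # block key -> bytes
refcount = Counter()   # block key -> number of references from files
files = {}             # file name -> list of block keys


def delete_file(name):
    """Drop a file's references and free any block nobody points at anymore."""
    for key in files.pop(name, []):
        refcount[key] -= 1
        if refcount[key] == 0:
            del refcount[key]
            del blocks[key]


def put_file(name, data, block_size=4):
    """Store data as content-addressed blocks, reusing blocks that already exist."""
    delete_file(name)  # release the references held by any previous version
    keys = []
    for i in range(0, len(data), block_size):
        chunk = data[i:i + block_size]
        key = hashlib.sha1(chunk).hexdigest()
        blocks.setdefault(key, chunk)  # duplicate content is stored only once
        refcount[key] += 1
        keys.append(key)
    files[name] = keys


put_file("a", b"AAAABBBB")
put_file("b", b"AAAACCCC")   # shares the AAAA block with file "a"
delete_file("a")             # frees BBBB; AAAA survives because "b" still uses it
```

The shared `AAAA` block is the dedup effect: two files, three stored blocks instead of four.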
If you're up for it, I have some proof of concept code on my fork of Luwak. I got GC to work, to an extent. IIRC, once I got past 10-15GB things started to degrade quickly. When I get more time I plan to return to it.
What is the trick to knowing if a block is currently referenced or not? Would it be possible to have some sort of bucket versioning and periodically copy currently-referenced blocks forward, flip the bucket reference and drop everything in the old bucket? I guess you'd still have to deal with possible re-use during the copy.
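The copy-forward idea could look roughly like the Python sketch below. The function name and the dict-based "buckets" are hypothetical, and it assumes nothing writes during the copy, which is precisely the re-use problem mentioned above:

```python
def copy_forward(files, old_bucket):
    """Return a new bucket holding only the blocks some file still references."""
    new_bucket = {}
    for keys in files.values():
        for key in keys:
            if key not in new_bucket:
                new_bucket[key] = old_bucket[key]
    return new_bucket


# Two files sharing block k1, plus one block that nothing references.
files = {"a": ["k1", "k2"], "b": ["k1", "k3"]}
bucket = {"k1": b"x", "k2": b"y", "k3": b"z", "k4": b"orphan"}

# "Flip the bucket reference": readers switch to the new bucket,
# and the old one can then be dropped wholesale.
bucket = copy_forward(files, bucket)
```

Reachability from the file key lists is the test for "currently referenced" here; anything not copied is garbage by definition.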
--
Les Mikesell
lesmikes...@gmail.com