On Thu, Nov 3, 2011 at 02:39, Justin Karneges <jus...@affinix.com> wrote:
>...
> Say you have an operation that requires creating two keys, A and B, and you
> succeed in creating A but fail in creating B. How do you delete A after the
> fact? I have two ideas:
>
> 1) Run periodic MapReduce operations that do full db scans looking for garbage
> keys and deleting them (this seems really horrible, but I'll admit I'm new to
> distributed DBs and MapReduce).

I believe that you will *always* need to do this. Without transactions, you
can always end up with cruft. Best you can do is minimize how often you need
to run the scavenge process.

> 2) Maintain cleanup logs that explicitly identify possibly offending keys, for
> optimized cleanup processing.

These logs need to be stored *somewhere*, but that storage could also fail.
That is why I believe you'll need a periodic full scan for garbage.

(and note this applies whether "storage" is memory, disk, Riak, or whatever
else)

>...
> So far so good. Now for handling cleanup. Periodically, we scan the
> "cleanup" bucket for keys to process. Since keys only exist in this bucket at
> the moment of a write (they are deleted immediately afterwards), in practice
> there should hardly be any keys in here at any single point in time. We're
> talking single digits here. Much better than a full db scan to find garbage
> keys. Also, the keys to process can be narrowed down by time (e.g. > 5
> minutes ago) based on the key name.

This will minimize your scans, but not eliminate them. You may not be able to
write to the "cleanup" bucket because you've lost all network connectivity to
the Riak cluster. Not a bad assumption, given that you could not write out B
(what makes you think you could write to "cleanup"?).

Personally, rather than attempting to write something else to a failing Riak
cluster, I'd suggest keeping these keys in memory along with a background
thread that periodically attempts to clean them up. You're gonna lose the keys
if the client dies, but hey... as I said: best you can do is to minimize the
full scans.

>...

Cheers,
-g

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
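
For illustration, the in-memory approach suggested above (keep the possibly orphaned keys in memory and let a background thread retry the deletes) could look roughly like the sketch below. Everything named here is an assumption made for the example: `PendingCleanup`, `delete_key`, and the retry interval are hypothetical, and `delete_key` stands in for whatever call your Riak client uses to delete a key. It is not taken from any Riak client library.

```python
import threading


class PendingCleanup:
    """Remember keys that may be garbage and retry deleting them in the background."""

    def __init__(self, delete_key, interval_seconds=30.0):
        # delete_key is a hypothetical callable(key) that raises on failure.
        self._delete_key = delete_key
        self._interval = interval_seconds
        self._keys = set()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread.is_alive():
            self._thread.join()

    def add(self, key):
        """Record a key (e.g. A, after writing B failed) for later deletion."""
        with self._lock:
            self._keys.add(key)

    def pending(self):
        """Return a snapshot of the keys still waiting to be deleted."""
        with self._lock:
            return set(self._keys)

    def retry_once(self):
        """Try to delete every pending key; keep the ones that still fail."""
        for key in self.pending():
            try:
                self._delete_key(key)
            except Exception:
                continue  # store still unreachable; try again on the next pass
            with self._lock:
                self._keys.discard(key)

    def _run(self):
        # Event.wait returns False on timeout, True once stop() has been called.
        while not self._stop.wait(self._interval):
            self.retry_once()
```

A writer would call `add("A")` when creating B fails, and the thread started by `start()` keeps retrying until the delete goes through. As noted above, the pending set is lost if the client process dies, so this only reduces how often the full scavenge scan is needed; it does not replace it.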