----- Original Message -----
From: "Andrew Stone" <ast...@basho.com>
To: "Jason Campbell" <xia...@xiaclo.net>
Cc: "Sean Cribbs" <s...@basho.com>, "riak-users" <riak-users@lists.basho.com>, "Viable Nisei" <vsni...@gmail.com>
Sent: Saturday, 21 December, 2013 10:01:29 AM
Subject: Re: May allow_mult cause DoS?
> Think of an object with thousands of siblings. That's an object that has 1
> copy of the data for each sibling. That object could be on the order of 100s
> of megabytes. Everytime an object is read off disk and returned to the client
> 100mb is being transferred. Furthermore leveldb must rewrite the entire 100mb
> to disk everytime a new sibling is added. And it just got larger with that
> write. If a merge occurs, the amount of data is a single copy of the data at
> that key instead of what amounts to approximately 10000 copies of the same
> sized data, when all you care about is one of those 10,000.

This makes sense for concurrent writes, but the use case that was being talked about was siblings with no parent object. In that case, there shouldn't be much difference at all, since each sibling is just the data that was inserted. I understand the original use case being discussed was tens of millions of objects, and the metadata alone would likely exceed recommended object sizes in Riak.

I've mentioned my use case before, which is trying to get fast writes on large objects. I abuse siblings to some extent, although by the nature of the data, there will never be more than a few thousand small siblings (under a hundred bytes). I merge them on read and write the updated object back. Even with sibling metadata, I doubt the bloated object is over a few MB, especially with snappy compression which handles duplicate content quite well. Even if Riak merges the object on every write, it's still much faster than transferring the whole object over the network every time I want to do a small write.

Is there a more efficient way to do this? I thought about writing single objects and using a custom index, but that results in a read and 2 writes, and the index could grow quite large compared to the amount of data I'm writing.
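
To make the merge-on-read step concrete, here is a minimal sketch in Python of what I mean. It is only an illustration under assumptions: I'm pretending each sibling value is one or more newline-delimited records, that duplicates can be dropped, and that first-seen order is good enough. `merge_siblings` is a made-up helper name, and no Riak client calls are shown; fetching the siblings and storing the merged value back would go through whatever client you use.

```python
from typing import Iterable, List


def merge_siblings(siblings: Iterable[bytes]) -> bytes:
    """Collapse sibling values into a single newline-delimited object.

    Assumption (not from Riak itself): every sibling holds one or more
    small records separated by b"\n", and duplicate records are redundant.
    """
    seen = set()
    merged: List[bytes] = []
    for value in siblings:
        # A previously merged object is just another sibling with many records.
        for record in value.split(b"\n"):
            if record and record not in seen:
                seen.add(record)
                merged.append(record)
    return b"\n".join(merged)


if __name__ == "__main__":
    # One already-merged object plus two small blind writes.
    values = [b"a\nb", b"c", b"b"]
    print(merge_siblings(values))
```

On read, the client would pass every sibling value to the helper and write the result back with the vector clock from the fetch, so that the siblings are replaced by the single merged object.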

Thanks,
Jason