It makes sense, David. I'm going to give it a try. Hopefully this will make it usable for the next month until the issue is addressed.
I'll let you know how it goes.

Thanks,
Marco

On 6 March 2012 15:19, David Smith <[email protected]> wrote:
> On Mon, Mar 5, 2012 at 9:55 PM, Marco Monteiro <[email protected]> wrote:
>
> > I'm using riak-js and the error I get is:
> >
> >   { [Error: socket hang up] code: 'ECONNRESET' }
>
> That is a strange error -- are there any corresponding errors in the
> server logs? I would have expected a timeout or some such...
>
> > UUIDs. They are created by Riak. All my queries use 2i. The 2i are
> > integers (representing seconds) and random strings (length 16) used
> > as identifiers for user sessions and similar.
>
> So, this explains why the problem goes away when you switch to an
> empty bucket. A bit of background...
>
> If you're using the functionality in Riak that automatically generates
> a UUID on PUT, you're going to get a uniformly distributed 160-bit
> number (since the implementation SHA-1 hashes the input). This sort of
> distribution is great for uniqueness, since there is roughly a 1 in
> 2^160 chance that you will encounter another identical ID. It can be
> very bad from a caching perspective, however, if you have a cache that
> uses pages of information for locality purposes. In a scheme such as
> this (which is what LevelDB uses), the system will wind up churning
> the cache constantly, since the odds are quite low that the next UUID
> to be accessed will already be in memory (remember, uniform
> distribution of keys).
>
> LevelDB also makes this pathological case a bit worse by not having
> bloom filters -- when inserting a new UUID, you will potentially have
> to do 7 disk seeks just to determine that the UUID is not present. The
> Google team is working to address this problem, but I'm guessing it'll
> be a month or so before that's done, and then we have to integrate it
> with Riak -- so we can't count on that just yet.
>
> Now, all is not lost. :)
>
> If you craft your keys so that there is some temporal locality _and_
> the access pattern of your keys has some sort of exponential-ish
> decay, you can still get very good performance out of LevelDB. One
> simple way to do this is to prefix the current date-time on the front
> of the UUID, like so:
>
>   201203060806-<uuid> (YMDhm-UUID)
>
> You could also use seconds since the epoch, etc. This has the effect
> of keeping recently accessed/hot UUIDs on (or close to) the same
> cache page, which lets you avoid a lot of cache churn and typically
> improves LevelDB performance dramatically.
>
> Does this help/make sense?
>
> D.
>
> --
> Dave Smith
> VP, Engineering
> Basho Technologies, Inc.
> [email protected]
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
