Trade-offs. I dislike rewriting stuff that doesn't scale. I love the idea of just throwing another box into a cluster and having it "just work" without rebalancing issues, etc. I'm tired of dealing with shards, complex replication setups, etc.
I will also be using Riak heavily for logging, URL shortening, social share registering, etc., and I'd rather stick to one datastore. I agree it's a bit of a round peg in an ovoid hole when it comes to geospatial + mapreduce queries, but I imagine it won't be long until geospatial indexes arrive at Riak as well, and then I can switch to those. :-)

-Mark

On Tue, May 1, 2012 at 2:25 PM, Alexander Sicular <sicul...@gmail.com> wrote:

> Hey, I'm as up for a good and clever hack as anybody. But the question is
> just because you can, should you? Who will maintain your hack after you're
> dead? I'm still maintaining crap I wrote years ago. Even though I'm paid,
> sometimes I would rather not have the headache. Why would you use a product
> that specifically does not support such hackery? Scaling postgres or mongo
> are known and solvable problems especially concerning bounded data sets,
> like, say, all points on a globe. Now if you were storing checkins, that
> would be a different problem. One suitable for, say, Riak.
>
> On Tue, May 1, 2012 at 14:09, Mark Rose <markr...@markrose.ca> wrote:
>
>> Well, I'd be indexing items over the entire globe. I'd be looking at
>> resolutions from an entire world view down to city block. I'm thinking of
>> using geohashes as an index to restrict the result set, then further
>> filtering and sorting by mapreducing the remaining items. So I only need
>> enough granularity to reduce the number of items to a reasonable amount. At
>> the world view level, I'd filter out most results using mapreduce, but the
>> local-level queries would be far more common so an index would be highly
>> advantageous. The geometry I'd want to query would be a window that
>> arbitrarily overlaps one or more geohash regions. Basically, think plotting
>> items in, say, Google Maps.
>>
>> Can you use a secondary index inside mapreduce? I haven't seen any
>> examples of it. I have only seen a secondary index being used to feed a
>> mapreduce. I am new to Riak.
>>
>> I imagine my number of points would be at most 100 items per square km,
>> but typically less than 1 per square km. A 1 km resolution would be
>> sufficient. A 32 bit geohash would cover that fine. Vast regions of the
>> Earth would contain no points at all.
>>
>> -Mark
>>
>>
>> On Tue, May 1, 2012 at 1:16 PM, Sean Cribbs <s...@basho.com> wrote:
>>
>>> In contrast to Alexander's assessment, I'd say "it depends". I have
>>> built some geospatial indexes on top of Riak using a geohashing scheme
>>> based on the Hilbert space-filling curve. However, I had to choose specific
>>> levels of "zoom" and precompute them. Now that we have secondary indexes,
>>> you could perhaps bypass the precomputation step. In general, if you know
>>> the geometry of the space you want to query, you can fairly trivially
>>> compute the names of the geohashes you need to look up and then either
>>> fetch individual keys for those (if you precompute them), or use MapReduce
>>> to fetch a range of them. It's not automatic, for sure, but the greatest
>>> complexity will be in deciding which granularities of index to support.
>>>
>>> On Tue, May 1, 2012 at 12:44 PM, Alexander Sicular
>>> <sicul...@gmail.com> wrote:
>>>
>>>> My advice is to not use Riak. Check mongo or Postgres.
>>>>
>>>>
>>>> @siculars on twitter
>>>> http://siculars.posterous.com
>>>>
>>>> Sent from my iRotaryPhone
>>>>
>>>> On May 1, 2012, at 9:18, Mark Rose <markr...@markrose.ca> wrote:
>>>>
>>>> > Hello everyone!
>>>> >
>>>> > I'm going to be implementing Riak as a storage engine for geographic
>>>> > data. Research has led me to using geohashing as a useful way to filter
>>>> > out results outside of a region of interest. However, I've run into some
>>>> > stumbling blocks and I'm looking for advice on the best way to proceed.
>>>> >
>>>> > Querying efficiently by geohash involves querying several regions
>>>> > around a point. From what I can tell, Riak offers no way to query a
>>>> > secondary index with multiple ranges. Having to query several ranges,
>>>> > merge them in the application layer, then pass them off to mapreduce seems
>>>> > rather silly (and could mean passing GBs of data). Alternatively, I could
>>>> > start straight with mapreduce, but key filtering seems to work only with
>>>> > the primary key, which would force me into using the geohashed location as
>>>> > the primary key (which would lead to collisions if two things existed at
>>>> > the same point). I'd also like to avoid using the geohash as the primary
>>>> > key because, if the item moves, I'd have to change all the references to it.
>>>> > Lastly, I could do a less efficient mapreduce over a less precise geohash,
>>>> > but this doesn't solve the issue of the equator (anything near the equator
>>>> > would require mapreducing the entire dataset).
>>>> >
>>>> > Is there any way to query multiple ranges with a secondary index and
>>>> > pass that off to mapreduce? Or should I just stick with the less efficient
>>>> > mapreduce, and when near the equator, run two queries and later merge them?
>>>> > Or am I going about this the wrong way?
>>>> >
>>>> > In any case, the final stage of my queries will involve mapreduce as
>>>> > I'll need to further filter the items found in a region.
>>>> >
>>>> > Thank you,
>>>> > Mark
>>>> > _______________________________________________
>>>> > riak-users mailing list
>>>> > riak-users@lists.basho.com
>>>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>
>>>
>>> --
>>> Sean Cribbs <s...@basho.com>
>>> Software Engineer
>>> Basho Technologies, Inc.
>>> http://basho.com/
>>>
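
For readers following the thread, here is a rough sketch of the idea discussed above: store a 32-bit integer geohash per item, then turn a query window into a handful of integer ranges that could each be looked up in a range-capable index, with finer filtering left to a later step. It is only an illustration under stated assumptions. It uses plain bit interleaving (longitude bit first), not the Hilbert-curve scheme Sean mentions; the function names (interleave, cell_index, encode, covering_ranges) and the 16-bit prefix are hypothetical choices; it does not handle windows that cross the antimeridian; and it makes no Riak client calls. With 16 bits per axis, a cell is about 360/65536 degrees of longitude wide (roughly 0.6 km at the equator) and 180/65536 degrees of latitude tall (roughly 0.3 km), which is consistent with the 1 km resolution mentioned in the thread.

```python
def interleave(x, y, half_bits):
    """Interleave the bits of x (longitude cell) and y (latitude cell),
    most significant bit first, longitude bit before latitude bit."""
    code = 0
    for i in reversed(range(half_bits)):
        code = (code << 2) | (((x >> i) & 1) << 1) | ((y >> i) & 1)
    return code


def cell_index(value, lo, hi, half_bits):
    """Map a coordinate in [lo, hi] to a cell number in [0, 2**half_bits - 1]."""
    n = 1 << half_bits
    idx = int((value - lo) / (hi - lo) * n)
    return min(max(idx, 0), n - 1)  # clamp so that value == hi lands in the last cell


def encode(lat, lon, bits=32):
    """Integer geohash of a point: bits/2 bits of longitude and of latitude."""
    half = bits // 2
    return interleave(cell_index(lon, -180, 180, half),
                      cell_index(lat, -90, 90, half), half)


def covering_ranges(south, west, north, east, prefix_bits=16, bits=32):
    """Return sorted (lo, hi) integer ranges of full-length geohashes that
    together cover the window. Assumes west <= east and south <= north."""
    half = prefix_bits // 2
    shift = bits - prefix_bits
    x0, x1 = cell_index(west, -180, 180, half), cell_index(east, -180, 180, half)
    y0, y1 = cell_index(south, -90, 90, half), cell_index(north, -90, 90, half)
    prefixes = sorted(interleave(x, y, half)
                      for x in range(x0, x1 + 1)
                      for y in range(y0, y1 + 1))
    ranges = []
    for p in prefixes:
        lo, hi = p << shift, ((p + 1) << shift) - 1
        if ranges and ranges[-1][1] + 1 == lo:
            ranges[-1] = (ranges[-1][0], hi)  # merge with the adjacent range
        else:
            ranges.append((lo, hi))
    return ranges


if __name__ == "__main__":
    # A small window straddling the equator near 10 degrees east.
    for lo, hi in covering_ranges(-0.01, 10.0, 0.01, 10.02):
        print(lo, hi)
```

For that example window the function returns two disjoint ranges, one for each side of the equator, because cells north and south of it differ in the first latitude bit. That is the "equator issue" raised in the original question: the window is tiny, yet no single fine-grained range covers it, so the application has to issue one lookup per range and merge the results (or fall back to a much coarser prefix).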