We have some prototypes for how to expand Pipe's capabilities, thanks to Chris Meiklejohn. We have not exposed Pipe directly, and I honestly think it may be fundamentally the wrong level of abstraction to present to users. It could be a way to implement higher-level query/processing features, but that has not panned out and is low on our priorities at this time.

On Thu, Aug 21, 2014 at 11:55 AM, Alexander Sicular <sicul...@gmail.com> wrote:
> Re. Riak pipes. What's the latest regarding accessing the pipe framework?
> Haven't heard toooo much about it lately, admittedly haven't been listening
> toooo hard either. The thought would be to do "storm"ish stream processing in
> situ.
>
>
> @siculars
> http://siculars.posthaven.com
>
> Sent from my iRotaryPhone
>
>> On Aug 20, 2014, at 22:28, Sargun Dhillon <sar...@sargun.me> wrote:
>>
>> I second John's opinions. Generally, I would have one key which
>> is the secondary index, being an observed-remove set (or whichever
>> type suits your application, be it a register, G-Set, or a plain old
>> OR-Set) pointing back to the keys. Unfortunately, this mechanism
>> can become quite unwieldy when you have a term with a high
>> cardinality.
>>
>> Now, moving on to the Twitter use case, you care a lot about the speed.
>> With this strategy, if you're doing this from a client where you (1)
>> read the 2i OR-Set, and then (2) read the keys, that can be expensive
>> as you have to read the entire 2i value back to the Riak client before
>> reading any of the keys. For example, the hashtag #beiber would have
>> high cardinality and result in a super large value, and reading that
>> back over the network would be less than awesome. Also, having to pass
>> this value around disterl would be poor. Fortunately, the folks at
>> Basho have invented riak_pipe. Riak_pipe is a method to allow running
>> the read locally on the node the 2i lives on, and then streaming reads
>> for all of those keys to the nodes that they live on, and then all
>> back to the reader. It's actually the framework that Riak MR uses
>> under the hood.
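
The OR-Set index and the two-step read described in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain Python dict stands in for Riak, and `ORSet`, `index_tweet`, and `tweets_for` are hypothetical names, not part of any Riak client or of riak_pipe.

```python
import uuid


class ORSet:
    """Toy observed-remove set (hypothetical, for illustration only).

    Every add is stamped with a unique tag. A remove tombstones only the
    tags it has observed, so an add made concurrently on another replica
    survives a later merge.
    """

    def __init__(self):
        self.adds = {}        # element -> set of unique add tags
        self.removed = set()  # tags that have been tombstoned

    def add(self, element):
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element):
        # Tombstone only the tags this replica has seen so far.
        self.removed |= self.adds.get(element, set())

    def merge(self, other):
        for element, tags in other.adds.items():
            self.adds.setdefault(element, set()).update(tags)
        self.removed |= other.removed

    def value(self):
        # An element is present if at least one of its tags is still live.
        return {e for e, tags in self.adds.items() if tags - self.removed}


def index_tweet(index, store, hashtag, tweet_key, body):
    """Write the tweet, then record its key under the hashtag's index entry."""
    store[tweet_key] = body
    index.setdefault(hashtag, ORSet()).add(tweet_key)


def tweets_for(index, store, hashtag):
    """Two-step read: (1) fetch the whole index value, (2) fetch each key."""
    keys = index[hashtag].value() if hashtag in index else set()
    return [store[k] for k in sorted(keys)]
```

Step (1) returns the entire set before any tweet can be fetched, which is the cost the message points out for high-cardinality terms.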
>> Talk: https://vimeo.com/53910999 (there might be newer ones as well)
>> Docs: https://github.com/basho/riak_pipe
>>
>> Also, to deal with high-cardinality values, there are a variety of
>> workarounds, such as sharding the secondary index to some known set
>> of keys, and doing a read across this list of keys. Also, you can
>> postfix a nonce to the 2i-key, and ensure that they all end up on one
>> node (custom hashing function), or a subset of nodes, and utilize
>> leveldb's key iteration over a range to handle this.
>>
>> The general pattern I like for 2i atop Riak is to specialize Peter
>> Bailis's (UC Berkeley) work on RAMP. If you build the framework
>> for this, it'll be all sorts of useful in the future. One warning is
>> that there is no easy way to garbage collect in Riak today.
>> Paper: http://www.bailis.org/papers/ramp-sigmod2014.pdf
>> Talk: https://www.youtube.com/watch?v=_rAdJkAbGls
>>
>> None of these methods gracefully handle range queries. You can do
>> clever things with your 2i to handle this, but the Twitter use case
>> wouldn't need ranges.
>>
>>> On Wed, Aug 20, 2014 at 12:28 PM, John Daily <jda...@basho.com> wrote:
>>> I don't have benchmarks to discuss query performance for different tools at
>>> different sizes, but I'd like to point out that the ultimate search tool for
>>> Riak is to not search at all.
>>>
>>> Riak Search, 2i, MapReduce are all capable tools, but they don't scale
>>> nearly as well as straight key/value requests, and it is often possible to
>>> model your data around the latter.
>>>
>>> I covered this in https://basho.com/riak-development-anti-patterns/ and the
>>> next edition of Eric Redmond's Little Riak Book (http://littleriakbook.com)
>>> will have more discussion on the topic, but if at all possible, create your
>>> query results as reports as the data is ingested, instead of attempting to
>>> find it all later.
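
Two suggestions from the thread, sharding the index across a known set of keys and building query results at ingest time, can be combined in a rough sketch like the one below. It assumes a plain dict in place of Riak, a fixed shard count chosen up front, and hypothetical helper names (`shard_key`, `ingest`, `lookup`). In a real cluster the append to a shard would also need conflict handling (for example a set data type), which is left out here.

```python
import zlib

NUM_SHARDS = 8  # assumption: a fixed, known shard count


def shard_key(hashtag, tweet_key, num_shards=NUM_SHARDS):
    """Pick one of the known index keys for this tweet, deterministically."""
    shard = zlib.crc32(tweet_key.encode("utf-8")) % num_shards
    return f"hashtag/{hashtag}/{shard}"


def ingest(store, hashtag, tweet_key, body):
    """Store the tweet and update the hashtag 'report' at write time."""
    store[tweet_key] = body
    store.setdefault(shard_key(hashtag, tweet_key), []).append(tweet_key)


def lookup(store, hashtag, num_shards=NUM_SHARDS):
    """Read every shard with plain key/value gets, then fetch the tweets."""
    keys = []
    for shard in range(num_shards):
        keys.extend(store.get(f"hashtag/{hashtag}/{shard}", []))
    return [store[k] for k in keys]


store = {}
for i in range(20):
    ingest(store, "riak", f"tweet-{i}", f"body of tweet {i}")
tweets = lookup(store, "riak")
```

Because the shard keys are known in advance, the lookup is a bounded number of key/value reads and no single index value has to hold every tweet key for a popular hashtag.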
>>>
>>> -John
>>>
>>>
>>>
>>> On Wed, Aug 20, 2014 at 3:21 PM, Alex De la rosa <alex.rosa....@gmail.com>
>>> wrote:
>>>>
>>>> Any thoughts about this?
>>>>
>>>> One thing that worries me about Riak Search is that if one index has several
>>>> million objects to search, maybe it becomes slow? 2i might be faster
>>>> then?
>>>>
>>>> Thanks!
>>>> Alex
>>>>
>>>>
>>>> On Tue, Aug 19, 2014 at 8:47 AM, Alex De la rosa <alex.rosa....@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi there,
>>>>>
>>>>> Lately I have been seeing Riak Search presented as the ultimate way to
>>>>> query Riak... and it seems to be recommended over MapReduce and even 2i...
>>>>> that said, should we always try to use Riak Search over the other systems?
>>>>>
>>>>> Is there any situation in which MapReduce could be a better approach than
>>>>> Riak Search?
>>>>>
>>>>> Same goes for 2i... I believe 2i is an optimal approach if you just want
>>>>> keys and know very well what you are looking for, but beyond that, should
>>>>> Riak Search replace all 2i uses?
>>>>>
>>>>> Practical example: If you are Twitter and want to get tweets for the
>>>>> hashtag #Riak, what would be the best approach? 2i? Riak Search?
>>>>> MapReduce?
>>>>>
>>>>> Thanks!
>>>>> Alex
>>>>
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users@lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

--
Sean Cribbs <s...@basho.com>
Software Engineer
Basho Technologies, Inc.
http://basho.com/