We have some prototypes for how to expand Pipe's capabilities thanks
to Chris Meiklejohn. We have not exposed it directly and I honestly
think it may be the fundamentally wrong level of abstraction to
present to users. It could be a way to implement higher-level
query/processing features, but that has not panned out and is low on
our priorities at this time.

On Thu, Aug 21, 2014 at 11:55 AM, Alexander Sicular <sicul...@gmail.com> wrote:
> Re. Riak pipes. What's the latest regarding accessing the pipe framework? 
> Haven't heard toooo much about it lately, admittedly haven't been listening 
> toooo hard either. The thought would be to do "storm"ish stream processing in 
> situ.
>
>
> @siculars
> http://siculars.posthaven.com
>
> Sent from my iRotaryPhone
>
>> On Aug 20, 2014, at 22:28, Sargun Dhillon <sar...@sargun.me> wrote:
>>
>> I second John's opinions. Generally, I would have one key which
>> is the secondary index, being an observed-remove set (OR-Set) or
>> another type relevant to your application (a register, a G-Set, or a
>> plain old OR-Set), pointing back to the keys. Unfortunately, this
>> mechanism can become quite unwieldy when you have a term with high
>> cardinality.
>>
>> Now, moving onto the Twitter use case, you care a lot about the speed.
>> With this strategy, if you're doing this from a client where you (1)
>> read the 2i OR-set, and then (2) read the keys, that can be expensive
>> as you have to read the entire 2i value back to the Riak client before
>> reading any of the keys. For example, the hashtag #beiber would have
>> high cardinality, resulting in a very large value, and reading that
>> back over the network would be less than awesome. Also, having to pass
>> this value around over disterl would be poor. Fortunately, the folks at
>> Basho have invented riak_pipe. Riak_pipe is a method to allow running
>> the read locally on the node the 2i lives on, and then streaming reads
>> for all of those keys to the nodes that they live on, and then all
>> back to the reader. It's actually the framework that Riak MR uses
>> under the hood.
>> Talk: https://vimeo.com/53910999 (there might be newer ones as well)
>> Docs: https://github.com/basho/riak_pipe
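
A minimal sketch of the two-step client-side read described above, assuming plain Python dicts as stand-ins for the tweet bucket and the index bucket. The names `put_tweet` and `tweets_for` are hypothetical, this is not the Riak client API, and a built-in `set` is only a placeholder for an OR-Set (it does no conflict resolution).

```python
from typing import Dict, List, Set

# Hypothetical in-memory stand-ins for two buckets; not the Riak client API.
tweets: Dict[str, str] = {}
index: Dict[str, Set[str]] = {}


def put_tweet(key: str, text: str, tags: List[str]) -> None:
    """Store the tweet, then add its key to the index set of each tag."""
    tweets[key] = text
    for tag in tags:
        index.setdefault(tag, set()).add(key)


def tweets_for(tag: str) -> List[str]:
    """Read a tag's tweets the expensive way: whole index first, then keys."""
    # Step 1: the entire index value for the tag comes back to the client.
    keys = index.get(tag, set())
    # Step 2: only now can the individual keys be fetched, one read each.
    return [tweets[k] for k in sorted(keys)]


put_tweet("t1", "hello #riak", ["riak"])
put_tweet("t2", "more #riak", ["riak"])
print(tweets_for("riak"))
```

The point of the sketch is the ordering: nothing in step 2 can start until the whole value from step 1 has crossed the network, which is what hurts when a term has high cardinality.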
>>
>> Also, to deal with high-cardinality values, there are a variety of
>> workarounds, such as sharding the secondary index across some known set
>> of keys, and doing a read across this list of keys. Also, you can
>> postfix a nonce to the 2i key, and ensure that they all end up on one
>> node (custom hashing function), or a subset of nodes, and utilize
>> leveldb's key iteration over a range to handle this.
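
The first workaround above (sharding the index across a known set of keys) can be sketched as follows. This is an illustration under stated assumptions: the shard count, the `term/N` key layout, and the function names are all hypothetical, and a dict stands in for the store rather than any real Riak API.

```python
import hashlib
from typing import Dict, List, Set

NUM_SHARDS = 8  # assumed fixed shard count, known to every reader and writer

# Hypothetical in-memory stand-in for the store; not the Riak client API.
store: Dict[str, Set[str]] = {}


def shard_keys(term: str) -> List[str]:
    """The known, fixed list of index keys for a term."""
    return [f"{term}/{i}" for i in range(NUM_SHARDS)]


def shard_for(term: str, member: str) -> str:
    """Pick a shard deterministically from a hash of the member key."""
    digest = hashlib.sha1(member.encode("utf-8")).digest()
    return f"{term}/{int.from_bytes(digest[:4], 'big') % NUM_SHARDS}"


def index_add(term: str, member: str) -> None:
    """Write the member into exactly one shard of the term's index."""
    store.setdefault(shard_for(term, member), set()).add(member)


def index_read(term: str) -> Set[str]:
    """Read every shard key and union the results."""
    members: Set[str] = set()
    for key in shard_keys(term):
        members |= store.get(key, set())
    return members
```

Each shard value stays roughly 1/NUM_SHARDS the size of a single index value, at the cost of NUM_SHARDS reads per query; those reads are independent and could be issued in parallel.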
>>
>> The general pattern I like for 2i atop Riak is to specialize the RAMP
>> work of Peter Bailis at UC Berkeley. If you build the framework
>> for this, it'll be all sorts of useful in the future. One warning is
>> that there is no easy way to garbage collect in Riak today.
>> Paper: http://www.bailis.org/papers/ramp-sigmod2014.pdf
>> Talk: https://www.youtube.com/watch?v=_rAdJkAbGls
>>
>> None of these methods gracefully handle range queries. You can do
>> clever things with your 2i to handle this, but the Twitter use case
>> wouldn't need ranges.
>>
>>> On Wed, Aug 20, 2014 at 12:28 PM, John Daily <jda...@basho.com> wrote:
>>> I don't have benchmarks to discuss query performance for different tools at
>>> different sizes, but I'd like to point out that the ultimate search tool for
>>> Riak is to not search at all.
>>>
>>> Riak Search, 2i, MapReduce are all capable tools, but they don't scale
>>> nearly as well as straight key/value requests, and it is often possible to
>>> model your data around the latter.
>>>
>>> I covered this in https://basho.com/riak-development-anti-patterns/ and the
>>> next edition of Eric Redmond's Little Riak Book (http://littleriakbook.com)
>>> will have more discussion on the topic, but if at all possible, create your
>>> query results as reports as the data is ingested, instead of attempting to
>>> find it all later.
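
A rough sketch of the "build the report at ingest time" idea above, applied to the hashtag example from this thread. The key naming (`hashtag:<tag>`), the regular expression, and the helper names are assumptions for illustration, and dicts stand in for Riak buckets.

```python
import re
from collections import defaultdict
from typing import Dict, List

HASHTAG = re.compile(r"#(\w+)")

# Hypothetical in-memory stand-ins for two buckets; not the Riak client API.
tweets: Dict[str, str] = {}
reports: Dict[str, List[str]] = defaultdict(list)


def ingest(tweet_id: str, text: str) -> None:
    """Store the tweet and update each hashtag's report as it arrives."""
    tweets[tweet_id] = text
    # Lower-case and de-duplicate so "#Riak #riak" is recorded once.
    for tag in {t.lower() for t in HASHTAG.findall(text)}:
        reports[f"hashtag:{tag}"].append(tweet_id)


def tweets_for(tag: str) -> List[str]:
    """Answer the query with one report lookup, then plain key reads."""
    ids = reports.get(f"hashtag:{tag.lower()}", [])
    return [tweets[i] for i in ids]


ingest("t1", "Loving #Riak")
ingest("t2", "#riak and #erlang")
ingest("t3", "no tags here")
print(tweets_for("Riak"))
```

The query side never scans or searches; all the work is done once, on write, and reads stay straight key/value requests.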
>>>
>>> -John
>>>
>>>
>>>
>>> On Wed, Aug 20, 2014 at 3:21 PM, Alex De la rosa <alex.rosa....@gmail.com>
>>> wrote:
>>>>
>>>> Any thoughts about this?
>>>>
>>>> One thing that worries me about Riak Search is that if one index has
>>>> several million objects to search, maybe it becomes slow? Might 2i be
>>>> faster then?
>>>>
>>>> Thanks!
>>>> Alex
>>>>
>>>>
>>>> On Tue, Aug 19, 2014 at 8:47 AM, Alex De la rosa <alex.rosa....@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi there,
>>>>>
>>>>> Lately I have been seeing Riak Search presented as the ultimate way to
>>>>> query Riak... and it seems to be recommended over MapReduce and even
>>>>> 2i... that said, should we always try to use Riak Search over the other
>>>>> systems?
>>>>>
>>>>> Is there any situation in which MapReduce could be a better approach than
>>>>> Riak Search?
>>>>>
>>>>> Same goes for 2i... I believe 2i is an optimal approach if you just want
>>>>> keys and know very well what you are looking for, but beyond that, should
>>>>> Riak Search replace all 2i uses?
>>>>>
>>>>> Practical example: If you are Twitter and want to get tweets for the
>>>>> hashtag #Riak, what would be the best approach? 2i? Riak Search?
>>>>> MapReduce?
>>>>>
>>>>> Thanks!
>>>>> Alex
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users@lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



-- 
Sean Cribbs <s...@basho.com>
Software Engineer
Basho Technologies, Inc.
http://basho.com/

