I see, I am gonna try this.
Thanks Hequn
*--*
*-- Felipe Gutierrez*

*-- skype: felipe.o.gutierrez*
*--* *https://felipeogutierrez.blogspot.com
<https://felipeogutierrez.blogspot.com>*


On Thu, Aug 15, 2019 at 4:01 AM Hequn Cheng <chenghe...@gmail.com> wrote:

> Hi Felipe,
>
> If I understand correctly, you also have to decide whether to broadcast
> the datastream from the right side before performing the function?
>
> One option is you can add a Util method to join dynamically, e.g.,
> Util.joinDynamically(ds1, ds2). In the util method, you can implement your
> own strategy logic and decide whether to broadcast or use CoProcessFunction.
>
> Best, Hequn
>
> On Wed, Aug 14, 2019 at 3:07 PM Felipe Gutierrez <
> felipe.o.gutier...@gmail.com> wrote:
>
>> Hi Hequn,
>>
>> I am implementing the broadcast and the regular join. As you said I need
>> different functions. My question is more about if I can have an operator
>> which decides beteween broadcast and regular join dynamically. I suppose I
>> will have to extend the generic TwoInputStreamOperator in Flink. Do you
>> have any suggestion?
>>
>> Thanks
>>
>> On Wed, 14 Aug 2019, 03:59 Hequn Cheng, <chenghe...@gmail.com> wrote:
>>
>>> Hi Felipe,
>>>
>>> > I want to implement a join operator which can use different strategies
>>> for joining tuples.
>>> Not all kinds of join strategies can be applied to streaming jobs. Take
>>> sort-merge join as an example, it's impossible to sort an unbounded data.
>>> However, you can perform a window join and use the sort-merge strategy to
>>> join the data within a window. Even though, I'm not sure it's worth to do
>>> it considering the performance.
>>>
>>> > Therefore, I am not sure if I will need to implement my own operator
>>> to do this or if it is still possible to do with CoProcessFunction.
>>> You can't implement broadcast join with CoProcessFunction. But you can
>>> implement it with BroadcastProcessFunction or
>>> KeyedBroadcastProcessFunction, more details here[1].
>>>
>>> Furthermore, you can take a look at the implementation of both window
>>> join and non-window join in Table API & SQL[2]. The code can be found
>>> here[3].
>>> Hope this helps.
>>>
>>> Best, Hequn
>>>
>>> [1]
>>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
>>> [2]
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/sql.html#joins
>>> [3]
>>> https://github.com/apache/flink/tree/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/runtime/join
>>>
>>>
>>> On Tue, Aug 13, 2019 at 11:30 PM Felipe Gutierrez <
>>> felipe.o.gutier...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I want to implement a join operator which can use different strategies
>>>> for joining tuples. I saw that with CoProcessFunction I am able to
>>>> implement low-level joins [1]. However, I do know how to decide between
>>>> different algorithms to join my tuples.
>>>>
>>>> On the other hand, to do a broadcast join I will need to use the
>>>> broadcast operator [2] which yields a BroadcastStream. Therefore, I am not
>>>> sure if I will need to implement my own operator to do this or if it is
>>>> still possible to do with CoProcessFunction.
>>>>
>>>> Does anyone have some clues for this matter?
>>>> Thanks
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/process_function.html#low-level-joins
>>>> [2]
>>>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
>>>> *--*
>>>> *-- Felipe Gutierrez*
>>>>
>>>> *-- skype: felipe.o.gutierrez*
>>>> *--* *https://felipeogutierrez.blogspot.com
>>>> <https://felipeogutierrez.blogspot.com>*
>>>>
>>>

Reply via email to