[ 
https://issues.apache.org/jira/browse/FLINK-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203617#comment-17203617
 ] 

Jark Wu edited comment on FLINK-18830 at 9/29/20, 3:48 AM:
-----------------------------------------------------------

I agree with [~aljoscha]. I'm pretty sure the current window join in the DataStream 
API can't satisfy the needs of Table/SQL in terms of functionality and 
performance. That means we may have to write a customized implementation. 


was (Author: jark):
I agree with [~aljoscha]. I'm pretty sure the current window join in the DataStream 
API can't satisfy the needs of Table/SQL in terms of functionality and 
performance. That means we may need to have a customized implementation 
anyway. 

> JoinCoGroupFunction and FlatJoinCoGroupFunction work incorrectly for outer 
> join when one side of coGroup is empty
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-18830
>                 URL: https://issues.apache.org/jira/browse/FLINK-18830
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>    Affects Versions: 1.11.1
>            Reporter: liupengcheng
>            Priority: Major
>
> Currently, the {{JoinCoGroupFunction}} and {{FlatJoinCoGroupFunction}} in 
> JoinedStreams don't respect the join type: they are implemented as a 
> two-level nested loop over both inputs. This is incorrect for an outer join when 
> one side of the coGroup is empty.
> {code}
> public void coGroup(Iterable<T1> first, Iterable<T2> second, Collector<T> out) throws Exception {
>     for (T1 val1 : first) {
>         for (T2 val2 : second) {
>             wrappedFunction.join(val1, val2, out);
>         }
>     }
> }
> {code}
> The code above is the current implementation. Suppose the first input is 
> non-empty and the second input is an empty iterator: the inner loop body 
> never runs, so the join function ({{wrappedFunction}}) is never called, and 
> no data is emitted for a left outer join.
> I propose taking the join type into account here and handling this case. For 
> example, for a left outer join, we can emit a record with the right side set 
> to null if the right side is empty or no match can be found on the right side.
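
The left outer join handling proposed in the issue description could look roughly like the sketch below. It is only an illustration, not Flink code: {{LeftOuterCoGroupSketch}} and {{leftOuterCoGroup}} are hypothetical names, a plain {{BiFunction}} stands in for the wrapped join function, and a returned list stands in for the {{Collector}}. It also assumes the coGroup has already grouped elements by key and window, so an empty right side is the only unmatched case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BiFunction;

public class LeftOuterCoGroupSketch {

    /**
     * Hypothetical helper, not part of Flink: joins the two sides of one
     * coGroup with left outer semantics. Each left element is paired with
     * every right element; if the right side yields nothing, the left
     * element is passed to the join function once with null on the right.
     */
    static <T1, T2, R> List<R> leftOuterCoGroup(
            Iterable<T1> first, Iterable<T2> second, BiFunction<T1, T2, R> join) {
        List<R> out = new ArrayList<>();
        for (T1 val1 : first) {
            boolean matched = false;
            for (T2 val2 : second) {
                matched = true;
                out.add(join.apply(val1, val2));
            }
            if (!matched) {
                // Right side is empty: still emit the left record.
                out.add(join.apply(val1, null));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> left = List.of("a", "b");
        List<Integer> right = Collections.emptyList();
        List<String> result = leftOuterCoGroup(left, right, (l, r) -> l + "=" + r);
        System.out.println(result); // prints [a=null, b=null]
    }
}
```

With a non-empty right side the helper behaves like the existing nested loop. A right or full outer join would need the symmetric check for an empty left side, which this sketch leaves out.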



--
This message was sent by Atlassian Jira
(v8.3.4#803005)