As confirmed with the community, this is a bug. More information is available in the issue: https://issues.apache.org/jira/browse/FLINK-22113

On Sat, Apr 3, 2021 at 8:43 PM Kai Fu <[email protected]> wrote:

> Hi team,
>
> We have a use case that joins multiple data sources to generate a
> continuously updated view. We defined a primary key constraint on all
> the input sources, and all the keys are subsets of the join condition.
> All joins are left joins.
>
> In our case, the first two inputs produce a *JoinKeyContainsUniqueKey*
> input spec, which is good and performant. However, the third input
> source is joined with the intermediate output table of the first two
> input tables, and the intermediate table does not carry key constraint
> information (although the third source input table does), so it results
> in a *NoUniqueKey* input spec. Given that NoUniqueKey inputs have
> dramatic performance implications, per the Force Join Unique Key
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Force-Join-Unique-Key-td39521.html#a39651>
> email thread, we want to know if there is any mitigation plan for this.
>
> One solution I can come up with is to write the intermediate result to
> some place like Kafka with a unique constraint and then join it with
> the third source, but that requires extra resources. Any other
> suggestions on this? Thanks.
>
> --
> *Best regards,*
> *- Kai*

--
*Best wishes,*
*- Kai*
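
For readers following the thread, here is a rough sketch of how the input spec depends on the key information available for each join input. It is a toy model in Python, not Flink code: the function `classify_input_spec` and the column name `id` are hypothetical, and the third label, `HasUniqueKey`, is not mentioned in the thread and is assumed here for completeness.

```python
# Toy model, NOT Flink code: label a join input by how its known unique
# keys relate to the columns of the equi-join condition.

def classify_input_spec(unique_keys, join_key):
    """Return a label for one join input.

    unique_keys: collection of column-name sets known to be unique for
                 the input (empty if no key information is available).
    join_key:    set of the input's columns used in the join condition.
    """
    join_cols = frozenset(join_key)
    # Ignore empty key sets so they cannot trivially match the join key.
    keys = [frozenset(k) for k in unique_keys if k]
    if not keys:
        return "NoUniqueKey"
    if any(k <= join_cols for k in keys):
        return "JoinKeyContainsUniqueKey"
    return "HasUniqueKey"  # assumed label, not taken from the thread


# A source table whose primary key (id) is a subset of the join key.
source_input = classify_input_spec([{"id"}], {"id"})

# The intermediate result of the first join, with key information lost,
# which is the situation described in the thread.
intermediate_input = classify_input_spec([], {"id"})

print(source_input)
print(intermediate_input)
```

In this model the intermediate table falls into the `NoUniqueKey` case only because its key set is empty, which matches the symptom described above: the primary key exists on the sources but is not propagated to the join output.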
