Re: UniqueKey constraint is lost with multiple sources join in SQL

Kai Fu Thu, 08 Apr 2021 05:30:22 -0700

As confirmed with the community, this is a bug. More information is in the issue:
https://issues.apache.org/jira/browse/FLINK-22113
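
For readers new to the terms in the quoted message below, here is a rough sketch of how a join input could be classified into the specs mentioned there. It is a simplified illustration, not Flink's actual planner code: the function name `classify_input_spec` and the column name `id` are hypothetical, and the rule (no unique key gives NoUniqueKey; a unique key fully covered by the join key gives JoinKeyContainsUniqueKey) is an assumption based on the spec names.

```python
from typing import Iterable, Set


def classify_input_spec(unique_keys: Iterable[Set[str]], join_key: Set[str]) -> str:
    """Classify one join input by its unique keys and the join key columns.

    Simplified model (an assumption, not Flink's real implementation).
    """
    join_cols = frozenset(join_key)
    keys = [frozenset(k) for k in unique_keys]
    if not keys:
        # The planner knows of no unique key on this input.
        return "NoUniqueKey"
    if any(k <= join_cols for k in keys):
        # At least one unique key is fully contained in the join key.
        return "JoinKeyContainsUniqueKey"
    # A unique key exists but is not covered by the join key.
    return "HasUniqueKey"


# A source table with primary key (id), joined on id.
source_spec = classify_input_spec([{"id"}], {"id"})

# The intermediate join result, for which no key information is carried.
intermediate_spec = classify_input_spec([], {"id"})

print(source_spec, intermediate_spec)
```

In this model the third join degrades only because the intermediate result arrives with an empty set of unique keys, which matches the symptom described below.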


On Sat, Apr 3, 2021 at 8:43 PM Kai Fu <[email protected]> wrote:

> Hi team,
>
> We have a use case that joins multiple data sources to generate a
> continuously updated view. We defined a primary key constraint on every
> input source, and each key is a subset of the columns in the join
> condition. All joins are left joins.
>
> In our case, the first two inputs produce the *JoinKeyContainsUniqueKey*
> input spec, which is good and performant. However, the third input
> source is joined with the intermediate output table of the first two
> input tables, and that intermediate table does not carry key constraint
> information (although the third source table does), so the join gets a
> *NoUniqueKey* input spec. Given that NoUniqueKey inputs have dramatic
> performance implications per the Force Join Unique Key
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Force-Join-Unique-Key-td39521.html#a39651>
> email thread, we want to know if there is any mitigation plan for this.
>
> One workaround I can think of is to write the intermediate result to
> something like Kafka with a unique constraint and then join it with the
> third source, but that requires extra resources. Any other suggestions?
> Thanks.
>
> --
> *Best regards,*
> *- Kai*
>


-- 
*Best wishes,*
*- Kai*
