Hi Gustavo,This sounds like a great idea.
I notice the link  
limitations<https://confluentinc.atlassian.net/wiki/spaces/FLINK/pages/4342875697/FLIP-516+Multi-Way+Join+Operator#Limitations>
 in the Flip points outside of the document to something I do not have access 
to. Please could you include the limitations in the flip itself.

You mention re ordered binary joins might be less efficient by turning them 
into a multi join. I wonder what the pros and cons are. I wonder can we use 
statistics to decide whether we should do a multi way join? In this case we 
could have an enum configuration something like: table.optimizer.join= 
binary-join, multi-join, auto.

Kind regards, David.


From: Arvid Heise <ar...@apache.org>
Date: Monday, 28 April 2025 at 12:47
To: dev@flink.apache.org <dev@flink.apache.org>
Subject: [EXTERNAL] Re: [DISCUSS] FLIP-516: Multi-Way Join Operator
Hi Gustavo,

the idea and approach LGTM. +1 to proceed.

Best,

Arvid

On Thu, Apr 24, 2025 at 4:58 PM Gustavo de Morais <gustavopg...@gmail.com>
wrote:

> Hi everyone,
>
> I'd like to propose FLIP-516: Multi-Way Join Operator [1] for discussion.
>
> Chained non-temporal joins in Flink SQL often cause a "big state issue" due
> to large intermediate results, impacting performance and stability. This
> FLIP introduces a StreamingMultiJoinOperator to tackle this by joining
> multiple inputs (that need to share a common key) simultaneously within one
> operator.
>
> The main goal is achieving zero intermediate state for these common join
> patterns, significantly reducing state size. This initial version requires
> a common partitioning key and focuses on INNER/LEFT joins, with plans for
> future expansion. The operator is opt-in via
> table.optimizer.multi-join.enabled (default false). PR with the initial
> version of the operator is available [2].
>
> Happy to be contributing to this community, and looking forward to your
> feedback and thoughts.
>
> Kind regards,
> Gustavo de Morais
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-516%3A+Multi-Way+Join+Operator
>
> [2] https://github.com/apache/flink/pull/26313
>

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: Building C, IBM Hursley Office, Hursley Park Road, 
Winchester, Hampshire SO21 2JN

Reply via email to