Hi,

I think the FLIP is in fairly good shape, +1 for the idea and the given
design. This may already be planned, but IMO the website should also cover
some high-level details and the pros and cons of enabling this feature,
beyond the config option description.
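
As a rough illustration of the kind of pro/con material meant here, the
following toy Python sketch (not code from the FLIP or the PR) counts the rows
held in state for a three-input inner join on a common key. The function name
`state_rows` and the counting model are assumptions made for illustration; the
sketch ignores retractions, state TTL, LEFT joins, and Flink's actual state
layout.

```python
from collections import Counter


def state_rows(a_keys, b_keys, c_keys):
    """Count rows held in state for an inner join of A, B, C on one common key.

    Toy model (an assumption, not Flink's implementation):
    - chained binary joins: (A join B) join C, where the second join must
      store every A-join-B match as its left input;
    - multi-way join: a single operator that stores only the raw inputs.
    """
    a, b, c = Counter(a_keys), Counter(b_keys), Counter(c_keys)
    inputs = sum(a.values()) + sum(b.values()) + sum(c.values())
    # Matches produced by A join B, i.e. the intermediate result per key.
    intermediate = sum(a[k] * b[k] for k in a)
    return {
        "chained": inputs + intermediate,
        "multi_way": inputs,
        "intermediate": intermediate,
    }


if __name__ == "__main__":
    # Hypothetical key distribution: "k1" is a hot key in A and B.
    print(state_rows(["k1"] * 3 + ["k2"], ["k1"] * 4, ["k1", "k2"]))
```

In this model the gap between the two plans is exactly the number of
intermediate matches, which is also the per-level statistic that is only known
at runtime.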

Best,
Ferenc




On Friday, May 2nd, 2025 at 14:47, Gustavo de Morais <gustavopg...@gmail.com> 
wrote:

> 
> 
> Hey everyone,
> 
> It'd be great to start voting next week. Let me know if there are further
> questions or feedback.
> 
> Thanks,
> Gustavo
> 
> On Wed, Apr 30, 2025 at 15:07, Gustavo de Morais <
> gustavopg...@gmail.com> wrote:
> 
> > Hey Arvid and David, thanks for the feedback!
> > 
> > The limitations are in the FLIP; I had just pasted the wrong link and have
> > now fixed it. Let me know if there are other incorrect links.
> > 
> > Yes, the idea of using statistics has potential, and I've also spent some
> > time on that. However, the precise statistics required here would be the
> > amount of intermediate state/matches for each level, and that is information
> > we only have at runtime, inside the operator. For that, we could look into
> > an adaptive multi-way join in a next iteration, where the user could set a
> > maximum amount of state they are willing to store. This has potential but
> > would be a topic for a follow-up FLIP; I added some information on that
> > under the rejected alternatives.
> > 
> > Kind regards,
> > Gustavo
> > 
> > On Mon, Apr 28, 2025 at 14:18, David Radley <
> > david_rad...@uk.ibm.com> wrote:
> > 
> > > Hi Gustavo,
> > > 
> > > This sounds like a great idea.
> > > I notice that the "Limitations" link <
> > > https://confluentinc.atlassian.net/wiki/spaces/FLINK/pages/4342875697/FLIP-516+Multi-Way+Join+Operator#Limitations>
> > > in the FLIP points outside of the document to something I do not have
> > > access to. Could you please include the limitations in the FLIP itself?
> > > 
> > > You mention that reordered binary joins might become less efficient when
> > > turned into a multi-way join. I wonder what the pros and cons are. Could
> > > we use statistics to decide whether to do a multi-way join? In that case
> > > we could have an enum configuration, something like:
> > > table.optimizer.join = binary-join, multi-join, auto.
> > > 
> > > Kind regards, David.
> > > 
> > > From: Arvid Heise ar...@apache.org
> > > Date: Monday, 28 April 2025 at 12:47
> > > To: dev@flink.apache.org dev@flink.apache.org
> > > Subject: [EXTERNAL] Re: [DISCUSS] FLIP-516: Multi-Way Join Operator
> > > Hi Gustavo,
> > > 
> > > the idea and approach LGTM. +1 to proceed.
> > > 
> > > Best,
> > > 
> > > Arvid
> > > 
> > > On Thu, Apr 24, 2025 at 4:58 PM Gustavo de Morais <gustavopg...@gmail.com>
> > > wrote:
> > > 
> > > > Hi everyone,
> > > > 
> > > > I'd like to propose FLIP-516: Multi-Way Join Operator [1] for discussion.
> > > > 
> > > > Chained non-temporal joins in Flink SQL often cause a "big state issue" due
> > > > to large intermediate results, impacting performance and stability. This
> > > > FLIP introduces a StreamingMultiJoinOperator to tackle this by joining
> > > > multiple inputs (that need to share a common key) simultaneously within one
> > > > operator.
> > > > 
> > > > The main goal is achieving zero intermediate state for these common join
> > > > patterns, significantly reducing state size. This initial version requires
> > > > a common partitioning key and focuses on INNER/LEFT joins, with plans for
> > > > future expansion. The operator is opt-in via
> > > > table.optimizer.multi-join.enabled (default false). A PR with the initial
> > > > version of the operator is available [2].
> > > > 
> > > > Happy to be contributing to this community, and looking forward to your
> > > > feedback and thoughts.
> > > > 
> > > > Kind regards,
> > > > Gustavo de Morais
> > > > 
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-516%3A+Multi-Way+Join+Operator
> > > > [2] https://github.com/apache/flink/pull/26313
> > > 
