Re: [DISCUSS] FLIP-579: LATERAL SNAPSHOT Join

Fabian Hueske Wed, 03 Jun 2026 10:24:28 -0700

Thanks Leonard,

I've added a section about operator metrics to the proposal [1].
If you have ideas for other useful metrics, please let me know and I'll add
them.


Best, Fabian

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=421958523#FLIP579:LATERALSNAPSHOTJoin-Metrics

Am Mi., 3. Juni 2026 um 10:18 Uhr schrieb Fabian Hueske <[email protected]
>:

> Thank you Hongshun for your feedback!
>
> You are right that restricting the probe-side input to append-only changes
> deviates from the existing event-time (and disabled proc-time) temporal
> table joins.
> I chose this restriction because processing time semantics do not work
> well with retractions.
> A probe record of +(k1, p1) could join against a build-side of (k1, v1),
> while a later arriving probe-side retraction -(k1, p1) would join against
> (k1, v2). The resulting rectraction record of -(k1, p1, v2) would not match
> the earlier insertion +(k1, p1, v1).
> The only changelog format that would work well would be upsert changes
> (+I, +UA, -D) with key-only deletes (under the assumption that the upsert
> key is still unique after the join).
>
> I would like to keep the default behavior of only accepting append-only
> inputs to prevent retraction mismatches.
> At the same time, the operator implementation only needs to be slightly
> adjusted to support arbitrary probe-side changes (if at all). The real
> change would be in the planner / rules.
> So allowing probe-side retraction input via a config option for
> power-users who know what they are doing is certainly possible.
>
> Do you think this should be part of the proposal or would you be fine to
> leave this as future work?
>
> Best, Fabian
>
> Am Mi., 3. Juni 2026 um 08:41 Uhr schrieb Leonard Xu <[email protected]>:
>
>> Hi Fabian,
>>
>> Thanks for the detailed and thoughtful reply, and especially for agreeing
>> to add operator metrics — that alone will significantly improve the
>> debuggability of the join in production.
>>
>>  (1) On CPU spike / probe-side buffering
>>
>>  +1 for your reasoning on both points — deferring InputSelectable until
>> the unaligned-checkpoint limitation is resolved, and leaving micro-batch
>> transition out of v1. Watermark alignment + probe-side scan-offset is the
>> principled clean approach, agreed.
>>
>>  (2) On Jark's backlog idea
>>
>> Between the two paths in your reply, I'd lean toward option (a) — having
>> the source connector emit a special WM (or a dedicated record attribute) at
>> the end of backlog — as the long-term direction for exact flip-point
>> semantics. It naturally fits what is almost certainly going to be the
>> dominant production scenario: CDC sources (mysql-cdc, mongodb-cdc,
>> postgres-cdc, ...) as the dimension build-side, all of which already have
>> an explicit "snapshot finished → binlog start" boundary internally.
>>
>>  My suggestion is to ship FLIP-579 as proposed and collect user feedback
>> on LATERAL SNAPSHOT Join from real production usage — that will give us
>> solid evidence to prioritize this follow-up direction.
>>
>>  (3) On the vote
>>
>> Once the metrics section is added and there are no further objections
>> from others, I'm fine to start the vote.
>>
>>
>> Best,
>> Leonard
>>
>>
>>
>> > 2026 6月 3 01:36，Fabian Hueske <[email protected]> 写道：
>> >
>> > Thanks for your valuable feedback Jark and Leonard!
>> >
>> > You are bringing up three of the tricky challenges that the new join
>> needs
>> > to deal with.
>> >
>> > (1) Jark: The build-side flip point is not exact
>> >
>> > This is correct. However, I would argue that a processing-time join does
>> > not have exact guarantees anyway and can only produce roughly
>> time-aligned
>> > results. WM alignment of build and probe-side input should help to keep
>> the
>> > alignment somewhat close. Of course, this does not mean that we
>> shouldn't
>> > try to give as good guarantees as possible.
>> >
>> > The primary mechanism for flipping from LOAD to JOIN phase is the
>> > build-side watermark crossing a configured point in time. Watermarks are
>> > used to track progress and completeness in Flink. Using them as a
>> condition
>> > to switch from LOAD to JOIN phase, means that the build-side received at
>> > least all changes up to that point in time. There might have been
>> changes
>> > with later timestamps as well. These could be buffered on the side to
>> have
>> > a stricter FLIP point, but IMO this additional data should be tolerable
>> > under proc-time semantics.
>> >
>> > If the build-side input becomes stale, the processing idle timeout flip
>> > condition gets applied. The assumption is that the build-side source is
>> > currently exhausted and all data was consumed but the WM didn't progress
>> > far enough to exceed the flip point. In this case, we want to flip and
>> > start the regular JOIN phase.
>> >
>> > For the use case of sources with an exact flip point, users would need
>> to
>> > know the timestamp of the last backlog record (or compute it if they
>> know
>> > roughly how long it takes to scan the backlog if it is computed
>> on-demand).
>> > I agree, this is not very practical.
>> > I can think of two options
>> >  a) the source connector emits a special WM when it reaches the end of
>> the
>> > backlog. This would not require changes to the join operator but to the
>> > source connectors.
>> >  b) The design of the SNAPSHOT function has the
>> `load_completed_condition`
>> > which is an extension point to add logic to determine the flip point.
>> >
>> >
>> > (2) Jark: Buffering probe-side during LOAD phase
>> >
>> > I think this is very similar to Leonard's point about "LOAD phase
>> > backpressure" and also closely related to Leonard's point about
>> "Flip-point
>> > CPU spike".
>> >
>> > This is indeed a potential problem. If used without care, the probe-side
>> > state might grow very large. Before talking about possible ways to
>> address
>> > this problem, let me explain how I think that the join would be used.
>> >
>> > A very common (maybe the most common?) use case should be to initialize
>> the
>> > build-side input up to time t_b and then start processing the probe-side
>> > input from time t_p (with t_p = t_b, or slightly less than t_b)
>> onwards. WM
>> > alignment would help to roughly align build-side and probe-side inputs
>> > (although not being perfectly aligned like the event-time join)
>> > Initializing the build-side up to t_b and starting consuming the
>> probe-side
>> > from t_p with (t_p << t_b) would mean that the first probe-side records
>> are
>> > joined with much later versions of the build-side.
>> >
>> > The first scenario (t_p = t_b) can be controlled with WM alignment and
>> > scan-offset table hints on the probe-side input. Since the WMs of the
>> > build-side input would be less than the WMs of the probe-side input, the
>> > probe-side input should be throttled until the build-side caught up
>> (which
>> > should be close the the flip point).
>> > Other scenarios (including the t_p << t_b scenario) would benefit from
>> an
>> > idea [1] that is described in the future work section of the FLIP. That
>> > mechanism would also be based on WM alignment and would need some
>> > collaboration from the build-side source operator to indicate
>> completeness
>> > of the backlog.
>> >
>> > Using the InputSelectable interface is an idea that we also looked into
>> (as
>> > Gustavo already pointed out). Unfortunately, it is incompatible with
>> > unaligned checkpoints and there are no other streaming operators that
>> > implement the interface. I haven't looked in depth at the current
>> > limitations, but if some of these would be resolved, it might be
>> possible
>> > to later extend the join operator. It might even work with relaxed
>> > guarantees because we don't need to fully block the input but just
>> throttle
>> > it such that less probe-side data needs to be buffered.
>> >
>> > The idea of limiting the size of the probe-side buffer with a config
>> like
>> > `max-buffer-size` sounds interesting. However, I'm not sure if applying
>> > backpressure would really work because we still need to consume the
>> > build-side to be able to reach the flip point and selective
>> backpressure is
>> > not possible without InputSelectible.
>> >
>> > An earlier draft of the proposal described eager joining (now moved to
>> the
>> > future work section [2]). The idea is that a probe-side record would be
>> > directly joined when a match was present or received during the LOAD
>> phase.
>> > After joining it wouldn't be put into state. This would of course
>> > significantly reduce the state size and solved the issue of CPU spikes
>> > during the transition but come at the cost of hard-to-explain semantics
>> > (the current semantics are rather simple, we collect until the flip
>> point
>> > and join against that).
>> > Also, the idea of eager joining was developed when the join operator was
>> > still restricted to FK-PK joins (single build-side record per join key).
>> > The design generalized this restriction to arbitrary joins which means
>> > there might be more build-side matches for a probe-side record such that
>> > the presence of a single join match (of a possibly earlier version) does
>> > not guarantee completeness anymore. That's why the idea of eager joining
>> > was discarded for now.
>> >
>> >
>> > (3) Leonard: Flip-point CPU spikes
>> >
>> > This is also a very valid concern. I would argue that the primary
>> mechanism
>> > to address this point should be to reduce the amount of buffered
>> probe-side
>> > records (see point (2) above).
>> >
>> > I also thought about your idea to micro-batch the draining. In the
>> design,
>> > the transition join is triggered per-key by event-time timers that also
>> > emit the "current" probe-side WM downstream. If we want to continue
>> using
>> > this mechanism, we would need to schedule multiple timers and probably
>> use
>> > some clever mechanism to gradually advance probe-side WMs while still
>> > consuming records. There could be a separate "TRANSITION" phase during
>> > which we still append to the probe-buffer but use the mechanism for
>> atomic
>> > build-side updates. However, this would significantly complicate the
>> design
>> > affecting the control flow and recovery. Hence, I would first try to
>> > address this issue by reducing the probe-side state.
>> >
>> > If you have better ideas for how to use micro-batching during flip
>> > transition, I'm very open to exploring those.
>> >
>> > Leonard also brought up a very important point about metrics!
>> > I will add a section on operator metrics that will help to understand
>> the
>> > state of the operator.
>> >
>> > Please let me know your thoughts!
>> >
>> > Best, Fabian
>> >
>> > [1]
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=421958523#FLIP579:LATERALSNAPSHOTJoin-ReduceBufferingofProbe-SideviaBuild-SideWatermarkSuppression
>> > [2]
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=421958523#FLIP579:LATERALSNAPSHOTJoin-EagerjoinmodeduringLOADphase
>> >
>> >
>> > Am Mo., 1. Juni 2026 um 16:44 Uhr schrieb Leonard Xu <[email protected]
>> >:
>> >
>> >> Hi Fabian
>> >>
>> >> Thanks for driving this FLIP and your kind patience. The motivation is
>> >> spot-on, and the LOAD→JOIN two-phase design is the right structural
>> fix for
>> >> the FLINK-19830 initialization problem. Overall direction +1 from my
>> side.
>> >>
>> >> Besides Jark’s idea about backlog and InputSelectable which may need
>> more
>> >> prerequisites, I’ve two concerns about current proposal:
>> >>
>> >> 1. LOAD phase backpressure. The FLIP assumes "seconds to a few minutes"
>> >> for build-side init, but nothing enforces it. Large build-side tables
>> >> (e.g., 50M rows) + fast probe streams → unbuffered probe-side state
>> >> explosion. Should we add a config  like max-buffer-size that applies
>> >> backpressure when exceeded or some metrics about buffer, rather than
>> >> silently piling up records?
>> >>
>> >> 2. Flip-point CPU spike. Joining all buffered probe records ×
>> build-side
>> >> state in one shot differs fundamentally from event-time join's
>> incremental
>> >> watermark-batched emission. In the worst case this could cause a
>> >> TaskManager CPU spike and downstream shock. Worth considering
>> micro-batch
>> >> draining during flip transition?
>> >>
>> >> Looking forward to your thoughts.
>> >>
>> >> Best,
>> >> Leonard
>> >>
>> >>> 2026 6月 1 16:02，Fabian Hueske <[email protected]> 写道：
>> >>>
>> >>> Hi Leonard,
>> >>>
>> >>> Sorry, missed your email and already started the vote.
>> >>> Let me put it on hold for now and continue discussing the proposal.
>> >>>
>> >>> Looking forward to your comments,
>> >>> Fabian
>> >>>
>> >>> Am Mo., 1. Juni 2026 um 09:56 Uhr schrieb Leonard Xu <
>> [email protected]
>> >>> :
>> >>>
>> >>>> @Fabian Thanks for driving this FLIP, sorry for late reply due to my
>> >>>> personal reason that I shouldn’t miss such an important FLIP.
>> >>>>
>> >>>> I’m reviewing the FLIP and will try to finish it today, could you
>> kindly
>> >>>> wait one minute to start the vote?
>> >>>>
>> >>>> And sorry for interrupt your plan again.
>> >>>>
>> >>>> Best,
>> >>>> Leonard
>> >>>>
>> >>>>> 2026 6月 1 15:51，Fabian Hueske <[email protected]> 写道：
>> >>>>>
>> >>>>> Thanks everyone for your comments on the FLIP.
>> >>>>> I will start the vote.
>> >>>>>
>> >>>>> Best, Fabian
>> >>>>>
>> >>>>> Am Do., 28. Mai 2026 um 20:13 Uhr schrieb David Anderson <
>> >>>>> [email protected]>:
>> >>>>>
>> >>>>>> Fabian,
>> >>>>>>
>> >>>>>>> So, I don't think that we should buffer unmatched probe-side
>> records
>> >>>>>> beyond
>> >>>>>> the flip point.
>> >>>>>>
>> >>>>>> Thanks for explaining your reasoning. Makes sense to me.
>> >>>>>>
>> >>>>>> David
>> >>>>>>
>> >>>>>> On Thu, May 28, 2026 at 6:55 PM Fabian Hueske <[email protected]>
>> >>>> wrote:
>> >>>>>>
>> >>>>>>> Hi Xingcan,
>> >>>>>>>
>> >>>>>>> Thanks for your comments on the FLIP!
>> >>>>>>>
>> >>>>>>> The join's behavior when starting from a savepoint is indeed an
>> >>>> important
>> >>>>>>> aspect to consider and the problem of a rapidly advancing
>> dimension
>> >>>>>>> (build-side) table is of course real.
>> >>>>>>>
>> >>>>>>> I would argue that watermark alignment should significantly reduce
>> >> the
>> >>>>>>> impact of this.
>> >>>>>>> If enabled, sources align their consumption based on their current
>> >>>>>>> watermark such that the (presumably much smaller) build-side
>> source
>> >>>> would
>> >>>>>>> be slowed down to the event-time progress of the probe-side.
>> >>>>>>> While watermark alignment is not an "exact" mechanism, the
>> semantics
>> >> of
>> >>>>>> the
>> >>>>>>> new processing-time join also do not guarantee "exact" results.
>> >>>>>>> At the same time, alignment should ensure that build and
>> probe-side
>> >> are
>> >>>>>>> roughly aligned in event-time (without the strict guarantees that
>> the
>> >>>>>>> event-time temporal table join provides).
>> >>>>>>>
>> >>>>>>> However, I really like your idea of starting in event-time mode
>> and
>> >>>>>>> flipping to processing-time after the initialization duration
>> passed.
>> >>>>>>> I'm not sure if it would fully address the problem you described.
>> As
>> >>>> you
>> >>>>>>> said, users would need to be able to reconfigure the flip-point
>> and
>> >> I'm
>> >>>>>> not
>> >>>>>>> sure if there's a good mechanism for this yet.
>> >>>>>>> But it might have some other properties that would be beneficial,
>> so
>> >>>> I'll
>> >>>>>>> think about that.
>> >>>>>>>
>> >>>>>>> Best,
>> >>>>>>> Fabian
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Am Do., 28. Mai 2026 um 18:21 Uhr schrieb Fabian Hueske <
>> >>>>>> [email protected]
>> >>>>>>>> :
>> >>>>>>>
>> >>>>>>>> Thanks for your feedback David!
>> >>>>>>>>
>> >>>>>>>>> One question: If I understand correctly, during the JOIN phase
>> of
>> >> an
>> >>>>>>>> INNER
>> >>>>>>>> join, if the desired build-side record is missing, nothing will
>> be
>> >>>>>>> emitted
>> >>>>>>>> for the unmatched probe-side record. For an INNER join, I can
>> >> imagine
>> >>>>>>>> wanting to buffer unmatched probe-side records, expecting the
>> build
>> >>>>>> side
>> >>>>>>>> will arrive soon. What's your thinking there?
>> >>>>>>>>
>> >>>>>>>> Your understanding is correct. If a probe-side record arrives
>> during
>> >>>>>> LOAD
>> >>>>>>>> phase but no matching build-side record is received,
>> >>>>>>>> the probe-side record would be discarded without being joined
>> during
>> >>>>>> the
>> >>>>>>>> transition from LOAD to JOIN.
>> >>>>>>>>
>> >>>>>>>> I would argue that users that want to prevent this, would need to
>> >>>>>>>> configure a longer initialization time.
>> >>>>>>>> IMO, dropping unmatched probe records is not a "bad" property of
>> >> INNER
>> >>>>>>>> joins but an essential part of their semantics. It might even be
>> >>>>>> desired
>> >>>>>>> by
>> >>>>>>>> some users.
>> >>>>>>>> If we would buffer probe-side records for INNER joins beyond the
>> >>>>>>>> transition point, we:
>> >>>>>>>> * would have different behaviors for INNER and LEFT joins
>> >>>>>>>> * could not start to emit probe-side watermarks as long as there
>> are
>> >>>>>>> still
>> >>>>>>>> probe-side records buffered (or at least not advance past them
>> >> without
>> >>>>>>>> emitting late data at a later point of time)
>> >>>>>>>> * would either need another config knob to specify when to
>> "really"
>> >>>>>> clean
>> >>>>>>>> up the probe-side state or keep such unmatched records forever in
>> >>>> state
>> >>>>>>> (we
>> >>>>>>>> could also use state TTL...)
>> >>>>>>>>
>> >>>>>>>> So, I don't think that we should buffer unmatched probe-side
>> records
>> >>>>>>>> beyond the flip point.
>> >>>>>>>>
>> >>>>>>>> Best, Fabian
>> >>>>>>>>
>> >>>>>>>> Am Do., 28. Mai 2026 um 17:05 Uhr schrieb Xingcan Cui <
>> >>>>>>> [email protected]
>> >>>>>>>>> :
>> >>>>>>>>
>> >>>>>>>>> Hi Fabian,
>> >>>>>>>>>
>> >>>>>>>>> Thanks for this FLIP! The two-phase design is excellent for
>> >> avoiding
>> >>>>>>>>> early-joining bugs while maintaining low-latency processing-time
>> >>>>>>>>> semantics.
>> >>>>>>>>>
>> >>>>>>>>> After thinking more about the proposal, I'd like to point out an
>> >> edge
>> >>>>>>> case
>> >>>>>>>>> related to the initialization phase or recovery after prolonged
>> >>>>>> downtime
>> >>>>>>>>> (for example, when a job has been down for a day). While a
>> >>>>>>> processing-time
>> >>>>>>>>> join works well for live streaming, where results can reasonably
>> >>>>>> depend
>> >>>>>>> on
>> >>>>>>>>> the immediate arrival order of live data, it does not work as
>> well
>> >>>> for
>> >>>>>>>>> catch-up scenarios.
>> >>>>>>>>>
>> >>>>>>>>> Currently, if a job initializes or restores from a checkpoint
>> >> after a
>> >>>>>>> long
>> >>>>>>>>> downtime, the operator resumes directly in the processing-time
>> join
>> >>>>>>> phase.
>> >>>>>>>>> During catch-up, however, the natural chronological arrival
>> order
>> >> of
>> >>>>>> the
>> >>>>>>>>> live data is completely lost. As a result, these replayed fact
>> >>>> records
>> >>>>>>> are
>> >>>>>>>>> evaluated against the current machine time and may blindly join
>> >> with
>> >>>>>> the
>> >>>>>>>>> rapidly advancing "current" dimension snapshot, rather than the
>> >>>>>>> historical
>> >>>>>>>>> versions they were originally supposed to match.
>> >>>>>>>>>
>> >>>>>>>>> To handle this edge case, could we consider:
>> >>>>>>>>>
>> >>>>>>>>> 1. changing the first phase into an event-time join phase, and
>> >>>>>>>>>
>> >>>>>>>>> 2. allowing the operator to switch back to the first phase
>> after a
>> >>>>>>>>> restart?
>> >>>>>>>>>
>> >>>>>>>>> For example, users could configure a timestamp threshold. Before
>> >> the
>> >>>>>>>>> watermark reaches that point, the operator would run as an
>> >> event-time
>> >>>>>>>>> versioned join to safely process the catch-up phase through
>> >> watermark
>> >>>>>>>>> alignment. Once the watermark passes the threshold, the operator
>> >>>> could
>> >>>>>>>>> purge the old multi-version state and seamlessly transition
>> back to
>> >>>>>> the
>> >>>>>>>>> pure processing-time join phase for live traffic.
>> >>>>>>>>>
>> >>>>>>>>> After a job restart, users could either update the target
>> timestamp
>> >>>> to
>> >>>>>>>>> reset the operator back into the event-time phase, or leave it
>> >>>>>> unchanged
>> >>>>>>>>> to
>> >>>>>>>>> continue operating in the processing-time phase.
>> >>>>>>>>>
>> >>>>>>>>> I completely understand that this would introduce significant
>> >>>>>> complexity
>> >>>>>>>>> to
>> >>>>>>>>> the operator's state management and lifecycle, so this is only a
>> >>>>>>> tentative
>> >>>>>>>>> proposal to explore whether it might be worth considering for
>> the
>> >>>>>>>>> long-term
>> >>>>>>>>> robustness of the design.
>> >>>>>>>>>
>> >>>>>>>>> Best,
>> >>>>>>>>>
>> >>>>>>>>> Xingcan
>> >>>>>>>>>
>> >>>>>>>>> On Thu, May 28, 2026 at 8:17 AM David Anderson <
>> >> [email protected]
>> >>>>>
>> >>>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> I'm quite enthusiastic about this. I want to thank Fabian for
>> >>>>>> putting
>> >>>>>>>>>> together such a well-crafted FLIP. And I look forward to
>> updating
>> >>>>>> the
>> >>>>>>>>>> awkward educational content this FLIP will make obsolete.
>> >>>>>>>>>>
>> >>>>>>>>>> To my mind, the syntax expresses the semantics of this join
>> rather
>> >>>>>>> well.
>> >>>>>>>>>>
>> >>>>>>>>>> Until now, developers using event-time temporal joins sometimes
>> >>>>>>>>> resorted to
>> >>>>>>>>>> doing weird things with watermarks to handle a build side
>> that's
>> >>>>>>> mostly
>> >>>>>>>>>> idle; this lateral snapshot join is clearly better -- not to
>> >> mention
>> >>>>>>> the
>> >>>>>>>>>> added bonus of pre-loading the build table.
>> >>>>>>>>>>
>> >>>>>>>>>> One question: If I understand correctly, during the JOIN phase
>> of
>> >> an
>> >>>>>>>>> INNER
>> >>>>>>>>>> join, if the desired build-side record is missing, nothing
>> will be
>> >>>>>>>>> emitted
>> >>>>>>>>>> for the unmatched probe-side record. For an INNER join, I can
>> >>>>>> imagine
>> >>>>>>>>>> wanting to buffer unmatched probe-side records, expecting the
>> >> build
>> >>>>>>> side
>> >>>>>>>>>> will arrive soon. What's your thinking there?
>> >>>>>>>>>>
>> >>>>>>>>>> David
>> >>>>>>>>>>
>> >>>>>>>>>> On Wed, May 27, 2026 at 12:44 PM Fabian Hueske <
>> >> [email protected]>
>> >>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Thanks Gustavo and Timo for the positive feedback!
>> >>>>>>>>>>>
>> >>>>>>>>>>> I'd like to bump this thread up to collect more feedback.
>> >>>>>>>>>>> If there are no more responses, I will start a vote on this
>> FLIP
>> >>>>>>> next
>> >>>>>>>>>>> Monday, June 1st.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Best, Fabian
>> >>>>>>>>>>>
>> >>>>>>>>>>> Am Do., 21. Mai 2026 um 12:15 Uhr schrieb Timo Walther <
>> >>>>>>>>>> [email protected]
>> >>>>>>>>>>>> :
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi Fabian,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> thanks for proposing this FLIP. I agree that this join is
>> super
>> >>>>>>>>> common,
>> >>>>>>>>>>>> after talking to many people at conferences, I could imagine
>> it
>> >>>>>>>>> will be
>> >>>>>>>>>>>> one of the most used kinds of joins going forward.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Tightly coupling it with watermarks fits both from a
>> semantical
>> >>>>>>>>> point
>> >>>>>>>>>> of
>> >>>>>>>>>>>> view but also with other efforts such as FLIP-558
>> (Improvements
>> >>>>>> to
>> >>>>>>>>>>>> SinkUpsertMaterializer and changelog disorder) [1]. In the
>> near
>> >>>>>>>>> future,
>> >>>>>>>>>>>> we should work on more automated watermarking to power these
>> >>>>>>>>>>>> watermark-based operators, but this is an orthogonal effort.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Overall I'm strongly +1 on this. Also +1 on the syntax
>> >>>>>>> improvements
>> >>>>>>>>> for
>> >>>>>>>>>>>> lateral table functions by dropping the TABLE() wrapper.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Cheers,
>> >>>>>>>>>>>> Timo
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> [1]
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-558%3A+Improvements+to+SinkUpsertMaterializer+and+changelog+disorder
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On 18.05.26 11:47, Gustavo de Morais wrote:
>> >>>>>>>>>>>>> Hi Fabian,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> In general a strong +1 for the feature, without getting into
>> >>>>>> the
>> >>>>>>>>>>> details
>> >>>>>>>>>>>> of
>> >>>>>>>>>>>>> the FLIP yet. This is a missing feature for years and I'm
>> >>>>>> happy
>> >>>>>>>>> that
>> >>>>>>>>>>>> we're
>> >>>>>>>>>>>>> putting the time to address this - while also getting rid of
>> >>>>>>> some
>> >>>>>>>>> of
>> >>>>>>>>>>> the
>> >>>>>>>>>>>>> hard restrictions we had. Thanks!
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Kind regards,
>> >>>>>>>>>>>>> Gustavo
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Fri, 15 May 2026 at 16:39, Fabian Hueske <
>> >>>>>> [email protected]
>> >>>>>>>>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Hi everyone,
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> I'd like to start a discussion on FLIP-579: LATERAL
>> SNAPSHOT
>> >>>>>>> Join
>> >>>>>>>>>> [1].
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Enriching a stream with data from a (slowly changing)
>> dynamic
>> >>>>>>>>> table
>> >>>>>>>>>>> is a
>> >>>>>>>>>>>>>> super common use case.
>> >>>>>>>>>>>>>> Flink SQL features Temporal Joins [2] to address these use
>> >>>>>>> cases.
>> >>>>>>>>>>>>>> However, SQL users can only use the event-time variant
>> which
>> >>>>>>> has
>> >>>>>>>>>> many
>> >>>>>>>>>>>>>> limitations (heavy dependency on frequent WM updates on
>> both
>> >>>>>>>>> inputs,
>> >>>>>>>>>>>>>> build-side table requires a PK, the join predicate must
>> >>>>>> include
>> >>>>>>>>> the
>> >>>>>>>>>>>>>> build-side PK, etc).
>> >>>>>>>>>>>>>> The processing-time temporal join is disabled (due to
>> >>>>>>> build-side
>> >>>>>>>>>>>>>> initialization issues [3]) and temporal table function
>> joins
>> >>>>>>> are
>> >>>>>>>>>>>>>> only available in Table API.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> FLIP-579 proposes a new temporal join operator that
>> operates
>> >>>>>> in
>> >>>>>>>>>>>>>> processing-time and addresses the limitations of the
>> existing
>> >>>>>>>>>>>>>> implementations:
>> >>>>>>>>>>>>>> * initialization of the build-side before joining
>> >>>>>>>>>>>>>> * no requirement of continuous, frequent build-side WMs
>> >>>>>> (after
>> >>>>>>>>> the
>> >>>>>>>>>>>>>> initialization completed)
>> >>>>>>>>>>>>>> * no requirement for a PK on the build-side
>> >>>>>>>>>>>>>> * table function-based syntax [4] via a built-in SNAPSHOT
>> >>>>>>>>> function
>> >>>>>>>>>>>>>> (proposed in FLIP-517 [4])
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Looking forward to your feedback.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Best,
>> >>>>>>>>>>>>>> Fabian
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> [1]
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-579%3A+LATERAL+SNAPSHOT+Join
>> >>>>>>>>>>>>>> [2]
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sql/queries/joins/#temporal-joins
>> >>>>>>>>>>>>>> [3] https://issues.apache.org/jira/browse/FLINK-19830
>> >>>>>>>>>>>>>> [4]
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sql/queries/joins/#temporal-table-function-join
>> >>>>>>>>>>>>>> [5]
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-517%3A+Better+Handling+of+Dynamic+Table+Primitives+with+PTFs#FLIP517:BetterHandlingofDynamicTablePrimitiveswithPTFs-SNAPSHOTfortemporaljoins
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>>>
>> >>
>> >>
>>
>>

Re: [DISCUSS] FLIP-579: LATERAL SNAPSHOT Join

Reply via email to