Hi Xingcan and Xuyang,

Thanks so much for the feedback - it was very helpful!

*> 1. The current output stream of a time interval outer join is an
append-only stream. This change will make it a potential retractable
stream. I'm not sure if the planner supports a dynamic output type like
that. Could you add this part to your design doc?*


   - Yes, enabling early firing on time interval outer joins can emit
   retractions when previously emitted rows are updated or invalidated by
   later matches. I’ve updated the proposal (Planner Awareness
   
<https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.y5w17oloacws>
   and Changes in FlinkChangelogModeInferenceProgram
   
<https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z6qdwrvtgn4u>)
   to clarify that the stream might switch from append-only to a
   retract/upsert stream. Let me know if anything is missing.


*> 2. What's the use case when the downstream components need to get the
early fired results regularly?*


   - The new INTERVAL option (in addition to DELAY) allows periodic updates
   (e.g., every 10 minutes) after the initial delay. This captures how results
   evolve over time, similar to Apache Beam’s “Repeatedly” option.


*> 3. The time interval join operator itself is not quite efficient when
the state becomes large. Will there be any extra overhead after introducing
this feature?*

   - Early fire does introduce some overhead by potentially emitting
   partial matches multiple times with retraction (avoiding duplicate outputs
   though). However, if it’s disabled, there is no additional cost. Most users
   find the performance trade-off acceptable for the real-time insights it
   provides.


*> 1. Currently, there are some configs related to early firing available
to users: `table.exec.emit.early-fire.en**abled` and
`table.exec.emit.early-fire.de <http://table.exec.emit.early-fire.de>**lay`.
Although their documentation states that they are only applicable to the
Window operator, it seems possible that they may also be relevant in the
context of this FLIP. Otherwise, having different early firing behaviors
for different operators could confuse users.*

   - +1 on unifying early-fire behaviors to avoid confusion. I’ve added a
   section
   
<https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.rr0i3gmdjt4q>
in
   the proposal highlighting that we should align hint-based interval join
   configurations with the existing table.exec.emit.* settings. Suggestions
   on how to make the unification are welcome! We plan to extend early firing
   to window joins via hints in a future FLIP.


*> 2. The design of `time_mode` is excellent. Similar to point 1, perhaps
we can introduce it to other window-related operators in the future.> 3.
You need to modify the FlinkChangelogModeInferenceProgram to ensure that
downstream nodes of interval joins with early firing enabled are aware of
retract or upsert messages.*

   - We agree that time_mode could be introduced to other window-based
   operators down the road. We also want to support early fire for
   window join. Also, thanks for highlighting
   FlinkChangelogModeInferenceProgram! I added the code change on it
   
<https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z6qdwrvtgn4u>
   in the proposal.


Thanks again for your time and feedback! I’ve updated the proposal with
these points. Please let me know if there’s anything else I should address.

Best,
Weiqing


On Mon, Jan 6, 2025 at 6:32 PM Xuyang <xyzhong...@163.com> wrote:

> Hi, Weiqing. Thank you for drafting this FLIP. I have a few questions:
>
> 1. Currently, there are some configs related to early firing available to
> users: `table.exec.emit.early-fire.enabled` and
>
> `table.exec.emit.early-fire.delay`. Although their documentation states
> that they are only applicable to the Window operator,
>
> it seems possible that they may also be relevant in the context of this
> FLIP. Otherwise, having different early firing behaviors
>
> for different operators could confuse users.
>
> 2. The design of `time_mode` is excellent. Similar to point 1, perhaps we
> can introduce it to other window-related operators
>
> in the future.
>
> 3. You need to modify the FlinkChangelogModeInferenceProgram to ensure
> that downstream nodes of interval joins with
>
> early firing enabled are aware of retract or upsert messages.
>
>
>
>
> --
>
>     Best!
>     Xuyang
>
>
>
>
>
> At 2025-01-07 06:35:51, "Xingcan Cui" <xingc...@gmail.com> wrote:
> >Hi Weiqing,
> >
> >Thanks for the proposal. IMO, adding early fire for time interval outer
> >joins is feasible overall. I have a few questions.
> >
> >1. The current output stream of a time interval outer join is an
> >append-only stream. This change will make it a potential retractable
> >stream. I'm not sure if the planner supports a dynamic output type like
> >that. Could you add this part to your design doc?
> >2. What's the use case when the downstream components need to get the
> early
> >fired results regularly?
> >3. The time interval join operator itself is not quite efficient when the
> >state becomes large. Will there be any extra overhead after introducing
> >this feature?
> >
> >Thanks,
> >Xingcan
> >
> >On Mon, Jan 6, 2025 at 4:11 PM Weiqing Yang <yangweiqing...@gmail.com>
> >wrote:
> >
> >> Hi all,
> >>
> >> Just a gentle reminder regarding the proposal I shared on early fire
> >> support for Flink SQL interval joins. I’d greatly appreciate your
> feedback
> >> or suggestions.
> >>
> >> Here’s the link to the proposal document: Link
> >> <
> >>
> https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z7bl0h2hwkph
> >> >
> >>
> >> Thank you!
> >>
> >> Best,
> >> Weiqing
> >>
> >> On Sun, Dec 22, 2024 at 11:19 PM Weiqing Yang <yangweiqing...@gmail.com
> >
> >> wrote:
> >>
> >> > Hi all,
> >> >
> >> > I’d like to initiate a discussion about introducing early fire support
> >> for
> >> > Flink SQL interval joins.
> >> >
> >> > *Motivation*
> >> > In many streaming applications, particularly real-time analytics and
> >> > monitoring systems, it is valuable to obtain partial results earlier
> >> rather
> >> > than waiting for full join conditions to finalize. For Flink SQL
> interval
> >> > joins, results are typically delayed until watermarks ensure no more
> >> > matches can occur. This delay can be challenging for scenarios that
> >> require
> >> > fast feedback. Early fire support addresses this by emitting
> intermediate
> >> > results speculatively and using retractions or updates to maintain
> >> eventual
> >> > consistency and ensure correctness.
> >> >
> >> > Here’s the proposal document: Link
> >> > <
> >>
> https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z7bl0h2hwkph
> >> >
> >> >
> >> > Your feedback and ideas are welcome to refine this feature.
> >> >
> >> > Thanks,
> >> > Weiqing
> >> >
> >>
>

Reply via email to