Weiqing, Thanks, this is a great addition to Flink SQL. Also instead of controlling and configuring through Flink configs unlike the older window aggregation, hints seems to be a much better approach. This enables a customizable early fire behavior for individual interval joins.
Couple of questions: 1. Does the *early fire* emit an output every earlyFireInterval time or will it be a one time output emission and another output emitted at the end of the interval? 2. Are there plans to support *late fire *similar to the window aggregations in later FLIPs? Regards Venkata krishnan On Wed, Jan 15, 2025 at 6:16 PM Weiqing Yang <yangweiqing...@gmail.com> wrote: > Thanks for reviewing, Xuyang! > > Xingcan (@xingc...@gmail.com) – do you have any concerns? > > If no further objections arise from anyone, I’ll proceed to mark FLIP as > ready for voting. > > Best regards, > Weiqing > > On Tue, Jan 14, 2025 at 9:06 PM Xuyang <xyzhong...@163.com> wrote: > > > LGTM overall. Thanks for updating. I have no problem and +1 for this > > feature. > > > > > > > > > > -- > > > > Best! > > Xuyang > > > > > > > > > > > > 在 2025-01-15 12:33:16,"Weiqing Yang" <yangweiqing...@gmail.com> 写道: > > >Hi Xuyang, > > > > > >Thank you for your detailed feedback! I’ve updated the proposal doc > > >accordingly. Please feel free to take another look and let me know if > you > > >have any further thoughts or suggestions. > > > > > >Best regards, > > >Weiqing > > > > > >On Mon, Jan 13, 2025 at 3:50 AM Xuyang <xyzhong...@163.com> wrote: > > > > > >> Hi, Weiqing. > > >> > > >> After reading the new FLIP, I have no issues with the part `public > > >> interface`. I only have some questions regarding > > >> > > >> the details in the Proposed Changes section. > > >> > > >> Regarding the ModifyKind and UpdateKind of the IntervalJoin node, > IIUC: > > >> > > >> - When early firing is enabled, the UpdateKind of the IntervalJoin can > > be > > >> either ONLY_UPDATE_AFTER or > > >> > > >> degrade to BEFORE_AND_AFTER, depending entirely on the requirements of > > the > > >> sink. And the ModifyKind is always ALL. > > >> > > >> - When early firing is disabled, the UpdateKind of the IntervalJoin is > > >> NONE, and the ModifyKind is INSERT. > > >> > > >> - Nevertheless, whether early firing is enabled or disabled, the > > >> IntervalJoin should always require its input to keep > > >> > > >> ModifyKind with INSERT_ONLY and UpdateKind with NONE. > > >> > > >> > > >> > > >> > > >> -- > > >> > > >> Best! > > >> Xuyang > > >> > > >> > > >> > > >> > > >> > > >> At 2025-01-09 15:30:44, "Weiqing Yang" <yangweiqing...@gmail.com> > > wrote: > > >> >Hi Xingcan and Xuyang, > > >> > > > >> >Thanks so much for the feedback - it was very helpful! > > >> > > > >> >*> 1. The current output stream of a time interval outer join is an > > >> >append-only stream. This change will make it a potential retractable > > >> >stream. I'm not sure if the planner supports a dynamic output type > like > > >> >that. Could you add this part to your design doc?* > > >> > > > >> > > > >> > - Yes, enabling early firing on time interval outer joins can emit > > >> > retractions when previously emitted rows are updated or > invalidated > > by > > >> > later matches. I’ve updated the proposal (Planner Awareness > > >> > < > > >> > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.y5w17oloacws__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-WAM59Os$ > > >> > > > >> > and Changes in FlinkChangelogModeInferenceProgram > > >> > < > > >> > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.z6qdwrvtgn4u__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-Y5SiJXB$ > > >> >) > > >> > to clarify that the stream might switch from append-only to a > > >> > retract/upsert stream. Let me know if anything is missing. > > >> > > > >> > > > >> >*> 2. What's the use case when the downstream components need to get > > the > > >> >early fired results regularly?* > > >> > > > >> > > > >> > - The new INTERVAL option (in addition to DELAY) allows periodic > > >> updates > > >> > (e.g., every 10 minutes) after the initial delay. This captures > how > > >> results > > >> > evolve over time, similar to Apache Beam’s “Repeatedly” option. > > >> > > > >> > > > >> >*> 3. The time interval join operator itself is not quite efficient > > when > > >> >the state becomes large. Will there be any extra overhead after > > >> introducing > > >> >this feature?* > > >> > > > >> > - Early fire does introduce some overhead by potentially emitting > > >> > partial matches multiple times with retraction (avoiding duplicate > > >> outputs > > >> > though). However, if it’s disabled, there is no additional cost. > > Most > > >> users > > >> > find the performance trade-off acceptable for the real-time > > insights it > > >> > provides. > > >> > > > >> > > > >> >*> 1. Currently, there are some configs related to early firing > > available > > >> >to users: `table.exec.emit.early-fire.en**abled` and > > >> >`table.exec.emit.early-fire.de < > https://urldefense.com/v3/__http://table.exec.emit.early-fire.de__;!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-dmB0JB7$ > > >> >**lay`. > > >> >Although their documentation states that they are only applicable to > > the > > >> >Window operator, it seems possible that they may also be relevant in > > the > > >> >context of this FLIP. Otherwise, having different early firing > > behaviors > > >> >for different operators could confuse users.* > > >> > > > >> > - +1 on unifying early-fire behaviors to avoid confusion. I’ve > > added a > > >> > section > > >> > < > > >> > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.rr0i3gmdjt4q__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-cs7f7P2$ > > >> > > > >> >in > > >> > the proposal highlighting that we should align hint-based interval > > join > > >> > configurations with the existing table.exec.emit.* settings. > > >> Suggestions > > >> > on how to make the unification are welcome! We plan to extend > early > > >> firing > > >> > to window joins via hints in a future FLIP. > > >> > > > >> > > > >> >*> 2. The design of `time_mode` is excellent. Similar to point 1, > > perhaps > > >> >we can introduce it to other window-related operators in the future.> > > 3. > > >> >You need to modify the FlinkChangelogModeInferenceProgram to ensure > > that > > >> >downstream nodes of interval joins with early firing enabled are > aware > > of > > >> >retract or upsert messages.* > > >> > > > >> > - We agree that time_mode could be introduced to other > window-based > > >> > operators down the road. We also want to support early fire for > > >> > window join. Also, thanks for highlighting > > >> > FlinkChangelogModeInferenceProgram! I added the code change on it > > >> > < > > >> > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.z6qdwrvtgn4u__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-Y5SiJXB$ > > >> > > > >> > in the proposal. > > >> > > > >> > > > >> >Thanks again for your time and feedback! I’ve updated the proposal > with > > >> >these points. Please let me know if there’s anything else I should > > >> address. > > >> > > > >> >Best, > > >> >Weiqing > > >> > > > >> > > > >> >On Mon, Jan 6, 2025 at 6:32 PM Xuyang <xyzhong...@163.com> wrote: > > >> > > > >> >> Hi, Weiqing. Thank you for drafting this FLIP. I have a few > > questions: > > >> >> > > >> >> 1. Currently, there are some configs related to early firing > > available > > >> to > > >> >> users: `table.exec.emit.early-fire.enabled` and > > >> >> > > >> >> `table.exec.emit.early-fire.delay`. Although their documentation > > states > > >> >> that they are only applicable to the Window operator, > > >> >> > > >> >> it seems possible that they may also be relevant in the context of > > this > > >> >> FLIP. Otherwise, having different early firing behaviors > > >> >> > > >> >> for different operators could confuse users. > > >> >> > > >> >> 2. The design of `time_mode` is excellent. Similar to point 1, > > perhaps > > >> we > > >> >> can introduce it to other window-related operators > > >> >> > > >> >> in the future. > > >> >> > > >> >> 3. You need to modify the FlinkChangelogModeInferenceProgram to > > ensure > > >> >> that downstream nodes of interval joins with > > >> >> > > >> >> early firing enabled are aware of retract or upsert messages. > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> -- > > >> >> > > >> >> Best! > > >> >> Xuyang > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> At 2025-01-07 06:35:51, "Xingcan Cui" <xingc...@gmail.com> wrote: > > >> >> >Hi Weiqing, > > >> >> > > > >> >> >Thanks for the proposal. IMO, adding early fire for time interval > > outer > > >> >> >joins is feasible overall. I have a few questions. > > >> >> > > > >> >> >1. The current output stream of a time interval outer join is an > > >> >> >append-only stream. This change will make it a potential > retractable > > >> >> >stream. I'm not sure if the planner supports a dynamic output type > > like > > >> >> >that. Could you add this part to your design doc? > > >> >> >2. What's the use case when the downstream components need to get > > the > > >> >> early > > >> >> >fired results regularly? > > >> >> >3. The time interval join operator itself is not quite efficient > > when > > >> the > > >> >> >state becomes large. Will there be any extra overhead after > > introducing > > >> >> >this feature? > > >> >> > > > >> >> >Thanks, > > >> >> >Xingcan > > >> >> > > > >> >> >On Mon, Jan 6, 2025 at 4:11 PM Weiqing Yang < > > yangweiqing...@gmail.com> > > >> >> >wrote: > > >> >> > > > >> >> >> Hi all, > > >> >> >> > > >> >> >> Just a gentle reminder regarding the proposal I shared on early > > fire > > >> >> >> support for Flink SQL interval joins. I’d greatly appreciate > your > > >> >> feedback > > >> >> >> or suggestions. > > >> >> >> > > >> >> >> Here’s the link to the proposal document: Link > > >> >> >> < > > >> >> >> > > >> >> > > >> > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.z7bl0h2hwkph__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-ZfECmzD$ > > >> >> >> > > > >> >> >> > > >> >> >> Thank you! > > >> >> >> > > >> >> >> Best, > > >> >> >> Weiqing > > >> >> >> > > >> >> >> On Sun, Dec 22, 2024 at 11:19 PM Weiqing Yang < > > >> yangweiqing...@gmail.com > > >> >> > > > >> >> >> wrote: > > >> >> >> > > >> >> >> > Hi all, > > >> >> >> > > > >> >> >> > I’d like to initiate a discussion about introducing early fire > > >> support > > >> >> >> for > > >> >> >> > Flink SQL interval joins. > > >> >> >> > > > >> >> >> > *Motivation* > > >> >> >> > In many streaming applications, particularly real-time > analytics > > >> and > > >> >> >> > monitoring systems, it is valuable to obtain partial results > > >> earlier > > >> >> >> rather > > >> >> >> > than waiting for full join conditions to finalize. For Flink > SQL > > >> >> interval > > >> >> >> > joins, results are typically delayed until watermarks ensure > no > > >> more > > >> >> >> > matches can occur. This delay can be challenging for scenarios > > that > > >> >> >> require > > >> >> >> > fast feedback. Early fire support addresses this by emitting > > >> >> intermediate > > >> >> >> > results speculatively and using retractions or updates to > > maintain > > >> >> >> eventual > > >> >> >> > consistency and ensure correctness. > > >> >> >> > > > >> >> >> > Here’s the proposal document: Link > > >> >> >> > < > > >> >> >> > > >> >> > > >> > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.z7bl0h2hwkph__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-ZfECmzD$ > > >> >> >> > > > >> >> >> > > > >> >> >> > Your feedback and ideas are welcome to refine this feature. > > >> >> >> > > > >> >> >> > Thanks, > > >> >> >> > Weiqing > > >> >> >> > > > >> >> >> > > >> >> > > >> > > >