Hi Venkata, Thanks for your feedback!
If the optional `interval` in the proposal is enabled, early-fire outputs will occur repeatedly at every earlyFireInterval, not just once. After that interval concludes, there will also be a final emission. We haven’t included a late-fire mechanism for interval joins in this FLIP, but it’s certainly something we can consider for future efforts! Best regards, Weiqing On Thu, Jan 16, 2025 at 3:27 PM Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote: > Weiqing, > > Thanks, this is a great addition to Flink SQL. Also instead of controlling > and configuring through Flink configs unlike the older window aggregation, > hints seems to be a much better approach. This enables a customizable early > fire behavior for individual interval joins. > > Couple of questions: > > 1. Does the *early fire* emit an output every earlyFireInterval time or > will it be a one time output emission and another output emitted at the end > of the interval? > 2. Are there plans to support *late fire *similar to the window > aggregations in later FLIPs? > > Regards > Venkata krishnan > > > On Wed, Jan 15, 2025 at 6:16 PM Weiqing Yang <yangweiqing...@gmail.com> > wrote: > > > Thanks for reviewing, Xuyang! > > > > Xingcan (@xingc...@gmail.com) – do you have any concerns? > > > > If no further objections arise from anyone, I’ll proceed to mark FLIP as > > ready for voting. > > > > Best regards, > > Weiqing > > > > On Tue, Jan 14, 2025 at 9:06 PM Xuyang <xyzhong...@163.com> wrote: > > > > > LGTM overall. Thanks for updating. I have no problem and +1 for this > > > feature. > > > > > > > > > > > > > > > -- > > > > > > Best! > > > Xuyang > > > > > > > > > > > > > > > > > > 在 2025-01-15 12:33:16,"Weiqing Yang" <yangweiqing...@gmail.com> 写道: > > > >Hi Xuyang, > > > > > > > >Thank you for your detailed feedback! I’ve updated the proposal doc > > > >accordingly. Please feel free to take another look and let me know if > > you > > > >have any further thoughts or suggestions. > > > > > > > >Best regards, > > > >Weiqing > > > > > > > >On Mon, Jan 13, 2025 at 3:50 AM Xuyang <xyzhong...@163.com> wrote: > > > > > > > >> Hi, Weiqing. > > > >> > > > >> After reading the new FLIP, I have no issues with the part `public > > > >> interface`. I only have some questions regarding > > > >> > > > >> the details in the Proposed Changes section. > > > >> > > > >> Regarding the ModifyKind and UpdateKind of the IntervalJoin node, > > IIUC: > > > >> > > > >> - When early firing is enabled, the UpdateKind of the IntervalJoin > can > > > be > > > >> either ONLY_UPDATE_AFTER or > > > >> > > > >> degrade to BEFORE_AND_AFTER, depending entirely on the requirements > of > > > the > > > >> sink. And the ModifyKind is always ALL. > > > >> > > > >> - When early firing is disabled, the UpdateKind of the IntervalJoin > is > > > >> NONE, and the ModifyKind is INSERT. > > > >> > > > >> - Nevertheless, whether early firing is enabled or disabled, the > > > >> IntervalJoin should always require its input to keep > > > >> > > > >> ModifyKind with INSERT_ONLY and UpdateKind with NONE. > > > >> > > > >> > > > >> > > > >> > > > >> -- > > > >> > > > >> Best! > > > >> Xuyang > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> At 2025-01-09 15:30:44, "Weiqing Yang" <yangweiqing...@gmail.com> > > > wrote: > > > >> >Hi Xingcan and Xuyang, > > > >> > > > > >> >Thanks so much for the feedback - it was very helpful! > > > >> > > > > >> >*> 1. The current output stream of a time interval outer join is an > > > >> >append-only stream. This change will make it a potential > retractable > > > >> >stream. I'm not sure if the planner supports a dynamic output type > > like > > > >> >that. Could you add this part to your design doc?* > > > >> > > > > >> > > > > >> > - Yes, enabling early firing on time interval outer joins can > emit > > > >> > retractions when previously emitted rows are updated or > > invalidated > > > by > > > >> > later matches. I’ve updated the proposal (Planner Awareness > > > >> > < > > > >> > > > > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.y5w17oloacws__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-WAM59Os$ > > > >> > > > > >> > and Changes in FlinkChangelogModeInferenceProgram > > > >> > < > > > >> > > > > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.z6qdwrvtgn4u__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-Y5SiJXB$ > > > >> >) > > > >> > to clarify that the stream might switch from append-only to a > > > >> > retract/upsert stream. Let me know if anything is missing. > > > >> > > > > >> > > > > >> >*> 2. What's the use case when the downstream components need to > get > > > the > > > >> >early fired results regularly?* > > > >> > > > > >> > > > > >> > - The new INTERVAL option (in addition to DELAY) allows periodic > > > >> updates > > > >> > (e.g., every 10 minutes) after the initial delay. This captures > > how > > > >> results > > > >> > evolve over time, similar to Apache Beam’s “Repeatedly” option. > > > >> > > > > >> > > > > >> >*> 3. The time interval join operator itself is not quite efficient > > > when > > > >> >the state becomes large. Will there be any extra overhead after > > > >> introducing > > > >> >this feature?* > > > >> > > > > >> > - Early fire does introduce some overhead by potentially > emitting > > > >> > partial matches multiple times with retraction (avoiding > duplicate > > > >> outputs > > > >> > though). However, if it’s disabled, there is no additional cost. > > > Most > > > >> users > > > >> > find the performance trade-off acceptable for the real-time > > > insights it > > > >> > provides. > > > >> > > > > >> > > > > >> >*> 1. Currently, there are some configs related to early firing > > > available > > > >> >to users: `table.exec.emit.early-fire.en**abled` and > > > >> >`table.exec.emit.early-fire.de < > > > https://urldefense.com/v3/__http://table.exec.emit.early-fire.de__;!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-dmB0JB7$ > > > >> >**lay`. > > > >> >Although their documentation states that they are only applicable > to > > > the > > > >> >Window operator, it seems possible that they may also be relevant > in > > > the > > > >> >context of this FLIP. Otherwise, having different early firing > > > behaviors > > > >> >for different operators could confuse users.* > > > >> > > > > >> > - +1 on unifying early-fire behaviors to avoid confusion. I’ve > > > added a > > > >> > section > > > >> > < > > > >> > > > > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.rr0i3gmdjt4q__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-cs7f7P2$ > > > >> > > > > >> >in > > > >> > the proposal highlighting that we should align hint-based > interval > > > join > > > >> > configurations with the existing table.exec.emit.* settings. > > > >> Suggestions > > > >> > on how to make the unification are welcome! We plan to extend > > early > > > >> firing > > > >> > to window joins via hints in a future FLIP. > > > >> > > > > >> > > > > >> >*> 2. The design of `time_mode` is excellent. Similar to point 1, > > > perhaps > > > >> >we can introduce it to other window-related operators in the > future.> > > > 3. > > > >> >You need to modify the FlinkChangelogModeInferenceProgram to ensure > > > that > > > >> >downstream nodes of interval joins with early firing enabled are > > aware > > > of > > > >> >retract or upsert messages.* > > > >> > > > > >> > - We agree that time_mode could be introduced to other > > window-based > > > >> > operators down the road. We also want to support early fire for > > > >> > window join. Also, thanks for highlighting > > > >> > FlinkChangelogModeInferenceProgram! I added the code change on > it > > > >> > < > > > >> > > > > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.z6qdwrvtgn4u__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-Y5SiJXB$ > > > >> > > > > >> > in the proposal. > > > >> > > > > >> > > > > >> >Thanks again for your time and feedback! I’ve updated the proposal > > with > > > >> >these points. Please let me know if there’s anything else I should > > > >> address. > > > >> > > > > >> >Best, > > > >> >Weiqing > > > >> > > > > >> > > > > >> >On Mon, Jan 6, 2025 at 6:32 PM Xuyang <xyzhong...@163.com> wrote: > > > >> > > > > >> >> Hi, Weiqing. Thank you for drafting this FLIP. I have a few > > > questions: > > > >> >> > > > >> >> 1. Currently, there are some configs related to early firing > > > available > > > >> to > > > >> >> users: `table.exec.emit.early-fire.enabled` and > > > >> >> > > > >> >> `table.exec.emit.early-fire.delay`. Although their documentation > > > states > > > >> >> that they are only applicable to the Window operator, > > > >> >> > > > >> >> it seems possible that they may also be relevant in the context > of > > > this > > > >> >> FLIP. Otherwise, having different early firing behaviors > > > >> >> > > > >> >> for different operators could confuse users. > > > >> >> > > > >> >> 2. The design of `time_mode` is excellent. Similar to point 1, > > > perhaps > > > >> we > > > >> >> can introduce it to other window-related operators > > > >> >> > > > >> >> in the future. > > > >> >> > > > >> >> 3. You need to modify the FlinkChangelogModeInferenceProgram to > > > ensure > > > >> >> that downstream nodes of interval joins with > > > >> >> > > > >> >> early firing enabled are aware of retract or upsert messages. > > > >> >> > > > >> >> > > > >> >> > > > >> >> > > > >> >> -- > > > >> >> > > > >> >> Best! > > > >> >> Xuyang > > > >> >> > > > >> >> > > > >> >> > > > >> >> > > > >> >> > > > >> >> At 2025-01-07 06:35:51, "Xingcan Cui" <xingc...@gmail.com> > wrote: > > > >> >> >Hi Weiqing, > > > >> >> > > > > >> >> >Thanks for the proposal. IMO, adding early fire for time > interval > > > outer > > > >> >> >joins is feasible overall. I have a few questions. > > > >> >> > > > > >> >> >1. The current output stream of a time interval outer join is an > > > >> >> >append-only stream. This change will make it a potential > > retractable > > > >> >> >stream. I'm not sure if the planner supports a dynamic output > type > > > like > > > >> >> >that. Could you add this part to your design doc? > > > >> >> >2. What's the use case when the downstream components need to > get > > > the > > > >> >> early > > > >> >> >fired results regularly? > > > >> >> >3. The time interval join operator itself is not quite efficient > > > when > > > >> the > > > >> >> >state becomes large. Will there be any extra overhead after > > > introducing > > > >> >> >this feature? > > > >> >> > > > > >> >> >Thanks, > > > >> >> >Xingcan > > > >> >> > > > > >> >> >On Mon, Jan 6, 2025 at 4:11 PM Weiqing Yang < > > > yangweiqing...@gmail.com> > > > >> >> >wrote: > > > >> >> > > > > >> >> >> Hi all, > > > >> >> >> > > > >> >> >> Just a gentle reminder regarding the proposal I shared on > early > > > fire > > > >> >> >> support for Flink SQL interval joins. I’d greatly appreciate > > your > > > >> >> feedback > > > >> >> >> or suggestions. > > > >> >> >> > > > >> >> >> Here’s the link to the proposal document: Link > > > >> >> >> < > > > >> >> >> > > > >> >> > > > >> > > > > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.z7bl0h2hwkph__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-ZfECmzD$ > > > >> >> >> > > > > >> >> >> > > > >> >> >> Thank you! > > > >> >> >> > > > >> >> >> Best, > > > >> >> >> Weiqing > > > >> >> >> > > > >> >> >> On Sun, Dec 22, 2024 at 11:19 PM Weiqing Yang < > > > >> yangweiqing...@gmail.com > > > >> >> > > > > >> >> >> wrote: > > > >> >> >> > > > >> >> >> > Hi all, > > > >> >> >> > > > > >> >> >> > I’d like to initiate a discussion about introducing early > fire > > > >> support > > > >> >> >> for > > > >> >> >> > Flink SQL interval joins. > > > >> >> >> > > > > >> >> >> > *Motivation* > > > >> >> >> > In many streaming applications, particularly real-time > > analytics > > > >> and > > > >> >> >> > monitoring systems, it is valuable to obtain partial results > > > >> earlier > > > >> >> >> rather > > > >> >> >> > than waiting for full join conditions to finalize. For Flink > > SQL > > > >> >> interval > > > >> >> >> > joins, results are typically delayed until watermarks ensure > > no > > > >> more > > > >> >> >> > matches can occur. This delay can be challenging for > scenarios > > > that > > > >> >> >> require > > > >> >> >> > fast feedback. Early fire support addresses this by emitting > > > >> >> intermediate > > > >> >> >> > results speculatively and using retractions or updates to > > > maintain > > > >> >> >> eventual > > > >> >> >> > consistency and ensure correctness. > > > >> >> >> > > > > >> >> >> > Here’s the proposal document: Link > > > >> >> >> > < > > > >> >> >> > > > >> >> > > > >> > > > > > > https://urldefense.com/v3/__https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0*heading=h.z7bl0h2hwkph__;Iw!!IKRxdwAv5BmarQ!amkTjCPG108LnMxlN_eVP1GHgJpGNcvNJWSNr3NMfIoj0hTe4LvEKnFk0_gDXV0W-hozAXm9Kxw9VrlRT3jQ-ZfECmzD$ > > > >> >> >> > > > > >> >> >> > > > > >> >> >> > Your feedback and ideas are welcome to refine this feature. > > > >> >> >> > > > > >> >> >> > Thanks, > > > >> >> >> > Weiqing > > > >> >> >> > > > > >> >> >> > > > >> >> > > > >> > > > > > >