Thanks for reviewing, Xuyang! Xingcan (@xingc...@gmail.com) – do you have any concerns?
If no further objections arise from anyone, I’ll proceed to mark FLIP as ready for voting. Best regards, Weiqing On Tue, Jan 14, 2025 at 9:06 PM Xuyang <xyzhong...@163.com> wrote: > LGTM overall. Thanks for updating. I have no problem and +1 for this > feature. > > > > > -- > > Best! > Xuyang > > > > > > 在 2025-01-15 12:33:16,"Weiqing Yang" <yangweiqing...@gmail.com> 写道: > >Hi Xuyang, > > > >Thank you for your detailed feedback! I’ve updated the proposal doc > >accordingly. Please feel free to take another look and let me know if you > >have any further thoughts or suggestions. > > > >Best regards, > >Weiqing > > > >On Mon, Jan 13, 2025 at 3:50 AM Xuyang <xyzhong...@163.com> wrote: > > > >> Hi, Weiqing. > >> > >> After reading the new FLIP, I have no issues with the part `public > >> interface`. I only have some questions regarding > >> > >> the details in the Proposed Changes section. > >> > >> Regarding the ModifyKind and UpdateKind of the IntervalJoin node, IIUC: > >> > >> - When early firing is enabled, the UpdateKind of the IntervalJoin can > be > >> either ONLY_UPDATE_AFTER or > >> > >> degrade to BEFORE_AND_AFTER, depending entirely on the requirements of > the > >> sink. And the ModifyKind is always ALL. > >> > >> - When early firing is disabled, the UpdateKind of the IntervalJoin is > >> NONE, and the ModifyKind is INSERT. > >> > >> - Nevertheless, whether early firing is enabled or disabled, the > >> IntervalJoin should always require its input to keep > >> > >> ModifyKind with INSERT_ONLY and UpdateKind with NONE. > >> > >> > >> > >> > >> -- > >> > >> Best! > >> Xuyang > >> > >> > >> > >> > >> > >> At 2025-01-09 15:30:44, "Weiqing Yang" <yangweiqing...@gmail.com> > wrote: > >> >Hi Xingcan and Xuyang, > >> > > >> >Thanks so much for the feedback - it was very helpful! > >> > > >> >*> 1. The current output stream of a time interval outer join is an > >> >append-only stream. This change will make it a potential retractable > >> >stream. I'm not sure if the planner supports a dynamic output type like > >> >that. Could you add this part to your design doc?* > >> > > >> > > >> > - Yes, enabling early firing on time interval outer joins can emit > >> > retractions when previously emitted rows are updated or invalidated > by > >> > later matches. I’ve updated the proposal (Planner Awareness > >> > < > >> > https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.y5w17oloacws > >> > > >> > and Changes in FlinkChangelogModeInferenceProgram > >> > < > >> > https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z6qdwrvtgn4u > >> >) > >> > to clarify that the stream might switch from append-only to a > >> > retract/upsert stream. Let me know if anything is missing. > >> > > >> > > >> >*> 2. What's the use case when the downstream components need to get > the > >> >early fired results regularly?* > >> > > >> > > >> > - The new INTERVAL option (in addition to DELAY) allows periodic > >> updates > >> > (e.g., every 10 minutes) after the initial delay. This captures how > >> results > >> > evolve over time, similar to Apache Beam’s “Repeatedly” option. > >> > > >> > > >> >*> 3. The time interval join operator itself is not quite efficient > when > >> >the state becomes large. Will there be any extra overhead after > >> introducing > >> >this feature?* > >> > > >> > - Early fire does introduce some overhead by potentially emitting > >> > partial matches multiple times with retraction (avoiding duplicate > >> outputs > >> > though). However, if it’s disabled, there is no additional cost. > Most > >> users > >> > find the performance trade-off acceptable for the real-time > insights it > >> > provides. > >> > > >> > > >> >*> 1. Currently, there are some configs related to early firing > available > >> >to users: `table.exec.emit.early-fire.en**abled` and > >> >`table.exec.emit.early-fire.de <http://table.exec.emit.early-fire.de > >> >**lay`. > >> >Although their documentation states that they are only applicable to > the > >> >Window operator, it seems possible that they may also be relevant in > the > >> >context of this FLIP. Otherwise, having different early firing > behaviors > >> >for different operators could confuse users.* > >> > > >> > - +1 on unifying early-fire behaviors to avoid confusion. I’ve > added a > >> > section > >> > < > >> > https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.rr0i3gmdjt4q > >> > > >> >in > >> > the proposal highlighting that we should align hint-based interval > join > >> > configurations with the existing table.exec.emit.* settings. > >> Suggestions > >> > on how to make the unification are welcome! We plan to extend early > >> firing > >> > to window joins via hints in a future FLIP. > >> > > >> > > >> >*> 2. The design of `time_mode` is excellent. Similar to point 1, > perhaps > >> >we can introduce it to other window-related operators in the future.> > 3. > >> >You need to modify the FlinkChangelogModeInferenceProgram to ensure > that > >> >downstream nodes of interval joins with early firing enabled are aware > of > >> >retract or upsert messages.* > >> > > >> > - We agree that time_mode could be introduced to other window-based > >> > operators down the road. We also want to support early fire for > >> > window join. Also, thanks for highlighting > >> > FlinkChangelogModeInferenceProgram! I added the code change on it > >> > < > >> > https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z6qdwrvtgn4u > >> > > >> > in the proposal. > >> > > >> > > >> >Thanks again for your time and feedback! I’ve updated the proposal with > >> >these points. Please let me know if there’s anything else I should > >> address. > >> > > >> >Best, > >> >Weiqing > >> > > >> > > >> >On Mon, Jan 6, 2025 at 6:32 PM Xuyang <xyzhong...@163.com> wrote: > >> > > >> >> Hi, Weiqing. Thank you for drafting this FLIP. I have a few > questions: > >> >> > >> >> 1. Currently, there are some configs related to early firing > available > >> to > >> >> users: `table.exec.emit.early-fire.enabled` and > >> >> > >> >> `table.exec.emit.early-fire.delay`. Although their documentation > states > >> >> that they are only applicable to the Window operator, > >> >> > >> >> it seems possible that they may also be relevant in the context of > this > >> >> FLIP. Otherwise, having different early firing behaviors > >> >> > >> >> for different operators could confuse users. > >> >> > >> >> 2. The design of `time_mode` is excellent. Similar to point 1, > perhaps > >> we > >> >> can introduce it to other window-related operators > >> >> > >> >> in the future. > >> >> > >> >> 3. You need to modify the FlinkChangelogModeInferenceProgram to > ensure > >> >> that downstream nodes of interval joins with > >> >> > >> >> early firing enabled are aware of retract or upsert messages. > >> >> > >> >> > >> >> > >> >> > >> >> -- > >> >> > >> >> Best! > >> >> Xuyang > >> >> > >> >> > >> >> > >> >> > >> >> > >> >> At 2025-01-07 06:35:51, "Xingcan Cui" <xingc...@gmail.com> wrote: > >> >> >Hi Weiqing, > >> >> > > >> >> >Thanks for the proposal. IMO, adding early fire for time interval > outer > >> >> >joins is feasible overall. I have a few questions. > >> >> > > >> >> >1. The current output stream of a time interval outer join is an > >> >> >append-only stream. This change will make it a potential retractable > >> >> >stream. I'm not sure if the planner supports a dynamic output type > like > >> >> >that. Could you add this part to your design doc? > >> >> >2. What's the use case when the downstream components need to get > the > >> >> early > >> >> >fired results regularly? > >> >> >3. The time interval join operator itself is not quite efficient > when > >> the > >> >> >state becomes large. Will there be any extra overhead after > introducing > >> >> >this feature? > >> >> > > >> >> >Thanks, > >> >> >Xingcan > >> >> > > >> >> >On Mon, Jan 6, 2025 at 4:11 PM Weiqing Yang < > yangweiqing...@gmail.com> > >> >> >wrote: > >> >> > > >> >> >> Hi all, > >> >> >> > >> >> >> Just a gentle reminder regarding the proposal I shared on early > fire > >> >> >> support for Flink SQL interval joins. I’d greatly appreciate your > >> >> feedback > >> >> >> or suggestions. > >> >> >> > >> >> >> Here’s the link to the proposal document: Link > >> >> >> < > >> >> >> > >> >> > >> > https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z7bl0h2hwkph > >> >> >> > > >> >> >> > >> >> >> Thank you! > >> >> >> > >> >> >> Best, > >> >> >> Weiqing > >> >> >> > >> >> >> On Sun, Dec 22, 2024 at 11:19 PM Weiqing Yang < > >> yangweiqing...@gmail.com > >> >> > > >> >> >> wrote: > >> >> >> > >> >> >> > Hi all, > >> >> >> > > >> >> >> > I’d like to initiate a discussion about introducing early fire > >> support > >> >> >> for > >> >> >> > Flink SQL interval joins. > >> >> >> > > >> >> >> > *Motivation* > >> >> >> > In many streaming applications, particularly real-time analytics > >> and > >> >> >> > monitoring systems, it is valuable to obtain partial results > >> earlier > >> >> >> rather > >> >> >> > than waiting for full join conditions to finalize. For Flink SQL > >> >> interval > >> >> >> > joins, results are typically delayed until watermarks ensure no > >> more > >> >> >> > matches can occur. This delay can be challenging for scenarios > that > >> >> >> require > >> >> >> > fast feedback. Early fire support addresses this by emitting > >> >> intermediate > >> >> >> > results speculatively and using retractions or updates to > maintain > >> >> >> eventual > >> >> >> > consistency and ensure correctness. > >> >> >> > > >> >> >> > Here’s the proposal document: Link > >> >> >> > < > >> >> >> > >> >> > >> > https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z7bl0h2hwkph > >> >> >> > > >> >> >> > > >> >> >> > Your feedback and ideas are welcome to refine this feature. > >> >> >> > > >> >> >> > Thanks, > >> >> >> > Weiqing > >> >> >> > > >> >> >> > >> >> > >> >