LGTM overall. Thanks for updating. I have no problem and +1 for this feature.
-- Best! Xuyang 在 2025-01-15 12:33:16,"Weiqing Yang" <yangweiqing...@gmail.com> 写道: >Hi Xuyang, > >Thank you for your detailed feedback! I’ve updated the proposal doc >accordingly. Please feel free to take another look and let me know if you >have any further thoughts or suggestions. > >Best regards, >Weiqing > >On Mon, Jan 13, 2025 at 3:50 AM Xuyang <xyzhong...@163.com> wrote: > >> Hi, Weiqing. >> >> After reading the new FLIP, I have no issues with the part `public >> interface`. I only have some questions regarding >> >> the details in the Proposed Changes section. >> >> Regarding the ModifyKind and UpdateKind of the IntervalJoin node, IIUC: >> >> - When early firing is enabled, the UpdateKind of the IntervalJoin can be >> either ONLY_UPDATE_AFTER or >> >> degrade to BEFORE_AND_AFTER, depending entirely on the requirements of the >> sink. And the ModifyKind is always ALL. >> >> - When early firing is disabled, the UpdateKind of the IntervalJoin is >> NONE, and the ModifyKind is INSERT. >> >> - Nevertheless, whether early firing is enabled or disabled, the >> IntervalJoin should always require its input to keep >> >> ModifyKind with INSERT_ONLY and UpdateKind with NONE. >> >> >> >> >> -- >> >> Best! >> Xuyang >> >> >> >> >> >> At 2025-01-09 15:30:44, "Weiqing Yang" <yangweiqing...@gmail.com> wrote: >> >Hi Xingcan and Xuyang, >> > >> >Thanks so much for the feedback - it was very helpful! >> > >> >*> 1. The current output stream of a time interval outer join is an >> >append-only stream. This change will make it a potential retractable >> >stream. I'm not sure if the planner supports a dynamic output type like >> >that. Could you add this part to your design doc?* >> > >> > >> > - Yes, enabling early firing on time interval outer joins can emit >> > retractions when previously emitted rows are updated or invalidated by >> > later matches. I’ve updated the proposal (Planner Awareness >> > < >> https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.y5w17oloacws >> > >> > and Changes in FlinkChangelogModeInferenceProgram >> > < >> https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z6qdwrvtgn4u >> >) >> > to clarify that the stream might switch from append-only to a >> > retract/upsert stream. Let me know if anything is missing. >> > >> > >> >*> 2. What's the use case when the downstream components need to get the >> >early fired results regularly?* >> > >> > >> > - The new INTERVAL option (in addition to DELAY) allows periodic >> updates >> > (e.g., every 10 minutes) after the initial delay. This captures how >> results >> > evolve over time, similar to Apache Beam’s “Repeatedly” option. >> > >> > >> >*> 3. The time interval join operator itself is not quite efficient when >> >the state becomes large. Will there be any extra overhead after >> introducing >> >this feature?* >> > >> > - Early fire does introduce some overhead by potentially emitting >> > partial matches multiple times with retraction (avoiding duplicate >> outputs >> > though). However, if it’s disabled, there is no additional cost. Most >> users >> > find the performance trade-off acceptable for the real-time insights it >> > provides. >> > >> > >> >*> 1. Currently, there are some configs related to early firing available >> >to users: `table.exec.emit.early-fire.en**abled` and >> >`table.exec.emit.early-fire.de <http://table.exec.emit.early-fire.de >> >**lay`. >> >Although their documentation states that they are only applicable to the >> >Window operator, it seems possible that they may also be relevant in the >> >context of this FLIP. Otherwise, having different early firing behaviors >> >for different operators could confuse users.* >> > >> > - +1 on unifying early-fire behaviors to avoid confusion. I’ve added a >> > section >> > < >> https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.rr0i3gmdjt4q >> > >> >in >> > the proposal highlighting that we should align hint-based interval join >> > configurations with the existing table.exec.emit.* settings. >> Suggestions >> > on how to make the unification are welcome! We plan to extend early >> firing >> > to window joins via hints in a future FLIP. >> > >> > >> >*> 2. The design of `time_mode` is excellent. Similar to point 1, perhaps >> >we can introduce it to other window-related operators in the future.> 3. >> >You need to modify the FlinkChangelogModeInferenceProgram to ensure that >> >downstream nodes of interval joins with early firing enabled are aware of >> >retract or upsert messages.* >> > >> > - We agree that time_mode could be introduced to other window-based >> > operators down the road. We also want to support early fire for >> > window join. Also, thanks for highlighting >> > FlinkChangelogModeInferenceProgram! I added the code change on it >> > < >> https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z6qdwrvtgn4u >> > >> > in the proposal. >> > >> > >> >Thanks again for your time and feedback! I’ve updated the proposal with >> >these points. Please let me know if there’s anything else I should >> address. >> > >> >Best, >> >Weiqing >> > >> > >> >On Mon, Jan 6, 2025 at 6:32 PM Xuyang <xyzhong...@163.com> wrote: >> > >> >> Hi, Weiqing. Thank you for drafting this FLIP. I have a few questions: >> >> >> >> 1. Currently, there are some configs related to early firing available >> to >> >> users: `table.exec.emit.early-fire.enabled` and >> >> >> >> `table.exec.emit.early-fire.delay`. Although their documentation states >> >> that they are only applicable to the Window operator, >> >> >> >> it seems possible that they may also be relevant in the context of this >> >> FLIP. Otherwise, having different early firing behaviors >> >> >> >> for different operators could confuse users. >> >> >> >> 2. The design of `time_mode` is excellent. Similar to point 1, perhaps >> we >> >> can introduce it to other window-related operators >> >> >> >> in the future. >> >> >> >> 3. You need to modify the FlinkChangelogModeInferenceProgram to ensure >> >> that downstream nodes of interval joins with >> >> >> >> early firing enabled are aware of retract or upsert messages. >> >> >> >> >> >> >> >> >> >> -- >> >> >> >> Best! >> >> Xuyang >> >> >> >> >> >> >> >> >> >> >> >> At 2025-01-07 06:35:51, "Xingcan Cui" <xingc...@gmail.com> wrote: >> >> >Hi Weiqing, >> >> > >> >> >Thanks for the proposal. IMO, adding early fire for time interval outer >> >> >joins is feasible overall. I have a few questions. >> >> > >> >> >1. The current output stream of a time interval outer join is an >> >> >append-only stream. This change will make it a potential retractable >> >> >stream. I'm not sure if the planner supports a dynamic output type like >> >> >that. Could you add this part to your design doc? >> >> >2. What's the use case when the downstream components need to get the >> >> early >> >> >fired results regularly? >> >> >3. The time interval join operator itself is not quite efficient when >> the >> >> >state becomes large. Will there be any extra overhead after introducing >> >> >this feature? >> >> > >> >> >Thanks, >> >> >Xingcan >> >> > >> >> >On Mon, Jan 6, 2025 at 4:11 PM Weiqing Yang <yangweiqing...@gmail.com> >> >> >wrote: >> >> > >> >> >> Hi all, >> >> >> >> >> >> Just a gentle reminder regarding the proposal I shared on early fire >> >> >> support for Flink SQL interval joins. I’d greatly appreciate your >> >> feedback >> >> >> or suggestions. >> >> >> >> >> >> Here’s the link to the proposal document: Link >> >> >> < >> >> >> >> >> >> https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z7bl0h2hwkph >> >> >> > >> >> >> >> >> >> Thank you! >> >> >> >> >> >> Best, >> >> >> Weiqing >> >> >> >> >> >> On Sun, Dec 22, 2024 at 11:19 PM Weiqing Yang < >> yangweiqing...@gmail.com >> >> > >> >> >> wrote: >> >> >> >> >> >> > Hi all, >> >> >> > >> >> >> > I’d like to initiate a discussion about introducing early fire >> support >> >> >> for >> >> >> > Flink SQL interval joins. >> >> >> > >> >> >> > *Motivation* >> >> >> > In many streaming applications, particularly real-time analytics >> and >> >> >> > monitoring systems, it is valuable to obtain partial results >> earlier >> >> >> rather >> >> >> > than waiting for full join conditions to finalize. For Flink SQL >> >> interval >> >> >> > joins, results are typically delayed until watermarks ensure no >> more >> >> >> > matches can occur. This delay can be challenging for scenarios that >> >> >> require >> >> >> > fast feedback. Early fire support addresses this by emitting >> >> intermediate >> >> >> > results speculatively and using retractions or updates to maintain >> >> >> eventual >> >> >> > consistency and ensure correctness. >> >> >> > >> >> >> > Here’s the proposal document: Link >> >> >> > < >> >> >> >> >> >> https://docs.google.com/document/d/1YobpNdnvzSsceniVj4NZWi445gb1-54Rox-D7nPArZo/edit?tab=t.0#heading=h.z7bl0h2hwkph >> >> >> > >> >> >> > >> >> >> > Your feedback and ideas are welcome to refine this feature. >> >> >> > >> >> >> > Thanks, >> >> >> > Weiqing >> >> >> > >> >> >> >> >> >>