Dear Dong,

I have thoroughly reviewed the proposal for FLIP-331 and believe it would be 
a valuable addition to Flink. However, I do have a few questions that I would 
like to discuss:


1. The FLIP-331 proposed the EndOfStreamWindows that is implemented by 
TimeWindow with maxTimestamp = (Long.MAX_VALUE - 1), which naturally 
supports WindowedStream and AllWindowedStream to process all records 
belonging to a key in a 'global' window under both STREAMING and BATCH 
runtime execution mode. 


However, besides coGroup and keyBy().aggregate(), other operators on 
WindowedStream and AllWindowedStream, such as join/reduce, etc, currently 
are still implemented based on WindowOperator.


In fact, these operators can also be implemented without using WindowOperator 
to prevent additional WindowAssigner#assignWindows or triggerContext#onElement 
invocation cost. Will there be plans to support these operators in the future?


2. When using EndOfStreamWindows, upstream operators no longer support 
checkpointing. This limit may be too strict, especially when dealing with 
bounded data in streaming runtime execution mode, where checkpointing 
can still be useful.

3. The proposal mentions that if a transformation has isOutputOnEOF == true, 
the 
operator as well as its upstream operators will be executed in 'batch mode' 
with 
checkpointing disabled. I would like to understand the specific implications of 
this
'batch mode' and if there are any other changes associated with it? 

Additionally, I am curious to know if this 'batch mode' conflicts with the 'mix 
mode'

described in FLIP-327. While the coGroup and keyBy().aggregate() operators on 
EndOfStreamWindows have the attribute 'isInternalSorterSupported' set to true, 
indicating support for the 'mixed mode', they also have isOutputOnEOF set to 
true, 
which suggests that the upstream operators should be executed in 'batch mode'. 
Will the 'mixed mode' be ignored when in 'batch mode'? I would appreciate any 
clarification on this matter.

Thank you for taking the time to consider my feedback. I eagerly await your 
response.

Best regards,

Wencong Liu











At 2023-09-01 11:21:47, "Dong Lin" <lindon...@gmail.com> wrote:
>Hi all,
>
>Jinhao (cc'ed) and I are opening this thread to discuss FLIP-331: Support
>EndOfStreamWindows and isOutputOnEOF operator attribute to optimize task
>deployment. The design doc can be found at
>https://cwiki.apache.org/confluence/display/FLINK/FLIP-331%3A+Support+EndOfStreamWindows++and+isOutputOnEOF+operator+attribute+to+optimize+task+deployment
>.
>
>This FLIP introduces isOutputOnEOF operator attribute that JobManager can
>use to optimize task deployment and resource utilization. In addition, it
>also adds EndOfStreamWindows that can be used with the DataStream APIs
>(e.g. cogroup, aggregate) to significantly increase throughput and reduce
>resource utilization.
>
>We would greatly appreciate any comment or feedback you may have on this
>proposal.
>
>Cheers,
>Dong

Reply via email to