Hi all,

Thank you for the valuable input.

Based on the current discussion, the minibatch join will follow the three existing options: 'table.exec.mini-batch.enabled', 'table.exec.mini-batch.allow-latency', and 'table.exec.mini-batch.size'. The compaction within the minibatch that was raised in the discussion can be addressed in a future FLIP.

Does anyone have further questions regarding this FLIP? If there are no more comments, I would like to open a voting thread at 12 a.m. UTC+8 on January 19th.

> On January 10, 2024, at 21:23, shuai xu <xushuai...@gmail.com> wrote:
>
> Hi devs,
>
> I'd like to start a discussion on FLIP-415: Introduce a new join operator to
> support minibatch [1].
>
> Currently, cascading joins in Flink suffer from record amplification. Every
> record a join operator receives triggers the join process. However, if a +I
> record and a -D record match, they can be folded, which avoids two join
> operations. In addition, a -U/+U pair may produce 4 output records in an
> outer join, two of which are redundant.
>
> To address this issue, this FLIP introduces a new
> MiniBatchStreamingJoinOperator that processes records in batches, which
> reduces the number of redundant output messages and avoids unnecessary join
> processing. A new option is added to control the operator so that existing
> jobs are not affected.
>
> Please find more details in the FLIP wiki document [1]. Looking
> forward to your feedback.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-415%3A+Introduce+a+new+join+operator+to+support+minibatch
>
> Best,
> Xu Shuai

Best,
Xu Shuai