Hi, Shuai.
This is a valuable addition to the current AsyncLookupJoin, and I’m
generally in favor of it.
I have one question. Why do we need to introduce additional parameters
to control KEY_ORDERED and ALLOW_UNORDERED? In other words,
what scenarios require allowing users to perform completely unordered
async lookup joins in the presence of an upsert key?
--
Best!
Xuyang
At 2025-04-11 10:39:46, "shuai xu" <[email protected]> wrote:
>Hi all,
>
>This FLIP will primarily focus on the implementation within the table module.
>As for support in the DataStream API, it will be addressed in a separate FLIP.
>
>> On April 8, 2025, at 09:57, shuai xu <[email protected]> wrote:
>>
>> Hi devs,
>>
>> I'd like to start a discussion on FLIP-519: Introduce async lookup key
>> ordered mode[1].
>>
>> The Flink system currently supports both record-level ordered and
>> unordered output modes for asynchronous lookup joins. However, it does
>> not guarantee the processing order of records sharing the same key.
>>
>> As highlighted in [2], there are two key requirements for enhancing
>> async I/O operations:
>> 1. Ensuring the processing order of records with the same key is a
>> common requirement in DataStream.
>> 2. Sequential processing of records sharing the same upsertKey when
>> performing a lookup join in Flink SQL is essential for maintaining
>> correctness.
>>
>> This optimization aims to balance correctness and performance for
>> stateful streaming workloads. The FLIP introduces a new operator,
>> KeyedAsyncWaitOperator, to support this optimization. In addition, a
>> new option is added to control the behaviour and avoid affecting
>> existing jobs.
>>
>> Please find more details in the FLIP wiki document[1]. Looking forward
>> to your feedback.
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-519%3A++Introduce+async+lookup+key+ordered+mode
>> [2] https://lists.apache.org/thread/wczzjhw8g0jcbs8lw2jhtrkw858cmx5n
>>
>> Best,
>> Xu Shuai