Hi everyone, I've updated the FLIP[1] based on this discussion thread that we agree to have a single unified 'LOOKUP' hint and also a related part in FLIP-221[2] which is mainly for the necessity of the common table option 'lookup.async'.
The main updates are: 1. the new unified 'LOOKUP' hint, make retry support both on sync and async lookup 2. clarify the default choice of the planner for those connectors which have both sync and async lookup capabilities, and how to deal with the query hint 3. will add a followup issue to discuss whether to remove the 'lookup.async' option in HBase connector. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems [2] https://lists.apache.org/thread/1vokqdnnt01yycl7y1p74g556cc8yvtq Best, Lincoln Lee Lincoln Lee <lincoln.8...@gmail.com> 于2022年6月1日周三 16:03写道: > Hi Jingsong, > > There will be no change for connectors with only one capability (sync or > async). > > Query hint works in a best effort manner, so if users specifies a hint > with invalid option, the query plan keeps unchanged, e.g., use > LOOKUP('table'='customer', 'async'='true'), but backend lookup source only > implemented the sync lookup function, then the async lookup hint takes no > effect. > > For these connectors which can have both capabilities of async and sync > lookup, our advice for the connector developer is implementing both sync > and async interfaces if both capabilities have suitable use cases, and the > planner can decide which capability is the preferable one based on cost > model or maybe other mechanism (another use case is exactly what we're > discussing here, users can give the query hint), otherwise choose one > interface to implement. > > Also, this should be clarified for the lookup function related APIs. > > Best, > Lincoln Lee > > > Jingsong Li <jingsongl...@gmail.com> 于2022年6月1日周三 15:18写道: > >> Hi Lincoln, >> >> > It's better making decisions at the query level when a connector has >> both >> capabilities. >> >> Can you clarify the mechanism? >> - only sync connector: What connector developers should do >> - only async connector: What connector developers should do >> - both async and sync connector: What connector developers should do >> >> Best, >> Jingsong >> >> On Wed, Jun 1, 2022 at 2:29 PM Lincoln Lee <lincoln.8...@gmail.com> >> wrote: >> >> > Hi Jingsong, >> > >> > Thanks for your feedback! >> > >> > Yes, the existing HBase connector use an option 'lookup.async' to >> control >> > its lookup source implementations that exposed to the planner, however >> it's >> > a private option for the HBase connector only, so it will not affect the >> > common API. >> > >> > And as discussed in the mailing thread of FLIP-221[1], we got a >> consensus >> > that do not make it as a common option. It's better making decisions at >> the >> > query level when a connector has both capabilities. >> > >> > So if everything goes well, we should discuss it whether to deprecate >> > the 'lookup.async' >> > or not for HBase connector after the hint been done. >> > >> > This will be mentioned in the Compatibility part of this FLIP[2]. >> > >> > WDYT? >> > >> > [1]: https://lists.apache.org/thread/v76g8v1o9sjdho9kbzlgjyv38l2oynox >> > [2]: >> > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems?src=contextnavpagetreemode >> > >> > Best, >> > Lincoln Lee >> > >> > >> > Jingsong Li <jingsongl...@gmail.com> 于2022年6月1日周三 14:11写道: >> > >> > > Hi Lincoln, >> > > >> > > The unified lookup hint is what I want. >> > > >> > > And I like 'async'='true|false' option. >> > > >> > > But there is a compatibility issue, as I remember if async is >> currently >> > > controlled by connector, and this may also require some API changes? >> > > >> > > We need to have a clear story for the connector combined with this >> > option: >> > > - only sync connector >> > > - only async connector >> > > - both async and sync connector >> > > >> > > Best, >> > > Jingsong >> > > >> > > On Mon, May 30, 2022 at 3:55 PM Lincoln Lee <lincoln.8...@gmail.com> >> > > wrote: >> > > >> > > > Thanks Jark for your quick response and the consensus! >> > > > >> > > > And I will update the FLIP after Jingsong or other developers >> confirm >> > > that >> > > > there is no problem. >> > > > >> > > > Best, >> > > > Lincoln Lee >> > > > >> > > > >> > > > Jark Wu <imj...@gmail.com> 于2022年5月30日周一 15:49写道: >> > > > >> > > > > Thanks for the update. >> > > > > >> > > > > The unified lookup hint looks good to me. >> > > > > And thanks for explaining ALLOW_UNORDERED. >> > > > > >> > > > > Best, >> > > > > Jark >> > > > > >> > > > > On Mon, 30 May 2022 at 15:31, Lincoln Lee <lincoln.8...@gmail.com >> > >> > > > wrote: >> > > > > >> > > > > > Hi Jark & Jingsong, >> > > > > > >> > > > > > Thanks for your feedback! >> > > > > > >> > > > > > 1.) support retry on sync lookup >> > > > > > I also agree with supporting it, this will be useful for >> connectors >> > > > that >> > > > > > don't have asynchronous lookup implementations and can also >> solve >> > the >> > > > > ASYNC >> > > > > > non-target problem to some extent(because the retrying is >> blocking >> > > for >> > > > > sync >> > > > > > lookup, and may accumulate delay, but it maybe acceptable for >> the >> > > case >> > > > > that >> > > > > > most or all data want to do a delayed lookup). >> > > > > > >> > > > > > For the api perspective, we can do some unification. Let's >> think of >> > > the >> > > > > > whole user story for lookup join: >> > > > > > >> > > > > > ASYNC_LOOKUP vs SYNC_LOOKUP can share a common one: LOOKUP by >> > > > different >> > > > > > hint option values: 'async'='true|false' >> > > > > > >> > > > > > ASYNC_LOOKUP_MISS_RETRY vs SYNC_LOOKUP_MISS_RETRY can share the >> > > > > > LOOKUP_MISS_RETRY with hint option: 'miss-retry'='true|false' >> > > > > > >> > > > > > we can use one single hint LOOKUP with different hint options >> > > > > > ('async'='true|false', 'miss-retry'='true|false') to cover all >> > > related >> > > > > > functionalities. >> > > > > > Compared to multiple hints with different subsets of >> > functionality, a >> > > > > > single hint may be easier for users to understand and use, and >> > > specific >> > > > > > parameters can be quickly found through documentation >> > > > > > >> > > > > > the support matrix will be: >> > > > > > lookup support async retry >> > > > > > sync w/o retry N N >> > > > > > sync w/ retry N Y >> > > > > > async w/o retry Y N >> > > > > > async w/ retry Y Y >> > > > > > >> > > > > > and the available hint options for each mode: >> > > > > > mode support hint options >> > > > > > async async'='true' >> > > > > > 'output-mode'='ordere|allow-unordered' >> > > > > > 'capacity'='100' >> > > > > > 'timeout'='180s' >> > > > > > retry miss-retry'='true' >> > > > > > 'retry-strategy'='fixed-delay' >> > > > > > 'delay'='10s' >> > > > > > 'max-attempts'='3' >> > > > > > >> > > > > > >> > > > > > 2.) 'allow-unordered' vs 'unordered' for >> > > > > > 'table.exec.async-lookup.output-mode' >> > > > > > Yes, make it align with DataStream Api maybe more intuitive, but >> > > > there's >> > > > > > some difference in table layer that makes the 'allow-unordered' >> > > > > meaningful: >> > > > > > updates in the pipeline need to be processed in order, >> > > ALLOW_UNORDERED >> > > > > > means if users allow unordered result, it will attempt to use >> > > > > > AsyncDataStream.OutputMode.UNORDERED when it does not affect the >> > > > > > correctness of the result, otherwise ORDERED will be still used. >> > > > > > >> > > > > > Another choice is that when the user specifies unordered mode, >> > > planner >> > > > > > throws an error when it finds that it may affect correctness. >> But >> > > this >> > > > is >> > > > > > not user-friendly and is not consistent with the customary >> > treatment >> > > of >> > > > > > invalid query hints(best effort). >> > > > > > >> > > > > > I opened a pr https://github.com/apache/flink/pull/19759 for >> the >> > new >> > > > > > option >> > > > > > 'table.exec.async-lookup.output-mode' and also a discussion on >> > > > > FLINK-27625: >> > > > > > add query hint 'ASYNC_LOOKUP' for async lookup join(Since the >> > changes >> > > > > were >> > > > > > relatively minor, no new flip was created) >> > > > > > >> > > > > > If we can reach a consensus on the single unified hint, e.g., >> > LOOKUP, >> > > > > then >> > > > > > FLINK-27625 can be covered. >> > > > > > >> > > > > > WDYT? >> > > > > > >> > > > > > Best, >> > > > > > Lincoln Lee >> > > > > > >> > > > > > >> > > > > > Jark Wu <imj...@gmail.com> 于2022年5月27日周五 21:04写道: >> > > > > > >> > > > > > > Hi Lincoln, >> > > > > > > >> > > > > > > Delayed Dim Join is a frequently requested feature, it's >> exciting >> > > to >> > > > > see >> > > > > > > this feature is on the road. >> > > > > > > The FLIP looks good to me in general. I only left some minor >> > > > comments. >> > > > > > > >> > > > > > > 1) support retry for sync lookup >> > > > > > > I'm also fine with the idea proposed by Jingsong. But this >> > doesn't >> > > > > > conflict >> > > > > > > with the FLIP and can >> > > > > > > be future work. It would be great if we can determine the APIs >> > > first. >> > > > > > > >> > > > > > > 1) "allow-unordered" => "unordered" >> > > > > > > I would prefer the "unordered" output mode rather than >> > > > > "allow-unordered". >> > > > > > > Because this fully aligns with the DataStream behaviors and >> > avoids >> > > > > > > confusion on the differences. >> > > > > > > I understand the purpose that adding a "allow" prefix here, >> but I >> > > > think >> > > > > > the >> > > > > > > semantic is fine to just >> > > > > > > use "unordered" here. We didn't see any users confused about >> > > > > > > OutputMode#UNORDERED. >> > > > > > > >> > > > > > > Best, >> > > > > > > Jark >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > On Fri, 27 May 2022 at 12:58, Jingsong Li < >> > jingsongl...@gmail.com> >> > > > > > wrote: >> > > > > > > >> > > > > > > > Thanks Lincoln for your proposal. >> > > > > > > > >> > > > > > > > Take a look at `strategy: fixed-delay delay: duration, e.g., >> > 10s >> > > > > > > > max-attempts: integer, e.g., 3`. >> > > > > > > > >> > > > > > > > Are these options only for async? It looks like normal >> lookups >> > > work >> > > > > > too? >> > > > > > > > >> > > > > > > > One thing is: most of the lookup functions seem to be >> > synchronous >> > > > > now? >> > > > > > > > There are not so many asynchronous ones? >> > > > > > > > >> > > > > > > > Best, >> > > > > > > > Jingsong >> > > > > > > > >> > > > > > > > On Tue, May 24, 2022 at 11:48 AM Lincoln Lee < >> > > > lincoln.8...@gmail.com >> > > > > > >> > > > > > > > wrote: >> > > > > > > > >> > > > > > > > > Hi all, >> > > > > > > > > >> > > > > > > > > Considering the new common table option >> 'lookup.max-retries' >> > > > > proposed >> > > > > > > in >> > > > > > > > > FLIP-221[1] which is commonly used for exception handling >> in >> > > > > > connector >> > > > > > > > > implementation, we should clearly distinguish >> > > ASYNC_LOOKUP_RETRY >> > > > > from >> > > > > > > it >> > > > > > > > to >> > > > > > > > > avoid confusing users. >> > > > > > > > > >> > > > > > > > > To do so, the name ASYNC_LOOKUP_RETRY can change to >> > > > > > > > > ASYNC_LOOKUP_MISS_RETRY, and as the name implies, >> restrict >> > it >> > > to >> > > > > > > support >> > > > > > > > > retries only for lookup misses and no longer include >> > exceptions >> > > > > (for >> > > > > > > sql >> > > > > > > > > connectors, let the connector implementer decide how to >> > handle >> > > > > > > exceptions >> > > > > > > > > since there are various kinds of retryable exceptions and >> can >> > > not >> > > > > > retry >> > > > > > > > > ones). >> > > > > > > > > >> > > > > > > > > The FLIP[2] has been updated. >> > > > > > > > > >> > > > > > > > > [1] >> > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric >> > > > > > > > > [2] >> > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > Best, >> > > > > > > > > Lincoln Lee >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > Lincoln Lee <lincoln.8...@gmail.com> 于2022年5月19日周四 >> 18:24写道: >> > > > > > > > > >> > > > > > > > > > Dear Flink developers, >> > > > > > > > > > >> > > > > > > > > > I would like to open a discussion on FLIP 234 [1] to >> > support >> > > > > > > retryable >> > > > > > > > > > lookup join to solve delayed updates issue, as a >> pre-work >> > for >> > > > > this >> > > > > > > > > > solution, we proposed FLIP-232[2] which adds a generic >> > retry >> > > > > > support >> > > > > > > > for >> > > > > > > > > > Async I/O. >> > > > > > > > > > We prefer to offer this retry capability via query >> hints, >> > > > similar >> > > > > > to >> > > > > > > > new >> > > > > > > > > > join hints proposed in FLINK-27625[3] & FLIP-204[4]. >> > > > > > > > > > >> > > > > > > > > > This feature is backwards compatible and transparently >> to >> > > > > > connectors. >> > > > > > > > For >> > > > > > > > > > existing connectors which implements AsyncTableFunction, >> > can >> > > > > easily >> > > > > > > > > enable >> > > > > > > > > > async retry via the new join hint. >> > > > > > > > > > >> > > > > > > > > > [1] >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems >> > > > > > > > > > [2] >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883963 >> > > > > > > > > > [3] >> > > > > > >> https://lists.apache.org/thread/jm9kg33wk9z2bvo2b0g5bp3n5kfj6qv8 >> > > > > > > > > > [4] >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-204:+Introduce+Hash+Lookup+Join >> > > > > > > > > > >> > > > > > > > > > Best, >> > > > > > > > > > Lincoln Lee >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> >