subject:"\[DISCUSS\] FLIP\-221 Abstraction for lookup source cache and metric"

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-22 Thread Alexander Smirnov

Hi Qingsheng, I like the current design, thanks for your efforts! I have no objections at the moment. +1 to start the vote. Best regards, Alexander ср, 22 июн. 2022 г. в 15:59, Qingsheng Ren : > > Hi Jingsong, > > 1. Updated and thanks for the reminder! > > 2. We could do so for implementation b

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-22 Thread Jingsong Li

Qingsheng Thanks for the update, Looks good to me! Best, Jingsong On Wed, Jun 22, 2022 at 5:00 PM Qingsheng Ren wrote: > > Hi Jingsong, > > 1. Updated and thanks for the reminder! > > 2. We could do so for implementation but as public interface I prefer not to > introduce another layer and exp

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-22 Thread Qingsheng Ren

Hi Jingsong, 1. Updated and thanks for the reminder! 2. We could do so for implementation but as public interface I prefer not to introduce another layer and expose too much since this FLIP is already a huge one with bunch of classes and interfaces. Best, Qingsheng > On Jun 22, 2022, at 11:16

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-21 Thread Jingsong Li

Thanks Qingsheng and all. I like this design. Some comments: 1. LookupCache implements Serializable? 2. Minor: After FLIP-234 [1], there should be many connectors that implement both PartialCachingLookupProvider and PartialCachingAsyncLookupProvider. Can we extract a common interface for `Looku

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-21 Thread Qingsheng Ren

Hi devs, I’d like to push FLIP-221 forward a little bit. Recently we had some offline discussions and updated the FLIP. Here’s the diff compared to the previous version: 1. (Async)LookupFunctionProvider is designed as a base interface for constructing lookup functions. 2. From the LookupFuncti

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-02 Thread Qingsheng Ren

Hi Becket, Thanks for your feedback! 1. An alternative way is to let the implementation of cache to decide whether to store a missing key in the cache instead of the framework. This sounds more reasonable and makes the LookupProvider interface cleaner. I can update the FLIP and clarify in the Jav

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-01 Thread Becket Qin

Thanks for updating the FLIP, Qingsheng. A few more comments: 1. I am still not sure about what is the use case for cacheMissingKey(). More specifically, when would users want to have getCache() return a non-empty value and cacheMissingKey() returns false? 2. The builder pattern. Usually the buil

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-01 Thread Jark Wu

Thank Qingsheng for the detailed summary and updates, The changes look good to me in general. I just have one minor improvement comment. Could we add a static util method to the "FullCachingReloadTrigger" interface for quick usage? #periodicReloadAtFixedRate(Duration) #periodicReloadWithFixedDela

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-01 Thread Qingsheng Ren

Hi Jingsong, Thanks for your comments! > AllCache definition is not flexible, for example, PartialCache can use any > custom storage, while the AllCache can not, AllCache can also be considered > to store memory or disk, also need a flexible strategy. We had an offline discussion with Jark an

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-01 Thread Qingsheng Ren

Hi Becket, Thanks for your comments! 1. We have removed the LookupCacheFactory in the latest design and we added open/close method to the LookupCache for initialization. 2. Custom reload strategy is a great idea! We added a new interface FullCachingReloadTrigger for developers to implement t

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-01 Thread Qingsheng Ren

Hi Alexander, Thanks for the update! I made some updates on the FLIP and here’s some diffs compared to your version: 1. ReloadStrategy has been covered by the new interface FullCachingReloadTrigger. This trigger will be taken over and run by the LookupJoinRunner and provides enough flexibilit

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-01 Thread Qingsheng Ren

Hi devs, Thanks for the in-depth discussion! We recently update some designs of FLIP-221 [1]: 1. Introduce a new interface “FullCachingReloadTrigger” for developer to customize the reload strategy. The previous design was only time-based and not flexable enough. Developers can implement any lo

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-06-01 Thread Jingsong Li

Thanks Alexander for your reply. We can discuss the new interface when it comes out. We are more inclined to deprecate the connector `async` option when discussing FLIP-234 [1]. We should use hint to let planner decide. Although the discussion has not yet produced a conclusion, can we remove this

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-31 Thread Jing Ge

Hi Jark, Thanks for clarifying it. It would be fine. as long as we could provide the no-cache solution. I was just wondering if the client side cache could really help when HBase is used, since the data to look up should be huge. Depending how much data will be cached on the client side, the data

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-30 Thread Alexander Smirnov

Hi Jingsong and devs! I agree that custom reloading would be very useful, so I changed recently proposed ReloadTime to customizable ReloadStrategy and its default realization FixedDelayReloadStrategy. I updated the FLIP, you can look at the new design [1]. From my point of view, the disadvantage o

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-30 Thread Jark Wu

Hi Jing Ge, What do you mean about the "impact on the block cache used by HBase"? In my understanding, the connector cache and HBase cache are totally two things. The connector cache is a local/client cache, and the HBase cache is a server cache. > does it make sense to have a no-cache solution a

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-27 Thread Jing Ge

Thanks all for the valuable discussion. The new feature looks very interesting. According to the FLIP description: "*Currently we have JDBC, Hive and HBase connector implemented lookup table source. All existing implementations will be migrated to the current design and the migration will be trans

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-27 Thread Jingsong Li

Hi all, I think the problem now is below: 1. AllCache and PartialCache interface on the non-uniform, one needs to provide LookupProvider, the other needs to provide CacheBuilder. 2. AllCache definition is not flexible, for example, PartialCache can use any custom storage, while the AllCache can no

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-26 Thread Jingsong Li

Thanks Qingsheng and all for your discussion. Very sorry to jump in so late. Maybe I missed something? My first impression when I saw the cache interface was, why don't we provide an interface similar to guava cache [1], on top of guava cache, caffeine also makes extensions for asynchronous calls

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-26 Thread Jark Wu

After looking at the new introduced ReloadTime and Becket's comment, I agree with Becket we should have a pluggable reloading strategy. We can provide some common implementations, e.g., periodic reloading, and daily reloading. But there definitely be some connector- or business-specific reloading s

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-25 Thread Becket Qin

Hi Qingsheng, Thanks for updating the FLIP. A few comments / questions below: 1. Is there a reason that we have both "XXXFactory" and "XXXProvider". What is the difference between them? If they are the same, can we just use XXXFactory everywhere? 2. Regarding the FullCachingLookupProvider, shoul

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-25 Thread Alexander Smirnov

Hi Qingsheng and devs, I like the new concept of partial / full caching. Thanks for your efforts, Qingsheng! > The schedule-with-delay idea looks reasonable to me, but I think we need to > redesign the builder API of full caching to make it more descriptive > for > developers. I've got the acc

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-25 Thread Qingsheng Ren

Hi Lincoln and Jark, Thanks for the comments! If the community reaches a consensus that we use SQL hint instead of table options to decide whether to use sync or async mode, it’s indeed not necessary to introduce the “lookup.async” option. I think it’s a good idea to let the decision of async

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-25 Thread Lincoln Lee

Hi Jark, Thanks for your reply! Currently 'lookup.async' just lies in HBase connector, I have no idea whether or when to remove it (we can discuss it in another issue for the HBase connector after FLINK-27625 is done), just not add it into a common option now. Best, Lincoln Lee Jark Wu 于2022年

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-24 Thread Jark Wu

Hi Lincoln, I have taken a look at FLIP-234, and I agree with you that the connectors can provide both async and sync runtime providers simultaneously instead of one of them. At that point, "lookup.async" looks redundant. If this option is planned to be removed in the long term, I think it makes s

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-23 Thread Lincoln Lee

Hi Qingsheng, Sorry for jumping into the discussion so late. It's a good idea that we can have a common table option. I have a minor comments on 'lookup.async' that not make it a common option: The table layer abstracts both sync and async lookup capabilities, connectors implementers can choose

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-22 Thread Qingsheng Ren

Hi Alexander, Thanks for the review! We recently updated the FLIP and you can find those changes from my latest email. Since some terminologies has changed so I’ll use the new concept for replying your comments. 1. Builder vs ‘of’ I’m OK to use builder pattern if we have additional optional pa

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-22 Thread Qingsheng Ren

Hi devs, We recently updated FLIP-221 [1] in order to make the concept of caching clearer and introduce some new common lookup table options. The changes are listed below: 1. "LRU cache" and “all cache” have been renamed as “partial cache” and “full cache”. It is not necessary for the cache to

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-19 Thread Александр Смирнов

Also I have a few additions: 1) maybe rename 'lookup.cache.maximum-size' to 'lookup.cache.max-rows'? I think it will be more clear that we talk not about bytes, but about the number of rows. Plus it fits more, considering my optimization with filters. 2) How will users enable rescanning? Are we goi

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-19 Thread Александр Смирнов

Hi Qingsheng and Jark, 1. Builders vs 'of' I understand that builders are used when we have multiple parameters. I suggested them because we could add parameters later. To prevent Builder for ScanRuntimeProvider from looking redundant I can suggest one more config now - "rescanStartTime". It's a t

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-18 Thread Leonard Xu

Thanks Qingsheng and Alexander for the update. Current API and Options design of this FLIP look good enough from my side,. If no more concerns about the thread, I think we can start a VOTE thread later. Best, Leonard > 2022年5月18日下午5:04，Qingsheng Ren 写道： > > Hi Jark and Alexander, > > Thanks

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-18 Thread Qingsheng Ren

Hi Jark and Alexander, Thanks for your comments! I’m also OK to introduce common table options. I prefer to introduce a new DefaultLookupCacheOptions class for holding these option definitions because putting all options into FactoryUtil would make it a bit ”crowded” and not well categorized. FLI

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-17 Thread Jark Wu

Hi Alex, 1) retry logic I think we can extract some common retry logic into utilities, e.g. RetryUtils#tryTimes(times, call). This seems independent of this FLIP and can be reused by DataStream users. Maybe we can open an issue to discuss this and where to put it. 2) cache ConfigOptions I'm fine

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-17 Thread Александр Смирнов

Hi Qingsheng, Thank you for considering my comments. > there might be custom logic before making retry, such as re-establish the > connection Yes, I understand that. I meant that such logic can be placed in a separate function, that can be implemented by connectors. Just moving the retry logic

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-17 Thread Jark Wu

Thank Qingsheng for updating the design. I just have a minor comment. Changing RescanRuntimeProvider into the builder pattern seems not necessary. The builder pattern is usually used when there are several optional parameters. However, ScanRuntimeProvider and rescan interval are both required when

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-17 Thread Qingsheng Ren

Hi devs, I just updated FLIP-221 [1] according to discussions addressed above. This version made minor changes to make interfaces clearer: 1. The argument type of LookupFunction#lookup and AsyncLookupFunction#asyncLookup are changed from Object... to RowData, in order to be symmetric with out

Re:Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-17 Thread zst...@163.com

Hi Qingsheng, Alexander, Thanks for your reply. > Can you give an example of such upper - level Cache usage? It's not clear for > me currently. I think it's unnecessary to have such high level abstraction, > if nowhere in the code we won't operate with objects as instances of Cache. > But may

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-17 Thread Qingsheng Ren

Hi Yuan, Thanks for the review! Basically I’m with Alexander opinion. We’d like to limit the scope in lookup scenario so we didn’t extend the cache to a generic one. And as for the metric I think the existing metric definitions are also applicable for all-cache case. Cheers, Qingsheng > O

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-17 Thread Qingsheng Ren

Hi Alexander, Thanks for the review and glad to see we are on the same page! I think you forgot to cc the dev mailing list so I’m also quoting your reply under this email. > We can add 'maxRetryTimes' option into this class In my opinion the retry logic should be implemented in lookup() ins

Re: Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-16 Thread Александр Смирнов

Hi Yuan! > How about abtract the LookupCache to a higher level with a common Cache? Can you give an example of such upper - level Cache usage? It's not clear for me currently. I think it's unnecessary to have such high level abstraction, if nowhere in the code we won't operate with objects as ins

Re:Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-15 Thread zst...@163.com

Hi Qingsheng and devs, Thanks for your heated discussion and redesign to optmize this feature. I just have two comments: 1. How about abtract the LookupCache to a higher level with a common Cache? It will be convenient for devs to use in other place. 2. Does it have any metrics, such as

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-13 Thread Qingsheng Ren

Hi Alexander and devs, Thank you very much for the in-depth discussion! As Jark mentioned we were inspired by Alexander's idea and made a refactor on our design. FLIP-221 [1] has been updated to reflect our design now and we are happy to hear more suggestions from you! Compared to the previous de

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-12 Thread Александр Смирнов

Hi Jark, Qingsheng and Leonard! Glad to see that we came to a consensus on almost all points! However I'm a little confused whether InputFormat is deprecated or not. Am I right that it will be so in the future, but currently it's not? Actually I also think that for the first version it's OK to us

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-12 Thread Jark Wu

Hi Alex, Thanks for summarizing your points. In the past week, Qingsheng, Leonard, and I have discussed it several times and we have totally refactored the design. I'm glad to say we have reached a consensus on many of your points! Qingsheng is still working on updating the design docs and maybe

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-12 Thread Александр Смирнов

Hi Martijn! Got it. Therefore, the realization with InputFormat is not considered. Thanks for clearing that up! Best regards, Smirnov Alexander чт, 12 мая 2022 г. в 14:23, Martijn Visser : > > Hi, > > With regards to: > > > But if there are plans to refactor all connectors to FLIP-27 > > Yes, FL

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-12 Thread Martijn Visser

Hi, With regards to: > But if there are plans to refactor all connectors to FLIP-27 Yes, FLIP-27 is the target for all connectors. The old interfaces will be deprecated and connectors will either be refactored to use the new ones or dropped. The caching should work for connectors that are using

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-11 Thread Александр Смирнов

Hi Jark! Sorry for the late response. I would like to make some comments and clarify my points. 1) I agree with your first statement. I think we can achieve both advantages this way: put the Cache interface in flink-table-common, but have implementations of it in flink-table-runtime. Therefore if

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-05 Thread Jark Wu

It's great to see the active discussion! I want to share my ideas: 1) implement the cache in framework vs. connectors base I don't have a strong opinion on this. Both ways should work (e.g., cache pruning, compatibility). The framework way can provide more concise interfaces. The connector base wa

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-05 Thread Roman Boyko

It's a much more complicated activity and lies out of the scope of this improvement. Because such pushdowns should be done for all ScanTableSource implementations (not only for Lookup ones). On Thu, 5 May 2022 at 19:02, Martijn Visser wrote: > Hi everyone, > > One question regarding "And Alexand

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-05 Thread Martijn Visser

Hi everyone, One question regarding "And Alexander correctly mentioned that filter pushdown still is not implemented for jdbc/hive/hbase." -> Would an alternative solution be to actually implement these filter pushdowns? I can imagine that there are many more benefits to doing that, outside of loo

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-05 Thread Roman Boyko

Hi everyone! Thanks for driving such a valuable improvement! I do think that single cache implementation would be a nice opportunity for users. And it will break the "FOR SYSTEM_TIME AS OF proc_time" semantics anyway - doesn't matter how it will be implemented. Putting myself in the user's shoes

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-04 Thread Martijn Visser

Hi everyone, I don't have much to chip in, but just wanted to express that I really appreciate the in-depth discussion on this topic and I hope that others will join the conversation. Best regards, Martijn On Tue, 3 May 2022 at 10:15, Александр Смирнов wrote: > Hi Qingsheng, Leonard and Jark,

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-03 Thread Александр Смирнов

Hi Qingsheng, Leonard and Jark, Thanks for your detailed feedback! However, I have questions about some of your statements (maybe I didn't get something?). > Caching actually breaks the semantic of "FOR SYSTEM_TIME AS OF proc_time” I agree that the semantics of "FOR SYSTEM_TIME AS OF proc_time"

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-04-29 Thread Qingsheng Ren

Hi Alexander and Arvid, Thanks for the discussion and sorry for my late response! We had an internal discussion together with Jark and Leonard and I’d like to summarize our ideas. Instead of implementing the cache logic in the table runtime layer or wrapping around the user-provided table function

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-04-26 Thread Александр Смирнов

Thanks for the response, Arvid! I have few comments on your message. > but could also live with an easier solution as the first step: I think that these 2 ways are mutually exclusive (originally proposed by Qingsheng and mine), because conceptually they follow the same goal, but implementation d

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-04-25 Thread Arvid Heise

Hi Qingsheng, Thanks for driving this; the inconsistency was not satisfying for me. I second Alexander's idea though but could also live with an easier solution as the first step: Instead of making caching an implementation detail of TableFunction X, rather devise a caching layer around X. So the

RE: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-04-22 Thread Александр Смирнов

Hi Qingsheng! My name is Alexander, I'm not a committer yet, but I'd really like to become one. And this FLIP really interested me. Actually I have worked on a similar feature in my company’s Flink fork, and we would like to share our thoughts on this and make code open source. I think there is a

[DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-04-18 Thread Qingsheng Ren

Hi devs, Yuan and I would like to start a discussion about FLIP-221[1], which introduces an abstraction of lookup table cache and its standard metrics. Currently each lookup table source should implement their own cache to store lookup results, and there isn’t a standard of metrics for users a

58 matches

Mail list logo