Thank you for your reply. I am very sorry for the misunderstanding caused by my deviation from the original discussion.
@Shammon > I found there is a pre-discussion [1] for this FLIP Yes, there was indeed such a discussion before. However, in designing the whole solution, I found that the logic of CatalogManager itself doesn't need to change much. *We cannot only persist Catalog instances themselves*, so exposing only registerCatalog(String catalogName, Catalog catalog) might not be enough to save Catalogs, because in the end we still need to save the configurations corresponding to the Catalog instances. Therefore, I decided to introduce the CatalogStore interface for configuration persistence. Regarding this part, I also took inspiration from Trino's implementation[1]. @yuxia > 1: The mechanism of handling catalog with the same name looks a little of complex to me. Thank you for the suggestion. I will provide a detailed description and code examples for this part, and add it to the FLIP. > 2: TBH, the method name `addCatalog` still looks confused to me. IIUC, this method is for storing catalog to CatalogStore, how about renaming it to `storeCatalog`? It's very personal opinion, you can decide to take it or not by your self. StoreCatalog looks more intuitive to me. I don't see any problem with it. > 3.For CREATE CATALOG statement, which method will be called? `registerCatalog` or `addCatalog`? I'm wondering whether user can add a catalog to store with SQL stement. For CREATE CATALOG, my original design was to add it directly to the CatalogStore, but this would disrupt the existing logic. Therefore, I think we can do both: save the configuration to the CatalogStore and initialize the Catalog instance at the same time > 3. Is it really neccessary to provide a default implmentation for interface `CatalogStoreFactory`? I think it is necessary, otherwise we would need to introduce an additional Map to store the configuration for lazy loading. > 4: About asynchronous registration for catalog. I don't think registerCatalog(String catalogName, Catalog catalog) can be made into an asynchronous interface because Catalog is already an instance. > It looks more like lazy initialization for catalog than asynchronous registration, right? Indeed, my description was inaccurate. It should be lazy registration instead of asynchronous registration. I have already updated the title of the FLIP. [1]. https://github.com/trinodb/trino/blob/master/core/trino-main/src/main/java/io/trino/connector/CatalogStore.java On Tue, Apr 4, 2023 at 4:27 PM Shammon FY <zjur...@gmail.com> wrote: > Hi Feng > > I think if there is a `registerCatalog` method in `CatalogManager`, it will > confuse users whether a method named `addCatalog` or `storeCatalog` is > added. > > And as you mentioned, the memory catalog is a `cache`, I think the concept > of `cache` should not be exposed to users. > > I found there is a pre-discussion [1] for this FLIP. Please correct me if > I'm wrong, IIUC, the conclusion of that discussion is to use > `CatalogManager` as an interface and implement it for different stores such > as memory, file and external system. > > I think there is a gap between the current FLIP design and that conclusion. > What about the proposal of the discussion in thread [1] ? > > > [1] https://lists.apache.org/thread/9bnjblgd9wvrl75lkm84oo654c4lqv70 > > > Best, > Shammon FY > > > On Tue, Apr 4, 2023 at 3:41 PM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote: > > > Thanks Feng for driving this FLIP. > > I have few comments: > > 1: The mechanism of handling catalog with the same name looks a little of > > complex to me. I think it'll be better to explain it in the java doc of > > these methods and give a brief example in this FLIP. > > > > 2: TBH, the method name `addCatalog` still looks confused to me. IIUC, > > this method is for storing catalog to CatalogStore, how about renaming it > > to `storeCatalog`? It's very personal opinion, you can decide to take it > or > > not by your self. > > > > 3: For CREATE CATALOG statement, which method will be called? > > `registerCatalog` or `addCatalog`? I'm wondering whether user can add a > > catalog to store with SQL stement. > > > > > > 3: Is it really neccessary to provide a default implmentation for > > interface `CatalogStoreFactory`? > > > > > > 4: About asynchronous registration for catalog. > > > > > When creating a catalog with CREATE CATALOG, the asynchronous > > registration method is used by default. > > If asynchronous registration is the default behavior, it there any way > > that user can switch to synchronous registration just like before? > > Will both method `addCatalog` and `registerCatalog` be asynchronous > > registration? > > > > IIUC, in asynchronous registration, it may well that CREATE CATALOG > > executes successfully, but then the following CREATE TABLE statement will > > fail for the catalog fail to open. > > I think it's a break change which should be highlighted in this FLIP, may > > be in compatibility part. > > > > > > BTW, by saying asynchronous registration, I would like to expect there > > will be an executor to open or register catalog in the background, but > from > > your previous comments, > > "the Catalog instance will be initialized if it has not been initialized > > yet. If the initialization process fails, these statements will not be > > executed successfully." > > It looks more like lazy initialization for catalog than asynchronous > > registration, right? > > > > > > Best regards, > > Yuxia > > > > ----- 原始邮件 ----- > > 发件人: "Feng Jin" <jinfeng1...@gmail.com> > > 收件人: "dev" <dev@flink.apache.org> > > 发送时间: 星期一, 2023年 4 月 03日 下午 3:27:45 > > 主题: Re: [DISCUSS] FLIP 295: Support persistence of Catalog configuration > > and asynchronous registration > > > > Hi everyone, Thank you all for your interest in this DISCUSS. > > > > @Shammon > > > How to handle a catalog with the same name that exists for both of > them? > > > > I believe this is a crucial point. Based on my current personal > > understanding, the Map<String, Catalog> catalogs will serve as a cache > > for instantiated catalogs and have the highest priority. > > > > There are three methods that can affect the Map<String, Catalog> > catalogs: > > > > 1. registerCatalog(String catalogName, Catalog catalog) > > > > This method puts the catalog instance into the Map<String, Catalog> > > catalogs. > > > > 2. unregisterCatalog(String catalogName) > > > > This method removes the catalog instance corresponding to catalogName > > from the Map<String, Catalog> catalogs. > > > > 3. getCatalog(String catalogName) > > > > This method first retrieves the corresponding catalog instance from > > the Map<String, Catalog> catalogs. If the catalog does not exist, it > > retrieves the corresponding configuration from the CatalogStore, > > initializes it, and puts the initialized Catalog instance into the > > Map<String, Catalog> catalogs. > > > > The following two methods only modify the configuration in the > > CatalogStore: > > > > 1. addCatalog(String catalogName, Map<String, String> properties) > > > > This method saves the properties to the catalogStore and checks > > whether there is a catalogName with the same name. > > > > 2. removeCatalog(String catalogName) > > This method removes the specified configuration of the specified > > catalogName in the catalogStore. > > > > The following are possible conflict scenarios: > > > > 1. When the corresponding catalogName already exists in the > > CatalogStore but not in the Map<String, Catalog>, the > > registerCatalog(String catalogName, Catalog catalog) method can > > succeed and be directly saved to the Map<String, Catalog> catalogs. > > > > 2. When the corresponding catalogName already exists in both the > > CatalogStore and the Map<String, Catalog>, the registerCatalog(String > > catalogName, Catalog catalog) method will fail. > > > > 3. When the corresponding catalogName already exists in the > > Map<String, Catalog>, the addCatalog(String catalogName, Map<String, > > String> properties) method can directly save the properties to the > > catalogStore, but the getCatalog(String catalogName) method will not > > use the new properties for initialization because the corresponding > > catalog instance already exists in catalogs and will be prioritized. > > Therefore, using the unregisterCatalog(String catalogName) method to > > remove the instance corresponding to the original catalogName is > > necessary. > > > > > > > > > I think it will confuse users that `registerCatalog(String > > catalogName,Catalog catalog)` in the `Map<String, Catalog> catalogs` and > > `registerCatalog(String catalogName, Map<String, String> properties) > > > > This could potentially lead to confusion. I suggest changing the > > method name, perhaps to addCatalog(String catalogName, Map<String, > > String> properties), as previously mentioned > > > > > > @Hang > > > add `registerCatalog(String catalogName,Catalog catalog, > > boolean lazyInit)` method > > > > Since a catalog is already an instance, adding the "lazyInit" > > parameter to the registerCatalog(String catalogName, Catalog catalog) > > method may not necessarily result in lazy initialization. > > > > > Do we need to think about encryption > > > > I think encryption is necessary, but perhaps the encryption logic > > should be implemented in the CatalogStore when the data is actually > > saved. For instance, the FileCatalogStore could encrypt the data when > > it is saved to a file, while the MemoryCatalogStore does not require > > encryption. > > > > > Do we really need the `MemoryCatalogStore`? > > > > I think it is necessary to have a MemoryCatalogStore as the default > > implementation, which saves Catalog configurations in memory. > > Otherwise, if we want to implement asynchronous loading of Catalog, we > > would need to introduce additional cache. > > > > > > @Xianxun > > > > > What if asynchronous registration failed? > > > > This is also a critical concern. When executing DDL, DQL, or DML > > statements that reference a specific Catalog, the Catalog instance > > will be initialized if it has not been initialized yet. If the > > initialization process fails, these statements will not be executed > > successfully. > > > > > nt. The Map<String, Catalog> catalogs and CatalogStore catalogstore in > > the CatalogManager all have the same catalog name, but correponding to > > different catalog instances. > > > > This issue can be resolved by referring to the previous responses. The > > key principle is that the Catalog that has been most recently used > > will be given priority for subsequent use. > > > > On Sun, Apr 2, 2023 at 10:58 PM Xianxun Ye <yesorno828...@gmail.com> > > wrote: > > > > > > Hi Feng, > > > > > > Thanks for driving this Flip, I do believe this Flip could be helpful > > for users. The firm I work for also manages a lot of catalogs, and the > > submission of tasks becomes slow because of loading a number of catalogs. > > We obtain the catalogs in the user's Flink SQL through regular expression > > to avoid loading all catalogs to improve the speed of task submission. > > After reading flip-295, I have some questions: > > > > > > 1. When creating a catalog with CREATE CATALOG, the asynchronous > > registration method is used by default. What if asynchronous registration > > failed? And how to implement asynchronous registration? > > > > > > 2. I also have the same question with Shammon’s first comment. The > > Map<String, Catalog> catalogs and CatalogStore catalogstore in the > > CatalogManager all have the same catalog name, but correponding to > > different catalog instances. For example: catalog1-> jdbcCatalog in the > > catalogs, but catalog1 -> hiveCatalog in the catalogstore. Would this > case > > happen? Maybe we should define `catalogs` and `catalogstore` more > clearly. > > > > > > > > > Best regards, > > > > > > Xianxun > > > > > > > 2023年3月31日 11:13,Hang Ruan <ruanhang1...@gmail.com> 写道: > > > > > > > > MemoryCatalogStore > > > > > > > > >