Hi all, I would like to update you on the latest progress of the FLIP.

Last week, Leonard Xu, HangRuan, Jing Ge, Shammon FY, ShengKai Fang and I
had an offline discussion regarding the overall solution for Flink
CatalogStore. We have reached a consensus and I have updated the final
solution in FLIP.

Next, let me briefly describe the entire design:

   1.

   Introduce CatalogDescriptor to store catalog configuration similar to
   TableDescriptor.
   2.

   The two key functions of CatalogStore - void storeCatalog(String
   catalogName, CatalogDescriptor) and CatalogDescriptor getCatalog(String)
   will both use CatalogDescriptor instead of Catalog instance. This way,
   CatalogStore will only be responsible for saving and retrieving catalog
   configurations without having to initialize catalogs.
   3.

   The default registerCatalog(String catalogName, Catalog catalog)
   function in CatalogManager will be marked as deprecated.
   4.

   A new function registerCatalog(String catalogName, CatalogDescriptor
   catalog) will be added to serve as the default registration function for
   catalogs in CatalogManager.
   5.

   Map<String,Catalog> catalogs in CataloManager will remain unchanged and
   save initialized catalogs.This means that deletion operations from one
   session won't synchronize with other sessions.
   6.

   To support multi-session synchronization scenarios for deletions later
   on we should make Map<String,Catalog>catalogs configurable.There may be
   three possible situations:

   a.Default caching of all initialized catalogs

   b.Introduction of LRU cache logic which can automatically clear
   long-unused catalogs.

   c.No caching of any instances; each call to getCatalog creates a new
   instance.


This is the document for discussion:
https://docs.google.com/document/d/1HRJNd4_id7i6cUxGnAybmYZIwl5g1SmZCOzGdUz-6lU/edit

This is the final proposal document:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations


Thank you very much for your attention and suggestions on this FLIP.  A
special thanks to Hang Ruan for his suggestions on the entire design and
organizing offline discussions.

If you have any further suggestions or feedback about this FLIP please feel
free to share.


Best,

Feng

On Sat, May 6, 2023 at 8:32 PM Jing Ge <j...@ververica.com.invalid> wrote:

> Hi Feng,
>
> Thanks for improving the FLIP. It looks good to me. We could still
> reconsider in the future how to provide more common built-in cache
> functionality in CatalogManager so that not every CatalogSotre
> implementation has to take care of it.
>
> Best regards,
> Jing
>
> On Thu, May 4, 2023 at 1:47 PM Feng Jin <jinfeng1...@gmail.com> wrote:
>
> > Hi Jing,
> >
> > Thanks for your reply.
> >
> > >  There might be more such issues. I would suggest you completely walk
> > through the FLIP again and fix those issues
> >
> > I am very sorry for my carelessness and at the same time, I greatly
> > appreciate your careful review. I have thoroughly checked the entire FLIP
> > and made corrections to these issues.
> >
> >
> > >  If I am not mistaken, with the  current FLIP design, CatalogManager
> > could work without Optional  CatalogStore being configured.
> >
> > Yes, in the original design, CatalogStore was not necessary because
> > CatalogManager used Map<String, Catalog> catalogs to store catalog
> > instances.
> > However, this caused inconsistency issues. Therefore, I modified this
> part
> > of the design and removed Map<String, Catalog> catalogs from
> > CatalogManager.
> >  At the same time, InMemoryCatalog will serve as the default CatalogStore
> > to save catalogs in memory and replace the functionality of
> > Map<String,Catalog>catalogs.
> > The previous plan that kept Map<String,Catalog>catalogs has been moved to
> > Rejected Alternatives.
> >
> >
> >
> > Best,
> > Feng
> >
> > On Sun, Apr 30, 2023 at 9:03 PM Jing Ge <j...@ververica.com.invalid>
> > wrote:
> >
> > > Hi Feng,
> > >
> > > There are still many places contain inconsistent content, e.g.
> > >
> > > 1. "Asynchronous registration" is still used.
> > > 2. The java comment of the method registerCatalog(String catalogName,
> > > Catalog catalog) in CatalogManager does not tell what the method is
> > doing.
> > >
> > >
> > > There might be more such issues. I would suggest you completely walk
> > > through the FLIP again and fix those issues.
> > >
> > > About the inMemoryCatalogStore, do you mean that you will build the
> cache
> > > functionality in the CatalogStore? This is a very different design
> > concept
> > > from what the current FLIP described. If I am not mistaken, with the
> > > current FLIP design, CatalogManager could work without Optional
> > > CatalogStore being configured. That is the reason why I mentioned in
> the
> > > last email that the example code wrt the Optional CatalogStore is not
> > > correct.
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Thu, Apr 27, 2023 at 3:55 PM Feng Jin <jinfeng1...@gmail.com>
> wrote:
> > >
> > > > Hi Jing
> > > >
> > > >
> > > > > There are still some NIT issues in the FLIP
> > > >
> > > > Thank you very much for the careful review. I have already made the
> > > > relevant changes.
> > > >
> > > >
> > > > >  Speaking of the conflict issues in the multi-instance scenarios, I
> > am
> > > > not
> > > > sure if this is the intended behaviour
> > > >
> > > > Currently, there are conflicts in multiple scenarios with the current
> > > > design. I am thinking whether we should remove 'Map<String, Catalog>'
> > and
> > > > make Cache the default behavior of InMemoryCatalogStore. This way,
> > users
> > > > can implement their own CatalogStore to achieve multi-instance
> > > > non-conflicting scenarios. What do you think?
> > > >
> > > >
> > > >
> > > > Best,
> > > > Feng
> > > >
> > > > On Thu, Apr 27, 2023 at 9:03 PM Jing Ge <j...@ververica.com.invalid>
> > > > wrote:
> > > >
> > > > > Hi Feng,
> > > > >
> > > > > Thanks for working on the FLIP. There are still some NIT issues in
> > the
> > > > FLIP
> > > > > like:
> > > > >
> > > > > 1. Optional<CatalogStore> catalogStore has been used as
> CatalogStore
> > > > > instead of Optional in the code example. It should be fine to use
> it
> > as
> > > > > pseudo code for now and update it after you submit the PR.
> > > > > 2. addCatalog(...) is still used somewhere in the rejected section
> > > which
> > > > > should be persistContext(...) to keep it consistent.
> > > > >
> > > > > Speaking of the conflict issues in the multi-instance scenarios, I
> am
> > > not
> > > > > sure if this is the intended behaviour. If Map<String, Catalog>
> > > catalogs
> > > > is
> > > > > used as a cache, it should be invalid, once the related catalog has
> > > been
> > > > > removed from the CatalogStore by another instance. Did I miss
> > > something?
> > > > >
> > > > > Best regards,
> > > > > Jing
> > > > >
> > > > > On Thu, Apr 13, 2023 at 4:40 PM Feng Jin <jinfeng1...@gmail.com>
> > > wrote:
> > > > >
> > > > > > Hi Jing,Shammon
> > > > > > Thanks for your reply.
> > > > > >
> > > > > > @Jing
> > > > > >
> > > > > > > How about persistCatalog()?
> > > > > > I think this is a good function name, I have updated it in the
> > > > > > documentation.
> > > > > >
> > > > > > >    Some common cache features should be implemented
> > > > > > Thank you for the suggestion. If alternative 1 turns out to be
> more
> > > > > > appropriate later, I will improve this part of the content.
> > > > > >
> > > > > > > As the above discussion moves forward, the option 2 solution
> > looks
> > > > more
> > > > > > like a replacement of option 1
> > > > > > Yes, after discussing with Shammon offline, we think that
> solution
> > 2
> > > > > might
> > > > > > be more suitable and also avoid any inconsistency issues.
> > > > > >
> > > > > > > There are some inconsistent descriptions in the content.  Would
> > you
> > > > > like
> > > > > > to clean them up?
> > > > > > I will do my best to improve the document and appreciate your
> > > > > suggestions.
> > > > > >
> > > > > >
> > > > > >
> > > > > > @Shammon
> > > > > > > can you put the unselected option in `Rejected Alternatives`
> > > > > > Sure, I have moved it to the `Rejected Alternatives`.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Best
> > > > > > Feng
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Apr 13, 2023 at 8:52 AM Shammon FY <zjur...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Hi Feng
> > > > > > >
> > > > > > > Thanks for your update.
> > > > > > >
> > > > > > > I found there are two options in `Proposed Changes`, can you
> put
> > > the
> > > > > > > unselected option in `Rejected Alternatives`? I think this may
> > help
> > > > us
> > > > > > > better understand your proposal
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Shammon FY
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Apr 13, 2023 at 4:49 AM Jing Ge
> > <j...@ververica.com.invalid
> > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Feng,
> > > > > > > >
> > > > > > > > Thanks for raising this FLIP. I am still confused after
> > > completely
> > > > > > > reading
> > > > > > > > the thread with following questions:
> > > > > > > >
> > > > > > > > 1. Naming confusion - registerCatalog() and addCatalog() have
> > no
> > > > big
> > > > > > > > difference based on their names. One of them is responsible
> for
> > > > data
> > > > > > > > persistence. How about persistCatalog()?
> > > > > > > > 2. As you mentioned that Map<String, Catalog> catalogs is
> used
> > > as a
> > > > > > cache
> > > > > > > > and catalogStore is used for data persistence. I would
> suggest
> > > > > > describing
> > > > > > > > their purpose conceptually and clearly in the FLIP. Some
> common
> > > > cache
> > > > > > > > features should be implemented, i.e. data in the cache and
> the
> > > > store
> > > > > > > should
> > > > > > > > be consistent. Same Catalog instance should be found in the
> > store
> > > > and
> > > > > > in
> > > > > > > > the cache(either it has been initialized or it will be lazy
> > > > > > initialized)
> > > > > > > > for the same catalog name. The consistency will be taken care
> > of
> > > > > while
> > > > > > > > updating the catalog.
> > > > > > > > 3. As the above discussion moves forward, the option 2
> solution
> > > > looks
> > > > > > > more
> > > > > > > > like a replacement of option 1, because, afaiu, issues
> > mentioned
> > > > > > > > previously with option 1 are not solved yet. Do you still
> want
> > to
> > > > > > propose
> > > > > > > > both options and ask for suggestions for both of them?
> > > > > > > > 4. After you updated the FLIP, there are some inconsistent
> > > > > descriptions
> > > > > > > in
> > > > > > > > the content.  Would you like to clean them up? Thanks!
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > > Jing
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Apr 7, 2023 at 9:24 AM Feng Jin <
> jinfeng1...@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > hi Shammon
> > > > > > > > >
> > > > > > > > > Thank you for your response, and I completely agree with
> your
> > > > point
> > > > > > of
> > > > > > > > > view.
> > > > > > > > > Initially, I may have over complicated the whole issue.
> First
> > > and
> > > > > > > > foremost,
> > > > > > > > > we need to consider the persistence of the Catalog's
> > > > Configuration.
> > > > > > > > > If we only need to provide persistence for Catalog
> > > Configuration,
> > > > > we
> > > > > > > can
> > > > > > > > > add a toConfiguration method to the Catalog interface.
> > > > > > > > > This method can convert a Catalog instance to a Map<String,
> > > > String>
> > > > > > > > > properties, and the default implementation will throw an
> > > > exception.
> > > > > > > > >
> > > > > > > > > public interface Catalog {
> > > > > > > > >    /**
> > > > > > > > >    * Returns a map containing the properties of the catalog
> > > > object.
> > > > > > > > >    *
> > > > > > > > >    * @return a map containing the properties of the catalog
> > > > object
> > > > > > > > >    * @throws UnsupportedOperationException if the
> > implementing
> > > > > class
> > > > > > > does
> > > > > > > > > not override
> > > > > > > > >    * the default implementation of this method
> > > > > > > > >    */
> > > > > > > > >   default Map<String, String> toProperties() {
> > > > > > > > >     throw new UnsupportedOperationException("Please
> implement
> > > > > > > > toProperties
> > > > > > > > > for this catalog");
> > > > > > > > >   }
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > The specific process is as follows:
> > > > > > > > >
> > > > > > > > > 1.  If the user has configured a CatalogStore, the
> > > > > toConfiguration()
> > > > > > > > method
> > > > > > > > > will be called when registering a Catalog instance with
> > > > > > > > > registerCatalog(String catalogName, Catalog catalog). The
> > > Catalog
> > > > > > > > instance
> > > > > > > > > will be converted to a Map<String, String> properties and
> > saved
> > > > to
> > > > > > the
> > > > > > > > > CatalogStore. If some Catalog instances do not implement
> this
> > > > > method,
> > > > > > > an
> > > > > > > > > exception will be thrown.
> > > > > > > > >
> > > > > > > > > 2.  If the user does not use a CatalogStore, the
> > > > toConfiguration()
> > > > > > > method
> > > > > > > > > will not be called, ensuring consistency with the original
> > > > process.
> > > > > > > > >
> > > > > > > > > 3. Saving both the Map<String, Catalog> catalogs and the
> > > > > CatalogStore
> > > > > > > at
> > > > > > > > > the same time can also avoid conflicts
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > For lazy initialization:
> > > > > > > > > we can start from the Catalog itself and provide a
> dedicated
> > > > > Catalog
> > > > > > > > > implementation for lazy loading, such as
> ConfigurationCatalog
> > > > > > > > >
> > > > > > > > > public class ConfigurationCatalog implements Catalog {
> > > > > > > > > ConfigurationCatalog(Map<String, String> properties) {
> > > > > > > > > }
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > I have added this design to the FLIP.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Feng
> > > > > > > > >
> > > > > > > > > On Thu, Apr 6, 2023 at 10:03 AM Shammon FY <
> > zjur...@gmail.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Feng
> > > > > > > > > >
> > > > > > > > > > Thanks for your answer.
> > > > > > > > > >
> > > > > > > > > > I have no questions about the functionality of
> > > `CatalogStore`,
> > > > > but
> > > > > > > the
> > > > > > > > > > different operations of multiple `registerCatalog` or
> > > > > > `storeCatalog`
> > > > > > > in
> > > > > > > > > > `CatalogManager`. The implementation in Trino is also the
> > > same,
> > > > > the
> > > > > > > > > > `CatalogManager` in trino only has one `createCatalog`,
> > which
> > > > > will
> > > > > > > save
> > > > > > > > > the
> > > > > > > > > > catalog to memory and `CatalogStore`.
> > > > > > > > > >
> > > > > > > > > > I think we don't need to add another `registerCatalog` or
> > > > > > > `addCatalog`
> > > > > > > > to
> > > > > > > > > > save a catalog in `Map<String, Catalog> catalogs` or
> > > > > `CatalogStore
> > > > > > > > > > catalogStore`.  As you mentioned before, the `Map<String,
> > > > > Catalog>
> > > > > > > > > > catalogs` is a cache in `CatalogManager`. How about
> saving
> > > the
> > > > > > > catalog
> > > > > > > > in
> > > > > > > > > > `Map<String, Catalog> catalogs` and `CatalogStore
> > > catalogStore`
> > > > > > > > together
> > > > > > > > > > when it is registered or added in `CatalogManager`?
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Shammon FY
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Apr 4, 2023 at 5:22 PM Feng Jin <
> > > jinfeng1...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Thank you for your reply. I am very sorry for the
> > > > > > misunderstanding
> > > > > > > > > caused
> > > > > > > > > > > by my deviation from the original discussion.
> > > > > > > > > > >
> > > > > > > > > > > @Shammon
> > > > > > > > > > > > I found there is a pre-discussion [1] for this FLIP
> > > > > > > > > > > Yes, there was indeed such a discussion before.
> However,
> > > in
> > > > > > > > designing
> > > > > > > > > > the
> > > > > > > > > > > whole solution, I found that the logic of
> CatalogManager
> > > > itself
> > > > > > > > doesn't
> > > > > > > > > > > need to change much. *We cannot only persist Catalog
> > > > instances
> > > > > > > > > > themselves*,
> > > > > > > > > > > so exposing only registerCatalog(String catalogName,
> > > Catalog
> > > > > > > catalog)
> > > > > > > > > > might
> > > > > > > > > > > not be enough to save Catalogs, because in the end we
> > still
> > > > > need
> > > > > > to
> > > > > > > > > save
> > > > > > > > > > > the configurations corresponding to the Catalog
> > instances.
> > > > > > > > Therefore,
> > > > > > > > > I
> > > > > > > > > > > decided to introduce the CatalogStore interface for
> > > > > configuration
> > > > > > > > > > > persistence. Regarding this part, I also took
> inspiration
> > > > from
> > > > > > > > Trino's
> > > > > > > > > > > implementation[1].
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > @yuxia
> > > > > > > > > > >
> > > > > > > > > > > > 1: The mechanism of handling catalog with the same
> name
> > > > > looks a
> > > > > > > > > little
> > > > > > > > > > of
> > > > > > > > > > > complex to me.
> > > > > > > > > > > Thank you for the suggestion. I will provide a detailed
> > > > > > description
> > > > > > > > and
> > > > > > > > > > > code examples for this part, and add it to the FLIP.
> > > > > > > > > > >
> > > > > > > > > > > > 2: TBH, the method name `addCatalog` still looks
> > confused
> > > > to
> > > > > > me.
> > > > > > > > > IIUC,
> > > > > > > > > > > this method is for storing catalog to CatalogStore, how
> > > about
> > > > > > > > renaming
> > > > > > > > > it
> > > > > > > > > > > to `storeCatalog`? It's very personal opinion, you can
> > > decide
> > > > > to
> > > > > > > take
> > > > > > > > > it
> > > > > > > > > > or
> > > > > > > > > > > not by your self.
> > > > > > > > > > > StoreCatalog looks more intuitive to me. I don't see
> any
> > > > > problem
> > > > > > > with
> > > > > > > > > it.
> > > > > > > > > > >
> > > > > > > > > > > > 3.For CREATE CATALOG statement, which method will be
> > > > called?
> > > > > > > > > > > `registerCatalog` or `addCatalog`? I'm wondering
> whether
> > > user
> > > > > can
> > > > > > > > add a
> > > > > > > > > > > catalog to store with SQL stement.
> > > > > > > > > > > For CREATE CATALOG, my original design was to add it
> > > directly
> > > > > to
> > > > > > > the
> > > > > > > > > > > CatalogStore, but this would disrupt the existing
> logic.
> > > > > > > Therefore, I
> > > > > > > > > > think
> > > > > > > > > > > we can do both: save the configuration to the
> > CatalogStore
> > > > and
> > > > > > > > > initialize
> > > > > > > > > > > the Catalog instance at the same time
> > > > > > > > > > >
> > > > > > > > > > > > 3. Is it really neccessary to provide a default
> > > > implmentation
> > > > > > for
> > > > > > > > > > > interface `CatalogStoreFactory`?
> > > > > > > > > > > I think it is necessary, otherwise we would need to
> > > introduce
> > > > > an
> > > > > > > > > > additional
> > > > > > > > > > > Map to store the configuration for lazy loading.
> > > > > > > > > > >
> > > > > > > > > > > > 4: About asynchronous registration for catalog.
> > > > > > > > > > > I don't think registerCatalog(String catalogName,
> Catalog
> > > > > > catalog)
> > > > > > > > can
> > > > > > > > > be
> > > > > > > > > > > made into an asynchronous interface because Catalog is
> > > > already
> > > > > an
> > > > > > > > > > instance.
> > > > > > > > > > >
> > > > > > > > > > > > It looks more like lazy initialization for catalog
> than
> > > > > > > > asynchronous
> > > > > > > > > > > registration, right?
> > > > > > > > > > > Indeed, my description was inaccurate. It should be
> lazy
> > > > > > > registration
> > > > > > > > > > > instead of asynchronous registration. I have already
> > > updated
> > > > > the
> > > > > > > > title
> > > > > > > > > of
> > > > > > > > > > > the FLIP.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > [1].
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/trinodb/trino/blob/master/core/trino-main/src/main/java/io/trino/connector/CatalogStore.java
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Apr 4, 2023 at 4:27 PM Shammon FY <
> > > zjur...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Feng
> > > > > > > > > > > >
> > > > > > > > > > > > I think if there is a `registerCatalog` method in
> > > > > > > `CatalogManager`,
> > > > > > > > > it
> > > > > > > > > > > will
> > > > > > > > > > > > confuse users whether a method named `addCatalog` or
> > > > > > > `storeCatalog`
> > > > > > > > > is
> > > > > > > > > > > > added.
> > > > > > > > > > > >
> > > > > > > > > > > > And as you mentioned, the memory catalog is a
> `cache`,
> > I
> > > > > think
> > > > > > > the
> > > > > > > > > > > concept
> > > > > > > > > > > > of `cache` should not be exposed to users.
> > > > > > > > > > > >
> > > > > > > > > > > > I found there is a pre-discussion [1] for this FLIP.
> > > Please
> > > > > > > correct
> > > > > > > > > me
> > > > > > > > > > if
> > > > > > > > > > > > I'm wrong, IIUC, the conclusion of that discussion is
> > to
> > > > use
> > > > > > > > > > > > `CatalogManager` as an interface and implement it for
> > > > > different
> > > > > > > > > stores
> > > > > > > > > > > such
> > > > > > > > > > > > as memory, file and external system.
> > > > > > > > > > > >
> > > > > > > > > > > > I think there is a gap between the current FLIP
> design
> > > and
> > > > > that
> > > > > > > > > > > conclusion.
> > > > > > > > > > > > What about the proposal of the discussion in thread
> > [1] ?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > >
> > https://lists.apache.org/thread/9bnjblgd9wvrl75lkm84oo654c4lqv70
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Shammon FY
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Apr 4, 2023 at 3:41 PM yuxia <
> > > > > > > luoyu...@alumni.sjtu.edu.cn>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks Feng for driving this FLIP.
> > > > > > > > > > > > > I have few comments:
> > > > > > > > > > > > > 1: The mechanism of handling catalog with the same
> > name
> > > > > > looks a
> > > > > > > > > > little
> > > > > > > > > > > of
> > > > > > > > > > > > > complex to me. I think it'll be better to explain
> it
> > in
> > > > the
> > > > > > > java
> > > > > > > > > doc
> > > > > > > > > > of
> > > > > > > > > > > > > these methods and give a brief example in this
> FLIP.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2: TBH, the method name `addCatalog` still looks
> > > confused
> > > > > to
> > > > > > > me.
> > > > > > > > > > IIUC,
> > > > > > > > > > > > > this method is for storing catalog to CatalogStore,
> > how
> > > > > about
> > > > > > > > > > renaming
> > > > > > > > > > > it
> > > > > > > > > > > > > to `storeCatalog`? It's very personal opinion, you
> > can
> > > > > decide
> > > > > > > to
> > > > > > > > > take
> > > > > > > > > > > it
> > > > > > > > > > > > or
> > > > > > > > > > > > > not by your self.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 3: For CREATE CATALOG statement, which method will
> be
> > > > > called?
> > > > > > > > > > > > > `registerCatalog` or `addCatalog`? I'm wondering
> > > whether
> > > > > user
> > > > > > > can
> > > > > > > > > > add a
> > > > > > > > > > > > > catalog to store with SQL stement.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 3: Is it really neccessary to provide a default
> > > > > implmentation
> > > > > > > for
> > > > > > > > > > > > > interface `CatalogStoreFactory`?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 4: About asynchronous registration for catalog.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > When creating a catalog with CREATE CATALOG, the
> > > > > > asynchronous
> > > > > > > > > > > > > registration method is used by default.
> > > > > > > > > > > > > If asynchronous registration is the default
> behavior,
> > > it
> > > > > > there
> > > > > > > > any
> > > > > > > > > > way
> > > > > > > > > > > > > that user can switch to synchronous registration
> just
> > > > like
> > > > > > > > before?
> > > > > > > > > > > > > Will both method `addCatalog` and `registerCatalog`
> > be
> > > > > > > > asynchronous
> > > > > > > > > > > > > registration?
> > > > > > > > > > > > >
> > > > > > > > > > > > > IIUC, in asynchronous registration, it may well
> that
> > > > CREATE
> > > > > > > > CATALOG
> > > > > > > > > > > > > executes successfully, but then the following
> CREATE
> > > > TABLE
> > > > > > > > > statement
> > > > > > > > > > > will
> > > > > > > > > > > > > fail for the catalog fail to open.
> > > > > > > > > > > > > I think it's a break change which should be
> > highlighted
> > > > in
> > > > > > this
> > > > > > > > > FLIP,
> > > > > > > > > > > may
> > > > > > > > > > > > > be in compatibility part.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > BTW, by saying asynchronous registration, I would
> > like
> > > to
> > > > > > > expect
> > > > > > > > > > there
> > > > > > > > > > > > > will be an executor to open or register catalog in
> > the
> > > > > > > > background,
> > > > > > > > > > but
> > > > > > > > > > > > from
> > > > > > > > > > > > > your previous comments,
> > > > > > > > > > > > > "the Catalog instance will be initialized if it has
> > not
> > > > > been
> > > > > > > > > > > initialized
> > > > > > > > > > > > > yet. If the initialization process fails, these
> > > > statements
> > > > > > will
> > > > > > > > not
> > > > > > > > > > be
> > > > > > > > > > > > > executed successfully."
> > > > > > > > > > > > > It looks more like lazy initialization for catalog
> > than
> > > > > > > > > asynchronous
> > > > > > > > > > > > > registration, right?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > Yuxia
> > > > > > > > > > > > >
> > > > > > > > > > > > > ----- 原始邮件 -----
> > > > > > > > > > > > > 发件人: "Feng Jin" <jinfeng1...@gmail.com>
> > > > > > > > > > > > > 收件人: "dev" <dev@flink.apache.org>
> > > > > > > > > > > > > 发送时间: 星期一, 2023年 4 月 03日 下午 3:27:45
> > > > > > > > > > > > > 主题: Re: [DISCUSS] FLIP 295: Support persistence of
> > > > Catalog
> > > > > > > > > > > configuration
> > > > > > > > > > > > > and asynchronous registration
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi everyone, Thank you all for your interest in
> this
> > > > > DISCUSS.
> > > > > > > > > > > > >
> > > > > > > > > > > > > @Shammon
> > > > > > > > > > > > > > How to handle a catalog with the same name that
> > > exists
> > > > > for
> > > > > > > both
> > > > > > > > > of
> > > > > > > > > > > > them?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I believe this is a crucial point. Based on my
> > current
> > > > > > personal
> > > > > > > > > > > > > understanding, the Map<String, Catalog> catalogs
> will
> > > > serve
> > > > > > as
> > > > > > > a
> > > > > > > > > > cache
> > > > > > > > > > > > > for instantiated catalogs and have the highest
> > > priority.
> > > > > > > > > > > > >
> > > > > > > > > > > > > There are three methods that can affect the
> > Map<String,
> > > > > > > Catalog>
> > > > > > > > > > > > catalogs:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. registerCatalog(String catalogName, Catalog
> > catalog)
> > > > > > > > > > > > >
> > > > > > > > > > > > > This method puts the catalog instance into the
> > > > Map<String,
> > > > > > > > Catalog>
> > > > > > > > > > > > > catalogs.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. unregisterCatalog(String catalogName)
> > > > > > > > > > > > >
> > > > > > > > > > > > > This method removes the catalog instance
> > corresponding
> > > to
> > > > > > > > > catalogName
> > > > > > > > > > > > > from the Map<String, Catalog> catalogs.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 3. getCatalog(String catalogName)
> > > > > > > > > > > > >
> > > > > > > > > > > > > This method first retrieves the corresponding
> catalog
> > > > > > instance
> > > > > > > > from
> > > > > > > > > > > > > the Map<String, Catalog> catalogs. If the catalog
> > does
> > > > not
> > > > > > > exist,
> > > > > > > > > it
> > > > > > > > > > > > > retrieves the corresponding configuration from the
> > > > > > > CatalogStore,
> > > > > > > > > > > > > initializes it, and puts the initialized Catalog
> > > instance
> > > > > > into
> > > > > > > > the
> > > > > > > > > > > > > Map<String, Catalog> catalogs.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The following two methods only modify the
> > configuration
> > > > in
> > > > > > the
> > > > > > > > > > > > > CatalogStore:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. addCatalog(String catalogName, Map<String,
> String>
> > > > > > > properties)
> > > > > > > > > > > > >
> > > > > > > > > > > > > This method saves the properties to the
> catalogStore
> > > and
> > > > > > checks
> > > > > > > > > > > > > whether there is a catalogName with the same name.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. removeCatalog(String catalogName)
> > > > > > > > > > > > > This method removes the specified configuration of
> > the
> > > > > > > specified
> > > > > > > > > > > > > catalogName in the catalogStore.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The following are possible conflict scenarios:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. When the corresponding catalogName already
> exists
> > in
> > > > the
> > > > > > > > > > > > > CatalogStore but not in the Map<String, Catalog>,
> the
> > > > > > > > > > > > > registerCatalog(String catalogName, Catalog
> catalog)
> > > > method
> > > > > > can
> > > > > > > > > > > > > succeed and be directly saved to the Map<String,
> > > Catalog>
> > > > > > > > catalogs.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. When the corresponding catalogName already
> exists
> > in
> > > > > both
> > > > > > > the
> > > > > > > > > > > > > CatalogStore and the Map<String, Catalog>, the
> > > > > > > > > registerCatalog(String
> > > > > > > > > > > > > catalogName, Catalog catalog) method will fail.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 3. When the corresponding catalogName already
> exists
> > in
> > > > the
> > > > > > > > > > > > > Map<String, Catalog>, the addCatalog(String
> > > catalogName,
> > > > > > > > > Map<String,
> > > > > > > > > > > > > String> properties) method can directly save the
> > > > properties
> > > > > > to
> > > > > > > > the
> > > > > > > > > > > > > catalogStore, but the getCatalog(String
> catalogName)
> > > > method
> > > > > > > will
> > > > > > > > > not
> > > > > > > > > > > > > use the new properties for initialization because
> the
> > > > > > > > corresponding
> > > > > > > > > > > > > catalog instance already exists in catalogs and
> will
> > be
> > > > > > > > > prioritized.
> > > > > > > > > > > > > Therefore, using the unregisterCatalog(String
> > > > catalogName)
> > > > > > > method
> > > > > > > > > to
> > > > > > > > > > > > > remove the instance corresponding to the original
> > > > > catalogName
> > > > > > > is
> > > > > > > > > > > > > necessary.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > I think it will confuse users that
> > > > > `registerCatalog(String
> > > > > > > > > > > > > catalogName,Catalog catalog)` in the `Map<String,
> > > > Catalog>
> > > > > > > > > catalogs`
> > > > > > > > > > > and
> > > > > > > > > > > > > `registerCatalog(String catalogName, Map<String,
> > > String>
> > > > > > > > > properties)
> > > > > > > > > > > > >
> > > > > > > > > > > > > This could potentially lead to confusion. I suggest
> > > > > changing
> > > > > > > the
> > > > > > > > > > > > > method name, perhaps to addCatalog(String
> > catalogName,
> > > > > > > > Map<String,
> > > > > > > > > > > > > String> properties), as previously mentioned
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > @Hang
> > > > > > > > > > > > > > add `registerCatalog(String catalogName,Catalog
> > > > catalog,
> > > > > > > > > > > > > boolean lazyInit)` method
> > > > > > > > > > > > >
> > > > > > > > > > > > > Since a catalog is already an instance, adding the
> > > > > "lazyInit"
> > > > > > > > > > > > > parameter to the registerCatalog(String
> catalogName,
> > > > > Catalog
> > > > > > > > > catalog)
> > > > > > > > > > > > > method may not necessarily result in lazy
> > > initialization.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Do we need to think about encryption
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think encryption is necessary, but perhaps the
> > > > encryption
> > > > > > > logic
> > > > > > > > > > > > > should be implemented in the CatalogStore when the
> > data
> > > > is
> > > > > > > > actually
> > > > > > > > > > > > > saved. For instance, the FileCatalogStore could
> > encrypt
> > > > the
> > > > > > > data
> > > > > > > > > when
> > > > > > > > > > > > > it is saved to a file, while the MemoryCatalogStore
> > > does
> > > > > not
> > > > > > > > > require
> > > > > > > > > > > > > encryption.
> > > > > > > > > > > > >
> > > > > > > > > > > > > >  Do we really need the `MemoryCatalogStore`?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think it is necessary to have a
> MemoryCatalogStore
> > as
> > > > the
> > > > > > > > default
> > > > > > > > > > > > > implementation, which saves Catalog configurations
> in
> > > > > memory.
> > > > > > > > > > > > > Otherwise, if we want to implement asynchronous
> > loading
> > > > of
> > > > > > > > Catalog,
> > > > > > > > > > we
> > > > > > > > > > > > > would need to introduce additional cache.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > @Xianxun
> > > > > > > > > > > > >
> > > > > > > > > > > > > > What if asynchronous registration failed?
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is also a critical concern.  When executing
> DDL,
> > > > DQL,
> > > > > or
> > > > > > > DML
> > > > > > > > > > > > > statements that reference a specific Catalog, the
> > > Catalog
> > > > > > > > instance
> > > > > > > > > > > > > will be initialized if it has not been initialized
> > yet.
> > > > If
> > > > > > the
> > > > > > > > > > > > > initialization process fails, these statements will
> > not
> > > > be
> > > > > > > > executed
> > > > > > > > > > > > > successfully.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > nt. The Map<String, Catalog> catalogs and
> > > CatalogStore
> > > > > > > > > catalogstore
> > > > > > > > > > > in
> > > > > > > > > > > > > the CatalogManager all have the same catalog name,
> > but
> > > > > > > > correponding
> > > > > > > > > > to
> > > > > > > > > > > > > different catalog instances.
> > > > > > > > > > > > >
> > > > > > > > > > > > > This issue can be resolved by referring to the
> > previous
> > > > > > > > responses.
> > > > > > > > > > The
> > > > > > > > > > > > > key principle is that the Catalog that has been
> most
> > > > > recently
> > > > > > > > used
> > > > > > > > > > > > > will be given priority for subsequent use.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sun, Apr 2, 2023 at 10:58 PM Xianxun Ye <
> > > > > > > > > yesorno828...@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Feng,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for driving this Flip, I do believe this
> > Flip
> > > > > could
> > > > > > be
> > > > > > > > > > helpful
> > > > > > > > > > > > > for users. The firm I work for also manages a lot
> of
> > > > > > catalogs,
> > > > > > > > and
> > > > > > > > > > the
> > > > > > > > > > > > > submission of tasks becomes slow because of
> loading a
> > > > > number
> > > > > > of
> > > > > > > > > > > catalogs.
> > > > > > > > > > > > > We obtain the catalogs in the user's Flink SQL
> > through
> > > > > > regular
> > > > > > > > > > > expression
> > > > > > > > > > > > > to avoid loading all catalogs to improve the speed
> of
> > > > task
> > > > > > > > > > submission.
> > > > > > > > > > > > > After reading flip-295, I have some questions:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. When creating a catalog with CREATE CATALOG,
> the
> > > > > > > > asynchronous
> > > > > > > > > > > > > registration method is used by default. What if
> > > > > asynchronous
> > > > > > > > > > > registration
> > > > > > > > > > > > > failed? And how to implement asynchronous
> > registration?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. I also have the same question with Shammon’s
> > first
> > > > > > > comment.
> > > > > > > > > The
> > > > > > > > > > > > > Map<String, Catalog> catalogs and CatalogStore
> > > > catalogstore
> > > > > > in
> > > > > > > > the
> > > > > > > > > > > > > CatalogManager all have the same catalog name, but
> > > > > > correponding
> > > > > > > > to
> > > > > > > > > > > > > different catalog instances. For example:
> catalog1->
> > > > > > > jdbcCatalog
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > > catalogs, but catalog1 -> hiveCatalog in the
> > > > catalogstore.
> > > > > > > Would
> > > > > > > > > this
> > > > > > > > > > > > case
> > > > > > > > > > > > > happen? Maybe we should define `catalogs` and
> > > > > `catalogstore`
> > > > > > > more
> > > > > > > > > > > > clearly.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Xianxun
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2023年3月31日 11:13,Hang Ruan <
> > ruanhang1...@gmail.com
> > > >
> > > > > 写道:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > MemoryCatalogStore
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to