+1 on Polaris entering Apache. I'm interested in helping as a mentor if
needed.

Kent Yao

On 2024/07/31 07:02:05 ConradJam wrote:
> As members of the Amoro project, our team is thrilled to see the growing
> attention towards Amoro.
> 
> We are excited about Polaris becoming open source, as it opens up greater
> possibilities for future collaboration with the Amoro community.
> 
> Amoro focuses on data lake formats and aims to provide optimization
> services and enhancements for the lake. Our primary goal is to offer
> optimization services that support multiple table formats (though
> currently, Iceberg is the most supported), such as small file optimization,
> Z-order sorting optimization, and future index optimization.
> 
> Amoro provides both Internal Catalog and External Catalog methods to
> optimize lake tables. To gather optimization information, we have conducted
> some catalog management work.
> 
> I often hear people comparing Gravitino and Polaris as potential
> competitors to Amoro, which I think is a misconception (I noticed that some
> previous discussions about Amoro's positioning seemed unclear, so I wanted
> to clarify this).
> 
> While there might be some overlap between Amoro, Gravitino, and Polaris,
> their focuses differ:
> 
> - Gravitino focuses on unified metadata management across various areas,
> including Kafka and AI, not just on data lakes.
> - Polaris is an interoperable, open-source catalog for Apache Iceberg.
> 
> If there are any errors, please correct them.
> 
> Amoro plans to support both Polaris and Gravitino in the future.
> Additionally, the Amoro community will continue to engage with the
> Gravitino and Polaris communities to foster more collaborative efforts in
> lake optimization.
> 
> [1] Amoro docs: https://amoro.apache.org/docs/latest/
> [2] Gravitino docs: https://datastrato.ai/docs/0.5.1/
> [3] Polaris docs: https://polaris.io/
> 
> On Wed, Jul 31, 2024 at 13:22, Jack Ye <yezhao...@gmail.com> wrote:
> 
> > > What's the difference between this project and Amoro
> >
> > Here is my $0.01, please correct me if I am wrong, especially for people
> > working on Amoro and Gravitino.
> >
> > I think Apache Amoro is focused more on being a self-contained complete
> > data lakehouse management and ingestion system. It is a complete solution
> > with its own connectors in engines like Spark [1], and customized
> > mixed-format integrations in engines like Trino [2]. Polaris is mostly
> > focused on the data catalog aspect of a data lakehouse, and offers an open
> > source vendor-neutral Iceberg catalog with additional governance support.
> > By integrating with the Iceberg REST catalog interface, the intention is
> > for it to leverage Iceberg for all the engine integrations to begin with.
> > Similarly, any table management or ingestion system that works with the
> > Iceberg REST API can be plugged in to work directly with Polaris. So you
> > could imagine an Iceberg table being ingested and managed by Amoro, but
> > cataloged using Polaris.
> >
> > This does make Polaris more similar to Apache Gravitino. However, I think
> > the key difference between them is that the emphasis of Gravitino is more
> > breadth-first on aspects like multi-format, multi-catalog, multi-datasource,
> > different data catalog objects in AI [3], etc. It exposes different sets of
> > APIs for different purposes, with Iceberg REST API being a part of it for
> > the Iceberg tables, and other APIs for other data sources [4]. Polaris is
> > more depth-first on Iceberg at this moment. Our future plan does say that
> > it could extend to non-Iceberg data lakes, and there could be some overlap
> > at that time. But even then, there could be different ways to achieve such
> > support. For example, we could surface Hive Parquet tables as Iceberg
> > tables, if the Iceberg REST catalog standard can be updated to accommodate
> > that. There could also be potential collaborations between Polaris and
> > Gravitino to achieve the goal together, and I am personally pretty excited
> > about that opportunity.
> >
> > Best,
> > Jack Ye
> >
> > [1] https://amoro.apache.org/docs/latest/spark-configuration/
> > [2] https://amoro.apache.org/docs/latest/trino/#mixed-format
> > [3] https://github.com/apache/gravitino-site/blob/10a967f18730c28018e064f3ee1ddd3cc32aa506/src/components/HomepageFeatures/index.tsx#L74
> > [4] https://github.com/apache/gravitino/tree/main/catalogs
> >
> > On Tue, Jul 30, 2024 at 10:06 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > wrote:
> >
> > > Hi Manu
> > >
> > > Thanks for the details!
> > > I agree with you. As a mentor for Gravitino, I would be more than happy
> > > to connect the two podlings.
> > >
> > > Regards
> > > JB
> > >
> > > On Wed, Jul 31, 2024 at 7:00 AM Manu Zhang <owenzhang1...@gmail.com>
> > > wrote:
> > > >
> > > > AFAIK, Amoro is a management system with optimization service, catalog
> > > > service, etc. It has a built-in catalog but can also work with other
> > > > catalogs like Polaris.
> > > > I think Polaris is more comparable to Gravitino, which entered the
> > > > incubator recently. It would be interesting to see how these two
> > > > communities can collaborate.
> > > >
> > > > Regards,
> > > > Manu
> > > >
> > > >
> > > > On Wed, Jul 31, 2024 at 12:36 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > The proposal is more generic: today it's Apache Iceberg, but after the
> > > > > discussions with the initial community we agreed it could make sense
> > > > > to address other use cases.
> > > > >
> > > > > I don't know Amoro in detail, but I am happy to bridge the
> > > > > communities to work together.
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On Wed, Jul 31, 2024 at 5:16 AM Xuanwo <xua...@apache.org> wrote:
> > > > > >
> > > > > > Hi, JB
> > > > > >
> > > > > > Thank you for starting this thread; it's great to see an increasing
> > > > > > number of projects being developed around Iceberg.
> > > > > >
> > > > > > I have two questions:
> > > > > >
> > > > > > - The Polaris GitHub repo says it's "an open source catalog for
> > > > > > Apache Iceberg", but the proposal changed this to "a catalog for
> > > > > > data lakes". Does this mean Polaris's scope has changed?
> > > > > > - What's the difference between this project and Amoro:
> > > > > > https://github.com/apache/amoro? How do these two communities
> > > > > > collaborate?
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Jul 31, 2024, at 04:19, Dave Fisher wrote:
> > > > > > >> On Jul 30, 2024, at 11:34 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > > > > > >>
> > > > > > >> Hi Dave,
> > > > > > >>
> > > > > > >> That's a good question. The main reason is that we wanted people
> > > > > > >> with Apache experience in the PPMC to mentor the committers and
> > > > > > >> contributors heading to the PPMC as well.
> > > > > > >> Also, the initial committers worked closely with PPMC guidance
> > > > > > >> (explaining the ICLA, good practices, etc.).
> > > > > > >> So, we wanted the PPMC to act more as mentors (both technically
> > > > > > >> and with their Apache experience) to the committers.
> > > > > > >
> > > > > > > That makes sense. Are any of the proposed PPMC members also ASF
> > > > > > > Members and/or potentially future Mentors?
> > > > > > >
> > > > > > >> If it's problematic, we can start only with the PPMC group and
> > > > > > >> invite new committers/PPMC members during the incubation period.
> > > > > > >
> > > > > > > No problem. It will actually provide the Mentors and later the
> > > > > > > IPMC additional data to see if the PPMC is properly growing the
> > > > > > > PPMC and Committer base.
> > > > > > >
> > > > > > > Best,
> > > > > > > Dave
> > > > > > >
> > > > > > >>
> > > > > > >> Regards
> > > > > > >> JB
> > > > > > >>
> > > > > > >> On Tue, Jul 30, 2024 at 8:19 PM Dave Fisher <w...@apache.org> wrote:
> > > > > > >>>
> > > > > > >>> Hi JB,
> > > > > > >>>
> > > > > > >>> An interesting project that looks pretty mature.
> > > > > > >>>
> > > > > > >>> I’m curious about the split between Initial PPMC and initial
> > > > > > >>> Committer. In the usual case a new podling will have all of the
> > > > > > >>> Initial Committers on the PPMC. Can you tell us why this is not
> > > > > > >>> the case with Polaris?
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> Dave
> > > > > > >>>
> > > > > > >>>> On Jul 30, 2024, at 10:33 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > > > > > >>>>
> > > > > > >>>> Hi folks,
> > > > > > >>>>
> > > > > > >>>> We would like to propose a new project to the ASF incubator: Polaris.
> > > > > > >>>>
> > > > > > >>>> Polaris is a catalog for data lakes. It provides new levels of
> > > > > > >>>> choice, flexibility and control over data, with full enterprise
> > > > > > >>>> security and Apache Iceberg interoperability across a multitude
> > > > > > >>>> of engines and infrastructure. Polaris builds on standards such
> > > > > > >>>> as those created by Apache Iceberg, providing the following
> > > > > > >>>> benefits for the ecosystem:
> > > > > > >>>> * Multi-engine interoperability over a single copy of data,
> > > > > > >>>> eliminating the need for moving and copying data across
> > > > > > >>>> different engines and catalogs.
> > > > > > >>>> * An interoperable security model providing a unified
> > > > > > >>>> authorization layer independent from the engines processing
> > > > > > >>>> analytical tables.
> > > > > > >>>> * For multi-catalog scenarios, a unified catalog level view of
> > > > > > >>>> data across multiple catalogs via catalog notification
> > > > > > >>>> integrations.
> > > > > > >>>> * The ability to host Polaris Catalog on the infrastructure of
> > > > > > >>>> your choice.
> > > > > > >>>>
> > > > > > >>>> Here is the proposal:
> > > > > > >>>>
> > > > > > >>>> https://cwiki.apache.org/confluence/display/INCUBATOR/PolarisProposal
> > > > > > >>>>
> > > > > > >>>> Comments and feedback are welcome.
> > > > > > >>>>
> > > > > > >>>> Thanks!
> > > > > > >>>> Regards
> > > > > > >>>> JB
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> ---------------------------------------------------------------------
> > > > > > >>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > > > > >>>> For additional commands, e-mail: general-h...@incubator.apache.org
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Xuanwo
> > > > > >
> > > > > > https://xuanwo.io/
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > >
> > >
> > >
> >
> 
> 
> -- 
> Best
> 
> ConradJam
> 

