There is a reference implementation in the project, in the CatalogHandlers
class. It implements REST requests using a catalog and returns REST
responses. I believe this is what Gravitino relies on, and I mentioned it
above in the discussion about whether we should have a catalog service.

Catalog tests also use catalog handlers, but use a simple HTTP wrapper to
test the HTTP client. There is also a test class that accepts HTTP calls
directly and also runs JSON serialization on requests and responses.

So far, the Iceberg community has avoided maintaining a service, because
doing so brings in a lot of complications. We’ve preferred to remain focused
on providing a library that can be used to wire up something like a REST
catalog, but not to provide a runtime service.

Ryan

On Thu, Feb 29, 2024 at 2:59 AM Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi Ajantha,
>
> Thanks for sharing your thoughts.
>
> It makes sense for Gravitino to be a TLP (after the incubation period)
> because Gravitino is "more" than an Iceberg catalog. It implements the
> Iceberg REST Catalog API, but it's also a metadata catalog/repo with
> additional features.
>
> That said, I agree with what you said:
> 1. We have the openapi yaml in the Iceberg project, but no reference
> implementation in the project itself. I think REST Catalog is a good
> approach as a "central" Catalog API because any Iceberg engine/layer
> could use this API (even if written in Python, rust, go, whatever),
> and it allows new use cases (like easily move data from an engine to
> another as the catalog API would be the same).
> 2. From an ASF standpoint, I would not talk about "subproject" but
> more repositories. The reason is that, in terms of governance, it's
> still the Iceberg project (a PMC member or committer has the same
> permission on all repositories in the Iceberg project; it's not
> possible to have a committer only on iceberg-rust, for instance).
> Generally speaking, we should limit the number of subprojects.
> 3. I think it would be fair to have REST Catalog resources (openapi
> yaml + a ref impl) in a iceberg-catalog repository.
> 4. However, it's important to have a more global discussion within the
> community about Iceberg 2.0 and the roadmap about catalogs: do we
> deprecate Iceberg Java Catalog API in favor of the REST Catalog API?
> What do we do with the existing catalogs? etc. I think it's a fair
> discussion to have for Iceberg 2.0.
>
> It's an important discussion, community driven.
>
> Regards
> JB
>
> On Thu, Feb 29, 2024 at 9:44 AM Ajantha Bhat <ajanthab...@gmail.com>
> wrote:
> >
> > I apologize for the delay in responding.
> >
> > I'm pleased to see the development of an open-source REST catalog
> implementation, and the potential transition of Gravitino to an ASF project
> is certainly promising.
> > But the REST catalog server implementation will be a small part of the
> Gravitino ASF project, which has many other things along with the catalog.
> >
> > While I understand Iceberg's focus on the table format specification and
> its implementation,
> > I would like to propose the creation of a sub-project for the REST
> catalog server implementation under the Iceberg repository (similar to
> pyiceberg, iceberg-rust, etc.).
> > This suggestion is based on several reasons:
> >
> > - Every time we make a change to the REST spec, there is no reference
> implementation to refer to or modify.
> > - Many companies such as AWS, Apple, Tabular, and Datastrato are each
> implementing their own REST servers.
> > - Consolidating efforts within a sub-project could lead to efficiency
> gains and potential collaboration opportunities.
> > - From the perspective of open-source users, the absence of an open-source
> implementation for the REST catalog within Iceberg may be inconvenient or
> frustrating.
> >
> > I believe creating a dedicated sub-project would address these concerns
> and enhance the overall usability and collaborative nature of the Iceberg
> ecosystem.
> > I also think we can have a sub-project for kafka-connect and iceberg
> tools (delta converter, catalog migrator, etc.), as they need not
> depend on the Iceberg release cycle
> > and they are independent of table format spec.
> >
> > Let me know your thoughts on this. I can open a separate thread for
> discussion if required.
> >
> > - Ajantha
> >
> >
> > On Wed, Jan 31, 2024 at 5:32 AM Jack Ye <yezhao...@gmail.com> wrote:
> >>
> >> +1 for using test-jar!
> >>
> >> -Jack
> >>
> >> On Fri, Jan 26, 2024 at 10:48 AM Ryan Blue <b...@tabular.io> wrote:
> >>>
> >>> I think I'd be fine exposing this through a test Jar, but it seems to
> me that if we were to put it into a normal package it would turn into the
> situation we want to avoid. People would use it for unintended purposes and
> it would become a distraction.
> >>>
> >>> What do you think about using the tests Jar for this?
> >>>
> >>> On Thu, Jan 25, 2024 at 12:48 PM Jack Ye <yezhao...@gmail.com> wrote:
> >>>>
> >>>> Yes, sorry I did not make it clear, I also agree it is not the right
> direction to invest a lot of community effort. I am more talking about
> casual use cases like importing a server for unit tests outside Iceberg,
> running some local debugging, etc. I think it would be valuable to provide
> a server in Iceberg for that purpose, and maybe vend it as test utils.
> Thoughts?
> >>>>
> >>>> -Jack
> >>>>
> >>>> On Thu, Jan 25, 2024 at 11:35 AM Ryan Blue <b...@tabular.io> wrote:
> >>>>>
> >>>>> > I know we have the RESTCatalogAdapter and RESTCatalogServlet for
> unit tests, and technically we have a very similar Jetty server
> implementation in TestRESTCatalog. Should we think about making those
> components out of the tests into an iceberg-rest-server module for this use
> case, and merge with the implementation that Gravitino has?
> >>>>>
> >>>>> I think that this would take the Iceberg project in the wrong
> direction. Iceberg has always been a library and I think it should continue
> to be. Concerns about runtime should be left to other projects that need to
> fit into existing infrastructure or skillsets of people maintaining them.
> The question of whether to use Jetty or Tomcat or whatever else is a
> serious consideration, as is how to monitor that application and send
> metrics. I think it would slow down the core purpose of Iceberg if we got
> distracted by these things.
> >>>>>
> >>>>> In fact, I think that this project shows that the library is getting
> the balance right: it is using `CatalogHandlers` for their intended
> purpose. It has opinions about how to run the actual HTTP service and
> people that agree can use it. Other people could use `CatalogHandlers` to
> build on a different foundation.
> >>>>>
> >>>>> On Thu, Jan 25, 2024 at 11:13 AM Jack Ye <yezhao...@gmail.com>
> wrote:
> >>>>>>
> >>>>>> Really cool project!
> >>>>>>
> >>>>>> I browsed a bit of the codebase, and see this implementation of the
> REST service backend:
> >>>>>> -
> https://github.com/datastrato/gravitino/blob/main/catalogs/catalog-lakehouse-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/IcebergRESTService.java#L39
> >>>>>> -
> https://github.com/datastrato/gravitino/blob/main/catalogs/catalog-lakehouse-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/ops/IcebergTableOps.java#L42-L51
> >>>>>>
> >>>>>>  Looks like it is initializing a Jetty server that uses
> CatalogHandlers to delegate the execution to a specific Java Catalog
> implementation.
> >>>>>>
> >>>>>> I think this is actually something that is lacking today in
> Iceberg, which is an easy way for users to start an actual REST HTTP server.
> >>>>>>
> >>>>>> I know we have the RESTCatalogAdapter and RESTCatalogServlet for
> unit tests, and technically we have a very similar Jetty server
> implementation in TestRESTCatalog. Should we think about making those
> components out of the tests into an iceberg-rest-server module for this use
> case, and merge with the implementation that Gravitino has?
> >>>>>>
> >>>>>> Best,
> >>>>>> Jack Ye
> >>>>>>
> >>>>>> On Thu, Jan 25, 2024 at 10:47 AM Yufei Gu <flyrain...@gmail.com>
> wrote:
> >>>>>>>
> >>>>>>> Thanks Justin for the sharing.
> >>>>>>>
> >>>>>>> It's pretty cool to see an open source REST catalog implementation
> in action. Having dabbled a bit in the early development of Gravitino
> myself, I'm really excited about its potential with the Iceberg REST
> catalog.
> >>>>>>>
> >>>>>>> The idea of Gravitino moving to an ASF project is promising. It’ll
> surely boost its visibility and open up more doors for collaboration and
> adoption.
> >>>>>>>
> >>>>>>> Looking forward to where this goes. Keep up the fantastic work!
> >>>>>>>
> >>>>>>> Yufei
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, Jan 25, 2024 at 5:55 AM Jean-Baptiste Onofré <
> j...@nanthrax.net> wrote:
> >>>>>>>>
> >>>>>>>> Hi Justin,
> >>>>>>>>
> >>>>>>>> I talked with Junping a couple of months ago about Gravitino.
> Thanks
> >>>>>>>> for sharing !
> >>>>>>>>
> >>>>>>>> Regards
> >>>>>>>> JB
> >>>>>>>>
> >>>>>>>> On Thu, Jan 25, 2024 at 12:15 AM Justin Mclean <
> jus...@classsoftware.com> wrote:
> >>>>>>>> >
> >>>>>>>> > Hi,
> >>>>>>>> >
> >>>>>>>> > We open-sourced a new project, Gravitino, in December and have
> been working on growing the community and adding new functionality. We plan
> to donate the project to the ASF this year. Gravitino is a unified metadata
> lake solution offering a unified approach to managing datasets from diverse
> sources and regions across multiple cloud platforms. Its core is an Iceberg
> REST catalog service implementation to manage Iceberg tables efficiently.
> >>>>>>>> >
> >>>>>>>> > If this sounds like something you would be interested in, then
> the following resources will help:
> >>>>>>>> > -  Blog post:
> https://datastrato.ai/blog/gravitino-iceberg-rest-catalog-service/
> >>>>>>>> > -  Gravitino documentation: https://datastrato.ai/docs/0.3.1/
> >>>>>>>> > -  Iceberg REST service documentation:
> https://datastrato.ai/docs/0.3.1/iceberg-rest-service
> >>>>>>>> >
> >>>>>>>> > We welcome any feedback and suggestions you have, and as
> always, all contributions are welcome. You can find the source code at
> https://github.com/datastrato/gravitino.
> >>>>>>>> >
> >>>>>>>> > Kind Regards,
> >>>>>>>> > Justin
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Ryan Blue
> >>>>> Tabular
> >>>
> >>>
> >>>
> >>> --
> >>> Ryan Blue
> >>> Tabular
>


-- 
Ryan Blue
Tabular
