Re: Gravitino an Iceberg REST catalog service

Jean-Baptiste Onofré Fri, 01 Mar 2024 22:30:38 -0800

Hi Jack,

I agree and it's my fault :)


You are right, we need a dev ml thread on this topic (I sent a message
a couple of weeks ago to start the discussion about Iceberg 2.0, the
idea was to start a thread on each topic). Definitely it has to be
discussed on the mailing list to include anyone in the community (we
can have meetings but need to send meeting notes on the mailing list
to give a chance to anyone to participate).

Thanks !
Regards
JB

On Sat, Mar 2, 2024 at 12:47 AM Jack Ye <[email protected]> wrote:
>
> Could we have a separate devlist thread dedicated to this discussion? It is
> a bit awkward to continue this critical Iceberg 2.0 catalog consolidation 
> topic under this Gravitino thread, although I understand it is related. At 
> least I have overlooked all these discussions until now, and I feel there are 
> many other people like me.
>
> -Jack
>
> On Fri, Mar 1, 2024 at 3:09 PM Ryan Blue <[email protected]> wrote:
>>
>> To clarify, what I meant was that Iceberg has, so far, avoided providing 
>> runtime services that are ready to be deployed and used. I still think that 
>> is a good choice, and I agree with the reasons that Renjie listed.
>>
>> I disagree that this is inconsistent. We don't supply any of the other 
>> services that catalog implementations use. As Fokko pointed out, the JDBC 
>> catalog client talks to a database, the Hive catalog talks to a Hive 
>> Metastore, and the Nessie catalog talks to a Nessie deployment. Similarly, 
>> the REST catalog is a client. We have CatalogHandlers, which is a reference
>> implementation of the REST service and catalog logic.
>>
>> The main thing that we don't provide is a deployable runtime REST catalog 
>> service. We may choose to add one in order to make it easier to move to the 
>> REST client, but I'm not confident that is the right choice vs encouraging 
>> other projects.
>>
>> On Fri, Mar 1, 2024 at 4:43 AM Jean-Baptiste Onofré <[email protected]> 
>> wrote:
>>>
>>> Hey Brian
>>>
>>> Thanks for the summary ! Good one !
>>>
>>> I would just add the "REST ref impl" discussion.
>>>
>>> Regarding the anti patterns, I agree with the lists, imho, some are
>>> more "opinionated implementation", so definitely not in the API scope.
>>> +1
>>>
>>> Thanks again !
>>>
>>> Regards
>>> JB
>>>
>>>
>>> On Fri, Mar 1, 2024 at 1:15 PM Brian Olsen <[email protected]> wrote:
>>> >
>>> > My attempt to consolidate a list of goals, anti patterns, and impl
>>> > details mentioned since this discussion was brought up at the last 
>>> > Iceberg sync. Tried to roughly capture who mentioned these things so we 
>>> > can follow up if needed. Hopefully this can serve as a basis for the 
>>> > design discussion.
>>> >
>>> > Goals:
>>> >
>>> > - Remove the initial burden of choice of which REST implementation from
>>> > new users getting started with Iceberg (Russell S)
>>> > - Cut down on the supported catalogs that are no longer in use (e.g.
>>> > DynamoDB) or never intended for production (e.g. Hadoop) to minimize
>>> > maintenance, lower variability, and lower the burden of choice on Iceberg
>>> > users. (Blue)
>>> > - Simplify plugging in your own catalog so the Iceberg project isn’t
>>> > responsible for maintaining and testing a bunch of dialects. (Blue)
>>> > - Aim for a REST catalog centric future and continue to remove Iceberg
>>> > support where it makes sense. (Russell/Jack Ye/Blue)
>>> > - Use this as a test dependency for the Iceberg project (Jack/Russell)
>>> > - Make this an MVP production grade catalog, assuming that whatever we do
>>> > put out there will end up being used as production anyways. (Blue/Dan)
>>> > - Keep the responsibilities of the REST implementation as light as possible.
>>> > (Blue)
>>> > - Support HTTP(S) protocol, the service will act as a load balancer +
>>> > proxy to the JDBC backend. (Blue)
>>> > - Container image + k8s installation (Blue)
>>> > - Use for Iceberg education and evaluation (Bits)
>>> > - Use as a blueprint for designing your own implementation (JB)
>>> >
>>> > Anti patterns:
>>> >
>>> > - Avoid becoming the Hive Metastore project, where we support every use 
>>> > case.
>>> > - Don’t support data governance cases like lineage. (Dan)
>>> > - Don’t support metrics reporting. (Blue/Dan)
>>> > - Don’t support security. (Blue)
>>> > - Don’t support a wide range of protocols outside of HTTP(S) (Dan)
>>> > - In general, avoid spending time integrating with whatever runtime a 
>>> > given company uses that removes focus from the core project goals and 
>>> > spec. (Dan)
>>> > - Don’t be overly opinionated with tool choices. (Dan)
>>> >
>>> > Implementation ideas:
>>> >
>>> > - apache/iceberg-catalog repository, with all of the catalog impls moved 
>>> > and maintained there as well. (Blue/Dan/Jack/JB/Russell)
>>> > - A catalog implementation per JDBC backend. (Blue)
>>> > - Servlet like Tomcat or Spring to run / package the service. (Blue)
>>> >
>>> > On Fri, Mar 1, 2024 at 2:54 AM Jean-Baptiste Onofré <[email protected]> 
>>> > wrote:
>>> >>
>>> >> Hi Renjie,
>>> >>
>>> >> maybe I wasn't clear, sorry about that: the target is really both ref
>>> >> impl (where we can test different Iceberg parts like we do with the
>>> >> InMemoryCatalog, JdbcCatalog, etc) and ready to go service for users
>>> >> (simple but to start with).
>>> >>
>>> >> But we can't prevent the community from working on a production grade
>>> >> catalog. The point is: if it's not in Iceberg, then it's going to be
>>> >> elsewhere (another ASF project, vendor project, whatever). This is OK
>>> >> as long as we have a reference implementation in Iceberg. That's the
>>> >> min we should guarantee imho.
>>> >> For instance, for the JAXRS spec, the ref implementation is CXF-RS,
>>> >> but there are other implementations. The same for OSGi Blueprint, the
>>> >> ref implementation is in Apache Aries (aries-blueprint).
>>> >>
>>> >> My proposal is really a simple ref impl in Iceberg (submodule or
>>> >> separate repo, both are OK for me even if I have a preference for
>>> >> separate repo to keep things clean and different lifecycle as we do
>>> >> for iceberg-rust or iceberg-python).
>>> >>
>>> >> That said, I don't see why we could not have iceberg-catalog repo with
>>> >> a ref impl that evolves to something production ready. Observability,
>>> >> scaling, pluggable backend, etc can be implemented there and it would
>>> >> be a great addition for Iceberg with new contributors from the
>>> >> community I'm sure. A separate repo would make this doable imho,
>>> >> with Iceberg still focused on the spec.
>>> >>
>>> >> Regards
>>> >> JB
>>> >>
>>> >> On Fri, Mar 1, 2024 at 9:24 AM Renjie Liu <[email protected]> 
>>> >> wrote:
>>> >> >
>>> >> > Hi:
>>> >> >
>>> >> > I think one thing missing in the discussion is that, if the iceberg 
>>> >> > community wants to maintain a rest catalog service, what's the target 
>>> >> > use case? Different target use cases may lead to different directions.
>>> >> >
>>> >> > If it's mainly designed for first-time users to play or experiment
>>> >> > with the REST catalog, then maybe a submodule in the Java repo or
>>> >> > a test-jar would be enough.
>>> >> >
>>> >> > If it's targeted toward production usage, things get complicated. 
>>> >> > There are too many things to think about, such as using different 
>>> >> > storage backends, monitoring, HA, scalability, etc. What's more, in an
>>> >> > enterprise, an Iceberg REST catalog is usually only part of a data
>>> >> > platform, and there are many other things involved. In this case, I'm
>>> >> > skeptical about the actual value of a rest catalog server, and a spec 
>>> >> > or a library would be more valuable.
>>> >> >
>>> >> > On Fri, Mar 1, 2024 at 3:49 PM Jean-Baptiste Onofré 
>>> >> > <[email protected]> wrote:
>>> >> >>
>>> >> >> Hi Fokko
>>> >> >>
>>> >> >> If service means the actual runtime service, I partially agree.
>>> >> >>
>>> >> >> I would love to see REST Catalog API the "central cornerstone" used in
>>> >> >> iceberg-java, pyiceberg, etc. So I think we should provide the
>>> >> >> resources for a user to bootstrap a REST Catalog ref impl.
>>> >> >> A lot of Apache projects provide both specs and runtime (for some
>>> >> >> part): Apache Camel, Apache ActiveMQ, Apache Karaf, Apache Kafka, ...
>>> >> >> That's why it would make sense to have it in a separate Iceberg repo
>>> >> >> (iceberg-catalog) to keep the Iceberg main repo focused on spec.
>>> >> >> Iceberg would need both spec and simple runtime for ref impl. It would
>>> >> >> be a bit "hypocritical" (to our users :)) to say we have the spec but not
>>> >> >> impl. It's like you have Iceberg spec but no Spark or Flink
>>> >> >> extensions.
>>> >> >> Imagine Apache ActiveMQ says we have JMS 3.0 support but no 
>>> >> >> runtime/service :)
>>> >> >>
>>> >> >> That's my $0.02, but if we want to promote the REST Catalog (and I
>>> >> >> think it's a good approach), then Iceberg should provide a ref impl
>>> >> >> ready to run (without preventing other impl of course).
>>> >> >>
>>> >> >> Regards
>>> >> >> JB
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> On Fri, Mar 1, 2024 at 8:13 AM Fokko Driesprong <[email protected]> 
>>> >> >> wrote:
>>> >> >> >
>>> >> >> > Hey everyone,
>>> >> >> >
>>> >> >> > Thanks for raising this. I think a test-jar would be a great first 
>>> >> >> > step.
>>> >> >> >
>>> >> >> > We already maintain "service" considering JDBC, Hive, etc catalogs. 
>>> >> >> > REST Catalog ref impl in Iceberg would be the same.
>>> >> >> >
>>> >> >> >
>>> >> >> > What I think Ryan means by a service is having to maintain Postgres 
>>> >> >> > (JDBC backend), Hive Metastore (Hive backend), etc. There is a lot 
>>> >> >> > to it to properly scale these backends.
>>> >> >> >
>>> >> >> > For PyIceberg we decided to build the examples backed by the
>>> >> >> > SqlCatalog. This can be either in memory or on a local disk (SQLite);
>>> >> >> > of course, it has limited parallelism, but it makes it easy to give
>>> >> >> > Iceberg a try. One of the main motivations for doing it this way
>>> >> >> > was that it doesn't require any additional services. Running
>>> >> >> > additional services would require having JRE/Docker/etc. installed
>>> >> >> > and potentially also an RDBMS backend to persist the data.
>>> >> >> >
>>> >> >> > Kind regards,
>>> >> >> > Fokko
>>> >> >> >
>>> >> >> >
>>> >> >> > On Fri, Mar 1, 2024 at 7:34 AM Jean-Baptiste Onofré
>>> >> >> > <[email protected]> wrote:
>>> >> >> >>
>>> >> >> >> Hi Ryan
>>> >> >> >>
>>> >> >> >> If we plan to reduce the number of catalogs (and I think it makes
>>> >> >> >> sense and I'm with you on that), we will need an impl/service in
>>> >> >> >> Iceberg for the REST Catalog API, else the users won't be able to 
>>> >> >> >> use
>>> >> >> >> Iceberg "out of the box".
>>> >> >> >> We already maintain "service" considering JDBC, Hive, etc catalogs.
>>> >> >> >> REST Catalog ref impl in Iceberg would be the same.
>>> >> >> >>
>>> >> >> >> So, in order to promote the REST Catalog API as the Catalog 
>>> >> >> >> "unique"
>>> >> >> >> façade for Iceberg, I would be in favor of having a simple REST
>>> >> >> >> service in Iceberg.
>>> >> >> >> It would be the entry point for Iceberg users and they can use 
>>> >> >> >> other
>>> >> >> >> REST catalogs depending on their needs (Gravitino, Tabular, ...).
>>> >> >> >>
>>> >> >> >> Regards
>>> >> >> >> JB
>>> >> >> >>
>>> >> >> >> On Fri, Mar 1, 2024 at 1:28 AM Ryan Blue <[email protected]> wrote:
>>> >> >> >> >
>>> >> >> >> > There is a reference implementation in the project, in the 
>>> >> >> >> > CatalogHandlers class. That implements REST requests using a 
>>> >> >> >> > catalog and returns REST responses. I believe this is what 
>>> >> >> >> > Gravitino relies on and I mentioned it above in the discussion
>>> >> >> >> > about whether we should have a catalog service.
>>> >> >> >> >
>>> >> >> >> > Catalog tests also use catalog handlers, but use a simple HTTP 
>>> >> >> >> > wrapper to test the HTTP client. There is also a test class that 
>>> >> >> >> > accepts HTTP calls directly and also runs JSON serialization on 
>>> >> >> >> > requests and responses.
>>> >> >> >> >
>>> >> >> >> > So far, the Iceberg community has avoided maintaining a service. 
>>> >> >> >> > That brings in a lot of complications. So far, we’ve preferred 
>>> >> >> >> > to remain focused on providing a library that can be used to 
>>> >> >> >> > wire up something like a REST catalog, but not provide a runtime 
>>> >> >> >> > service.
>>> >> >> >> >
>>> >> >> >> > Ryan
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > On Thu, Feb 29, 2024 at 2:59 AM Jean-Baptiste Onofré 
>>> >> >> >> > <[email protected]> wrote:
>>> >> >> >> >>
>>> >> >> >> >> Hi Ajantha,
>>> >> >> >> >>
>>> >> >> >> >> Thanks for sharing your thoughts.
>>> >> >> >> >>
>>> >> >> >> >> It makes sense for Gravitino to be a TLP (after the incubation 
>>> >> >> >> >> period)
>>> >> >> >> >> because Gravitino is "more" than an Iceberg catalog. It 
>>> >> >> >> >> implements the
>>> >> >> >> >> Iceberg REST Catalog API, but it's also a metadata catalog/repo 
>>> >> >> >> >> with
>>> >> >> >> >> additional features.
>>> >> >> >> >>
>>> >> >> >> >> That said, I agree with what you said:
>>> >> >> >> >> 1. We have the openapi yaml in the Iceberg project, but no 
>>> >> >> >> >> reference
>>> >> >> >> >> implementation in the project itself. I think REST Catalog is a 
>>> >> >> >> >> good
>>> >> >> >> >> approach as a "central" Catalog API because any Iceberg 
>>> >> >> >> >> engine/layer
>>> >> >> >> >> could use this API (even if written in Python, rust, go, 
>>> >> >> >> >> whatever),
>>> >> >> >> >> and it allows new use cases (like easily moving data from an
>>> >> >> >> >> engine to
>>> >> >> >> >> another as the catalog API would be the same).
>>> >> >> >> >> 2. From an ASF standpoint, I would not talk about "subproject" 
>>> >> >> >> >> but
>>> >> >> >> >> more repositories. The reason is because in terms of 
>>> >> >> >> >> governance, it's
>>> >> >> >> >> still the Iceberg project (PMC member or committer has the same
>>> >> >> >> >> permission on all repositories in the Iceberg project, it's not
>>> >> >> >> >> possible to have a committer only on iceberg-rust for instance).
>>> >> >> >> >> Generally speaking, we should limit the number of subprojects.
>>> >> >> >> >> 3. I think it would be fair to have REST Catalog resources 
>>> >> >> >> >> (openapi
>>> >> >> >> >> yaml + a ref impl) in an iceberg-catalog repository.
>>> >> >> >> >> 4. However, it's important to have a more global discussion
>>> >> >> >> >> within the
>>> >> >> >> >> community about Iceberg 2.0 and the roadmap about catalogs: do 
>>> >> >> >> >> we
>>> >> >> >> >> deprecate Iceberg Java Catalog API in favor of the REST Catalog 
>>> >> >> >> >> API ?
>>> >> >> >> >> What do we do with the existing catalogs ? etc. I think it's a 
>>> >> >> >> >> fair
>>> >> >> >> >> discussion to have for Iceberg 2.0.
>>> >> >> >> >>
>>> >> >> >> >> It's an important discussion, community driven.
>>> >> >> >> >>
>>> >> >> >> >> Regards
>>> >> >> >> >> JB
>>> >> >> >> >>
>>> >> >> >> >> On Thu, Feb 29, 2024 at 9:44 AM Ajantha Bhat 
>>> >> >> >> >> <[email protected]> wrote:
>>> >> >> >> >> >
>>> >> >> >> >> > I apologize for the delay in responding.
>>> >> >> >> >> >
>>> >> >> >> >> > I'm pleased to see the development of an open-source REST 
>>> >> >> >> >> > catalog implementation, and the potential transition of 
>>> >> >> >> >> > Gravitino to an ASF project is certainly promising.
>>> >> >> >> >> > But the REST catalog server implementation will be a small part
>>> >> >> >> >> > of the Gravitino ASF project, which has many other things along
>>> >> >> >> >> > with the catalog.
>>> >> >> >> >> >
>>> >> >> >> >> > While I understand Iceberg's focus on the table format 
>>> >> >> >> >> > specification and its implementation,
>>> >> >> >> >> > I would like to propose the creation of a sub-project for the 
>>> >> >> >> >> > REST catalog server implementation under the Iceberg 
>>> >> >> >> >> > repository (similar to pyiceberg, iceberg-rust, etc.).
>>> >> >> >> >> > This suggestion is based on several reasons:
>>> >> >> >> >> >
>>> >> >> >> >> > - Every time we make a change to the REST spec, there is no
>>> >> >> >> >> > reference implementation to refer to or modify.
>>> >> >> >> >> > - Many companies such as AWS, Apple, Tabular, and Datastrato
>>> >> >> >> >> > are each implementing their own REST servers.
>>> >> >> >> >> > Consolidating efforts within a sub-project could lead to
>>> >> >> >> >> > efficiency gains and potential collaboration opportunities.
>>> >> >> >> >> > - From the perspective of open-source users, the absence of an
>>> >> >> >> >> > open-source implementation for the REST catalog within
>>> >> >> >> >> > Iceberg may be inconvenient or frustrating.
>>> >> >> >> >> >
>>> >> >> >> >> > I believe creating a dedicated sub-project would address 
>>> >> >> >> >> > these concerns and enhance the overall usability and 
>>> >> >> >> >> > collaborative nature of the Iceberg ecosystem.
>>> >> >> >> >> > I also think we can have a sub-project for kafka-connect and 
>>> >> >> >> >> > iceberg tools (delta converter, catalog migrator etc) as they 
>>> >> >> >> >> > need not depend on the Iceberg release cycle
>>> >> >> >> >> > and they are independent of table format spec.
>>> >> >> >> >> >
>>> >> >> >> >> > Let me know your thoughts on this. I can open a separate 
>>> >> >> >> >> > thread for discussion if required.
>>> >> >> >> >> >
>>> >> >> >> >> > - Ajantha
>>> >> >> >> >> >
>>> >> >> >> >> >
>>> >> >> >> >> > On Wed, Jan 31, 2024 at 5:32 AM Jack Ye <[email protected]> 
>>> >> >> >> >> > wrote:
>>> >> >> >> >> >>
>>> >> >> >> >> >> +1 for using test-jar!
>>> >> >> >> >> >>
>>> >> >> >> >> >> -Jack
>>> >> >> >> >> >>
>>> >> >> >> >> >> On Fri, Jan 26, 2024 at 10:48 AM Ryan Blue <[email protected]> 
>>> >> >> >> >> >> wrote:
>>> >> >> >> >> >>>
>>> >> >> >> >> >>> I think I'd be fine exposing this through a test Jar, but 
>>> >> >> >> >> >>> it seems to me that if we were to put it into a normal 
>>> >> >> >> >> >>> package it would turn into the situation we want to avoid. 
>>> >> >> >> >> >>> People would use it for unintended purposes and it would 
>>> >> >> >> >> >>> become a distraction.
>>> >> >> >> >> >>>
>>> >> >> >> >> >>> What do you think about using the tests Jar for this?
>>> >> >> >> >> >>>
>>> >> >> >> >> >>> On Thu, Jan 25, 2024 at 12:48 PM Jack Ye 
>>> >> >> >> >> >>> <[email protected]> wrote:
>>> >> >> >> >> >>>>
>>> >> >> >> >> >>>> Yes, sorry I did not make it clear, I also agree it is not 
>>> >> >> >> >> >>>> the right direction to invest a lot of community effort. I 
>>> >> >> >> >> >>>> am more talking about casual use cases like importing a 
>>> >> >> >> >> >>>> server for unit tests outside Iceberg, running some local 
>>> >> >> >> >> >>>> debugging, etc. I think it would be valuable to provide a 
>>> >> >> >> >> >>>> server in Iceberg for that purpose, and maybe vend it as 
>>> >> >> >> >> >>>> test utils. Thoughts?
>>> >> >> >> >> >>>>
>>> >> >> >> >> >>>> -Jack
>>> >> >> >> >> >>>>
>>> >> >> >> >> >>>> On Thu, Jan 25, 2024 at 11:35 AM Ryan Blue 
>>> >> >> >> >> >>>> <[email protected]> wrote:
>>> >> >> >> >> >>>>>
>>> >> >> >> >> >>>>> > I know we have the RESTCatalogAdapter and 
>>> >> >> >> >> >>>>> > RESTCatalogServlet for unit tests, and technically we
>>> >> >> >> >> >>>>> > have a very similar Jetty server implementation in 
>>> >> >> >> >> >>>>> > TestRESTCatalog. Should we think about making those 
>>> >> >> >> >> >>>>> > components out of the tests into an iceberg-rest-server 
>>> >> >> >> >> >>>>> > module for this use case, and merge with the 
>>> >> >> >> >> >>>>> > implementation that Gravitino has?
>>> >> >> >> >> >>>>>
>>> >> >> >> >> >>>>> I think that this would take the Iceberg project in the 
>>> >> >> >> >> >>>>> wrong direction. Iceberg has always been a library and I 
>>> >> >> >> >> >>>>> think it should continue to be. Concerns about runtime 
>>> >> >> >> >> >>>>> should be left to other projects that need to fit into 
>>> >> >> >> >> >>>>> existing infrastructure or skillsets of people 
>>> >> >> >> >> >>>>> maintaining them. The question of whether to use Jetty or 
>>> >> >> >> >> >>>>> Tomcat or whatever else is a serious consideration, as is 
>>> >> >> >> >> >>>>> how to monitor that application and send metrics. I think 
>>> >> >> >> >> >>>>> it would slow down the core purpose of Iceberg if we got 
>>> >> >> >> >> >>>>> distracted by these things.
>>> >> >> >> >> >>>>>
>>> >> >> >> >> >>>>> In fact, I think that this project shows that the library 
>>> >> >> >> >> >>>>> is getting the balance right: it is using 
>>> >> >> >> >> >>>>> `CatalogHandlers` for their intended purpose. It has 
>>> >> >> >> >> >>>>> opinions about how to run the actual HTTP service and 
>>> >> >> >> >> >>>>> people that agree can use it. Other people could use 
>>> >> >> >> >> >>>>> `CatalogHandlers` to build on a different foundation.
>>> >> >> >> >> >>>>>
>>> >> >> >> >> >>>>> On Thu, Jan 25, 2024 at 11:13 AM Jack Ye 
>>> >> >> >> >> >>>>> <[email protected]> wrote:
>>> >> >> >> >> >>>>>>
>>> >> >> >> >> >>>>>> Really cool project!
>>> >> >> >> >> >>>>>>
>>> >> >> >> >> >>>>>> I browsed a bit of the codebase, and see this 
>>> >> >> >> >> >>>>>> implementation of the REST service backend:
>>> >> >> >> >> >>>>>> - 
>>> >> >> >> >> >>>>>> https://github.com/datastrato/gravitino/blob/main/catalogs/catalog-lakehouse-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/IcebergRESTService.java#L39
>>> >> >> >> >> >>>>>> - 
>>> >> >> >> >> >>>>>> https://github.com/datastrato/gravitino/blob/main/catalogs/catalog-lakehouse-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/ops/IcebergTableOps.java#L42-L51
>>> >> >> >> >> >>>>>>
>>> >> >> >> >> >>>>>>  Looks like it is initializing a Jetty server that uses 
>>> >> >> >> >> >>>>>> CatalogHandlers to delegate the execution to a specific 
>>> >> >> >> >> >>>>>> Java Catalog implementation.
>>> >> >> >> >> >>>>>>
>>> >> >> >> >> >>>>>> I think this is actually something that is lacking today 
>>> >> >> >> >> >>>>>> in Iceberg, which is an easy way for users to start an 
>>> >> >> >> >> >>>>>> actual REST HTTP server.
>>> >> >> >> >> >>>>>>
>>> >> >> >> >> >>>>>> I know we have the RESTCatalogAdapter and 
>>> >> >> >> >> >>>>>> RESTCatalogServlet for unit tests, and technically we
>>> >> >> >> >> >>>>>> have a very similar Jetty server implementation in 
>>> >> >> >> >> >>>>>> TestRESTCatalog. Should we think about making those 
>>> >> >> >> >> >>>>>> components out of the tests into an iceberg-rest-server 
>>> >> >> >> >> >>>>>> module for this use case, and merge with the 
>>> >> >> >> >> >>>>>> implementation that Gravitino has?
>>> >> >> >> >> >>>>>>
>>> >> >> >> >> >>>>>> Best,
>>> >> >> >> >> >>>>>> Jack Ye
>>> >> >> >> >> >>>>>>
>>> >> >> >> >> >>>>>> On Thu, Jan 25, 2024 at 10:47 AM Yufei Gu 
>>> >> >> >> >> >>>>>> <[email protected]> wrote:
>>> >> >> >> >> >>>>>>>
>>> >> >> >> >> >>>>>>> Thanks, Justin, for sharing.
>>> >> >> >> >> >>>>>>>
>>> >> >> >> >> >>>>>>> It's pretty cool to see an open source REST catalog 
>>> >> >> >> >> >>>>>>> implementation in action. Having dabbled a bit in the 
>>> >> >> >> >> >>>>>>> early development of Gravitino myself, I'm really 
>>> >> >> >> >> >>>>>>> excited about its potential with the Iceberg REST 
>>> >> >> >> >> >>>>>>> catalog.
>>> >> >> >> >> >>>>>>>
>>> >> >> >> >> >>>>>>> The idea of Gravitino moving to an ASF project is 
>>> >> >> >> >> >>>>>>> promising. It’ll surely boost its visibility and open 
>>> >> >> >> >> >>>>>>> up more doors for collaboration and adoption.
>>> >> >> >> >> >>>>>>>
>>> >> >> >> >> >>>>>>> Looking forward to where this goes. Keep up the 
>>> >> >> >> >> >>>>>>> fantastic work!
>>> >> >> >> >> >>>>>>>
>>> >> >> >> >> >>>>>>> Yufei
>>> >> >> >> >> >>>>>>>
>>> >> >> >> >> >>>>>>>
>>> >> >> >> >> >>>>>>> On Thu, Jan 25, 2024 at 5:55 AM Jean-Baptiste Onofré 
>>> >> >> >> >> >>>>>>> <[email protected]> wrote:
>>> >> >> >> >> >>>>>>>>
>>> >> >> >> >> >>>>>>>> Hi Justin,
>>> >> >> >> >> >>>>>>>>
>>> >> >> >> >> >>>>>>>> I talked with Junping a couple of months ago about 
>>> >> >> >> >> >>>>>>>> Gravitino. Thanks
>>> >> >> >> >> >>>>>>>> for sharing !
>>> >> >> >> >> >>>>>>>>
>>> >> >> >> >> >>>>>>>> Regards
>>> >> >> >> >> >>>>>>>> JB
>>> >> >> >> >> >>>>>>>>
>>> >> >> >> >> >>>>>>>> On Thu, Jan 25, 2024 at 12:15 AM Justin Mclean 
>>> >> >> >> >> >>>>>>>> <[email protected]> wrote:
>>> >> >> >> >> >>>>>>>> >
>>> >> >> >> >> >>>>>>>> > Hi,
>>> >> >> >> >> >>>>>>>> >
>>> >> >> >> >> >>>>>>>> > We open-sourced a new project, Gravitino, in 
>>> >> >> >> >> >>>>>>>> > December and have been working on growing the 
>>> >> >> >> >> >>>>>>>> > community and adding new functionality. We plan to 
>>> >> >> >> >> >>>>>>>> > donate the project to the ASF this year. Gravitino 
>>> >> >> >> >> >>>>>>>> > is a unified metadata lake solution offering a 
>>> >> >> >> >> >>>>>>>> > unified approach to managing datasets from diverse 
>>> >> >> >> >> >>>>>>>> > sources and regions across multiple cloud platforms. 
>>> >> >> >> >> >>>>>>>> > Its core is an Iceberg REST catalog service 
>>> >> >> >> >> >>>>>>>> > implementation to manage Iceberg tables efficiently.
>>> >> >> >> >> >>>>>>>> >
>>> >> >> >> >> >>>>>>>> > If this sounds like something you would be 
>>> >> >> >> >> >>>>>>>> > interested in, then the following resources will 
>>> >> >> >> >> >>>>>>>> > help:
>>> >> >> >> >> >>>>>>>> > -  Blog post: 
>>> >> >> >> >> >>>>>>>> > https://datastrato.ai/blog/gravitino-iceberg-rest-catalog-service/
>>> >> >> >> >> >>>>>>>> > -  Gravitino documentation: 
>>> >> >> >> >> >>>>>>>> > https://datastrato.ai/docs/0.3.1/
>>> >> >> >> >> >>>>>>>> > -  Iceberg REST service documentation: 
>>> >> >> >> >> >>>>>>>> > https://datastrato.ai/docs/0.3.1/iceberg-rest-service
>>> >> >> >> >> >>>>>>>> >
>>> >> >> >> >> >>>>>>>> > We welcome any feedback and suggestions you have, 
>>> >> >> >> >> >>>>>>>> > and as always, all contributions are welcome. You 
>>> >> >> >> >> >>>>>>>> > can find the source code at 
>>> >> >> >> >> >>>>>>>> > https://github.com/datastrato/gravitino.
>>> >> >> >> >> >>>>>>>> >
>>> >> >> >> >> >>>>>>>> > Kind Regards,
>>> >> >> >> >> >>>>>>>> > Justin
>>> >> >> >> >> >>>>>
>>> >> >> >> >> >>>>>
>>> >> >> >> >> >>>>>
>>> >> >> >> >> >>>>> --
>>> >> >> >> >> >>>>> Ryan Blue
>>> >> >> >> >> >>>>> Tabular
>>> >> >> >> >> >>>
>>> >> >> >> >> >>>
>>> >> >> >> >> >>>
>>> >> >> >> >> >>> --
>>> >> >> >> >> >>> Ryan Blue
>>> >> >> >> >> >>> Tabular
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > --
>>> >> >> >> > Ryan Blue
>>> >> >> >> > Tabular
>>
>>
>>
>> --
>> Ryan Blue
>> Tabular
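
A recurring point in the thread is the split between catalog logic (the role CatalogHandlers plays: take a REST request, call a catalog, return a REST response) and the runtime that exposes it over HTTP (Jetty, Tomcat, or anything else). The sketch below illustrates that split using only the Python standard library. It is not Iceberg code: `InMemoryCatalog`, `handle_list_tables`, `make_http_handler`, and `serve` are hypothetical names invented for this illustration, and the route and response shape are only loosely modeled on the list-tables endpoint of the REST catalog spec.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class InMemoryCatalog:
    """Toy catalog backend (hypothetical): maps namespace -> set of table names."""

    def __init__(self):
        self._tables = {}

    def create_namespace(self, namespace):
        self._tables.setdefault(namespace, set())

    def create_table(self, namespace, name):
        if namespace not in self._tables:
            raise KeyError(f"Namespace does not exist: {namespace}")
        self._tables[namespace].add(name)

    def list_tables(self, namespace):
        if namespace not in self._tables:
            raise KeyError(f"Namespace does not exist: {namespace}")
        return sorted(self._tables[namespace])


def handle_list_tables(catalog, namespace):
    """Catalog logic only: map a request to a (status, body) pair.

    This function knows nothing about sockets or servers, so any HTTP
    runtime (or a unit test) can call it directly.
    """
    try:
        names = catalog.list_tables(namespace)
    except KeyError as err:
        return 404, {"error": {"message": err.args[0], "code": 404}}
    identifiers = [{"namespace": [namespace], "name": n} for n in names]
    return 200, {"identifiers": identifiers}


def make_http_handler(catalog):
    """Runtime concern only: bind the handler function to stdlib HTTP."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            parts = self.path.strip("/").split("/")
            # Assumed route shape: /v1/namespaces/{namespace}/tables
            if (len(parts) == 4 and parts[:2] == ["v1", "namespaces"]
                    and parts[3] == "tables"):
                status, body = handle_list_tables(catalog, parts[2])
            else:
                status, body = 404, {"error": {"message": "No route", "code": 404}}
            payload = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return Handler


def serve(catalog, host="127.0.0.1", port=8181):
    """Start a blocking HTTP server; host and port are arbitrary defaults."""
    HTTPServer((host, port), make_http_handler(catalog)).serve_forever()
```

Calling `serve(catalog)` on a populated `InMemoryCatalog` would expose the route over HTTP, while tests can exercise `handle_list_tables` without any server at all. Swapping the runtime means replacing only `make_http_handler` and `serve`, which is the kind of decision the thread argues a library should leave to downstream projects.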
