Hi Gabor,

I think it makes sense to create iceberg-cpp resources (repository, Slack channel, ...): this can gather the efforts on the Iceberg lib and the Puffin implementation.
Fokko can help there; he can ping me if needed (from the ASF standpoint) :)

Regards
JB

On Fri, Nov 22, 2024 at 9:25 AM Gabor Kaszab <gaborkas...@apache.org> wrote:
>
> Hi Iceberg Community,
>
> It's been a while since we started this discussion. I'd like to revive the
> conversation for two reasons:
> 1) I think I'll have some capacity starting from early next year to take care
> of the C++ Puffin stuff we already talked about above, and also from the
> Impala community we could add some additional auxiliary functionality for the
> V3 positional deletes later on.
> 2) I learned that a part of the community is interested in having a C++
> implementation of the Iceberg lib in general for their C++ engine. cc @Gang Wu
>
> There seemed to be general support from the community to start up such a
> sub-project, so I'm reaching out now to ask for some guidance so that we can
> get going. @Fokko Driesprong You have much experience in this area, do you
> think you have the bandwidth to handhold us with the process and the steps?
> Meanwhile, I'll take a look at the Python repo to get a feel for myself.
>
> Regards,
> Gabor
>
> On Wed, Sep 11, 2024 at 5:32 PM Anton Okolnychyi <aokolnyc...@gmail.com>
> wrote:
>>
>> If C++ engines prefer not to depend on Iceberg Rust, I actually don't see a
>> problem with having a separate C++ project even if it only contains Puffin
>> readers/writers. The important part is to avoid multiple C++ writer/reader
>> implementations in different engines.
>>
>> There were concerns with having one repo for multiple languages when we
>> discussed how to maintain the main project. We should re-evaluate if they
>> still apply here.
>>
>> - Anton
>>
>> On Tue, Sep 10, 2024 at 06:08 Gabor Kaszab <gaborkas...@apache.org> wrote:
>>>
>>> Hey All,
>>>
>>> Thanks for all the answers!
>>>
>>> My initial thought here was to have an iceberg-cpp repo for the Puffin
>>> reader/writer implementation in C++.
>>> But then I thought that there might be
>>> only one thing within this cpp repo, that is the Puffin stuff, and once
>>> implemented the repo won't have any more PRs coming in, so it might be
>>> overkill to introduce a separate repo for this purpose. Unless there is a
>>> need from other stakeholders to have a C++ implementation of the Iceberg
>>> lib (similarly to Python and Rust), but this would lead to a different
>>> conversation.
>>>
>>> Pulling in the Rust implementation into the C++ part of Impala: I don't
>>> know, I have to do some research on this. For performance reasons I know
>>> Impala prefers to have its own implementation of things (like Parquet
>>> reader/writer), so I have to double-check if it's acceptable
>>> performance-wise to pull in a Rust implementation of anything. I don't have
>>> any experience of doing so, hence the hesitation.
>>>
>>> JB, you mean that we can create a sub-repo like iceberg-puffin, that would
>>> hold separate implementations for different languages? For me this could
>>> also work. Wondering what others think.
>>>
>>> Regards,
>>> Gabor
>>>
>>>
>>> On Fri, Aug 30, 2024 at 2:08 PM Jean-Baptiste Onofré <j...@nanthrax.net>
>>> wrote:
>>>>
>>>> Hi Gabor
>>>>
>>>> I like the idea, it sounds good to me.
>>>>
>>>> Imho, for "clarity", the best option would be to have a dedicated
>>>> puffin repo with different language bindings (a bit like it's done in
>>>> Arrow).
>>>> I think that adding to iceberg-rust could be an option, but not sure
>>>> it would be obvious for developers.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On Thu, Aug 29, 2024 at 2:46 PM Gabor Kaszab <gaborkas...@apache.org>
>>>> wrote:
>>>> >
>>>> > Hi Iceberg Community,
>>>> >
>>>> > With the V3 position delete proposal it came up that non-Java engines
>>>> > might have to implement a Puffin reader and writer themselves so that
>>>> > they can support the newly proposed position deletes.
>>>> > Impala would most
>>>> > probably require a C++ implementation and first of all I'm wondering
>>>> > whether there are other engines that require additional language
>>>> > implementation.
>>>> >
>>>> > My additional thought is that once we have an implementation in another
>>>> > language where should that live? I can do the C++ and that could live
>>>> > within Impala, but I think it could be useful for other engines too but
>>>> > I have no idea where such an implementation could live TBH.
>>>> >
>>>> > Would be nice to hear opinions on this!
>>>> > Thanks,
>>>> > Gabor