If C++ engines prefer not to depend on Iceberg Rust, I actually don't see a problem with having a separate C++ project even if it only contains Puffin readers/writers. The important part is to avoid multiple C++ writer/reader implementations in different engines.
There were concerns with having one repo for multiple languages when we discussed how to maintain the main project. We should re-evaluate if they still apply here.

- Anton

On Tue, Sep 10, 2024 at 06:08 Gabor Kaszab <gaborkas...@apache.org> wrote:

> Hey All,
>
> Thanks for all the answers!
>
> My initial thought here was to have an iceberg-cpp repo for the Puffin
> reader/writer implementation in C++. But then I thought that there might be
> only one thing within this cpp repo, that is the Puffin stuff, and once
> implemented the repo won't have any more PRs coming in so it might be an
> overkill to introduce a separate repo for this purpose. Unless there is a
> need from other stakeholders to have a C++ implementation of the Iceberg
> lib (similarly to Python and Rust) but this would lead to a different
> conversation.
>
> Pulling in the Rust implementation into the C++ part of Impala: I don't
> know, I have to do some research on this. For performance reasons I know
> Impala prefers to have its own implementation of things (like Parquet
> reader/writer), so I have to double-check if it's acceptable
> performance-wise to pull in a Rust implementation of anything. I don't have
> any experience of doing so, hence the hesitation.
>
> JB, you mean that we can create a sub-repo like iceberg-puffin, that would
> hold separate implementations for different languages? For me this could
> also work. Wondering what others think.
>
> Regards,
> Gabor
>
>
> On Fri, Aug 30, 2024 at 2:08 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi Gabor
>>
>> I like the idea, it sounds good to me.
>>
>> Imho, for "clarity", the best option would be to have a dedicated
>> puffin repo with different language binding (a bit like it's done in
>> Arrow).
>> I think that adding to iceberg-rust could be an option, but not sure
>> it would be obvious for developers.
>>
>> Regards
>> JB
>>
>> On Thu, Aug 29, 2024 at 2:46 PM Gabor Kaszab <gaborkas...@apache.org>
>> wrote:
>> >
>> > Hi Iceberg Community,
>> >
>> > With the V3 position delete proposal it came up that non-Java engines
>> > might have to implement a Puffin reader and writer themselves so that they
>> > can support the newly proposed position deletes. Impala would most probably
>> > require a C++ implementation and first of all I'm wondering whether there
>> > are other engines that require additional language implementation.
>> >
>> > My additional thought is that once we have an implementation in another
>> > language where should that live? I can do the C++ and that could live
>> > within Impala, but I think it could be useful for other engines too but I
>> > have no idea where such an implementation could live TBH.
>> >
>> > Would be nice to hear opinions on this!
>> > Thanks,
>> > Gabor
>> >