I definitely support the idea of having Puffin readers/writers for different languages managed by the community. It would be really unfortunate for engines to re-implement this logic. Like Piotr said, it is not only for V3 position deletes but also for table stats supported today.
I think it is reasonable to start a new C++ sub-project even if it will only contain Puffin writers/readers initially. Are there any alternatives? We could add a separate repo for Puffin but we previously concluded it is not a good idea to support multiple languages in one repo.

- Anton

On Thu, Aug 29, 2024 at 09:04, Gang Wu <ust...@gmail.com> wrote:

> Hi,
>
> It won't be an issue if there is already an iceberg-cpp implementation.
> However, it is unfortunate to see duplicate efforts from different query
> engines to implement their own C++ Iceberg reader and writers. Is it a good
> chance to add official C++ implementation by providing a puffin
> reader/writer? It can be a good starting point, IMHO.
>
> Best,
> Gang
>
> On Thu, Aug 29, 2024 at 11:09 PM Piotr Findeisen <
> piotr.findei...@gmail.com> wrote:
>
>> Hi Gabor,
>>
>> thanks for starting this topic. it would be awesome to have Puffin
>> readers/writers available to all languages supported by the Iceberg
>> community!
>> The topic is important for v3, but also if we want to support stats
>> updates when writing to tables that already have some stats collected.
>>
>> i agree with Xuanwo that iceberg-rust will likely want a first-class rust
>> implementation and that it would go into iceberg-rust project.
>>
>> Best
>> Piotr
>>
>> On Thu, 29 Aug 2024 at 15:41, Xuanwo <xua...@apache.org> wrote:
>>
>>> I believe iceberg-rust should be the natural place for Puffin's rust
>>> implementation, and pyiceberg can reuse iceberg-rust's implementation.
>>>
>>> On Thu, Aug 29, 2024, at 20:46, Gabor Kaszab wrote:
>>>
>>> Hi Iceberg Community,
>>>
>>> With the V3 position delete proposal
>>> <https://docs.google.com/document/d/18Bqhr-vnzFfQk1S4AgRISkA_5_m5m32Nnc2Cw0zn2XM>
>>> it came up that non-Java engines might have to implement a Puffin reader
>>> and writer themselves so that they can support the newly proposed position
>>> deletes. Impala would most probably require a C++ implementation and first
>>> of all I'm wondering whether there are other engines that
>>> require additional language implementation.
>>>
>>> My additional thought is that once we have an implementation in another
>>> language where should that live? I can do the C++ and that could live
>>> within Impala, but I think it could be useful for other engines too but I
>>> have no idea where such an implementation could live TBH.
>>>
>>> Would be nice to hear opinions on this!
>>> Thanks,
>>> Gabor
>>>
>>> Xuanwo
>>>
>>> https://xuanwo.io/