I'm responding to this thread late, but I wanted to share that I've made
significant progress developing Rust implementations of a Puffin reader
and writer. You can view the PR here:
https://github.com/apache/iceberg-rust/pull/714
Thanks,
Farooq
On 2024/11/22 16:25:28 Zoltán Borók-Nagy wrote:
> Awesome! In Impala we created our own implementations so far, but it
> will be nice to join forces and have a common library.
>
> Looking forward to the Slack channel.
>
> Cheers,
> Zoltan
>
>
> On Fri, Nov 22, 2024 at 5:01 PM Gang Wu <us...@gmail.com> wrote:
> >
> > I have created an issue [1] to collect initial ideas for the iceberg-cpp project.
> >
> > Any feedback is appreciated.
> >
> > [1] https://github.com/apache/iceberg-cpp/issues/2
> >
> > Best,
> > Gang
> >
> > On Fri, Nov 22, 2024 at 10:11 PM Matt Topol <zo...@gmail.com> wrote:
> >>
> >> I will also help out with the iceberg-cpp effort, please include me on the channel. While my focus will still be on iceberg-go, I'll happily review and contribute to the C++ implementation.
> >>
> >> I do also plan on eventually implementing Puffin in the iceberg-go repo lol
> >>
> >> --Matt
> >>
> >> On Fri, Nov 22, 2024 at 8:55 AM Raúl Cumplido <ra...@apache.org> wrote:
> >>>
> >>> This sounds awesome. I am looking forward to the slack channel being
> >>> available so I can also help!
> >>>
> >>> On Fri, Nov 22, 2024 at 10:03, Gang Wu (<us...@gmail.com>) wrote:
> >>> >
> >>> > Thanks for the support, Fokko and JB!
> >>> >
> >>> > Please include me in the cpp slack channel for future cooperation.
> >>> >
> >>> > Best,
> >>> > Gang
> >>> >
> >>> > On Fri, Nov 22, 2024 at 4:58 PM Jean-Baptiste Onofré <jb...@nanthrax.net> wrote:
> >>> >>
> >>> >> Hi Gabor,
> >>> >>
> >>> >> I think it makes sense to create iceberg-cpp resources (repository,
> >>> >> slack channel, ...): this can gather the efforts in the Iceberg lib
> >>> >> and Puffin implementation.
> >>> >>
> >>> >> Fokko can help there, he can ping me if needed (from ASF standpoint) :)
> >>> >>
> >>> >> Regards
> >>> >> JB
> >>> >>
> >>> >> On Fri, Nov 22, 2024 at 9:25 AM Gabor Kaszab <ga...@apache.org> wrote:
> >>> >> >
> >>> >> > Hi Iceberg Community,
> >>> >> >
> >>> >> > It's been a while since we started this discussion. I'd like to revive the conversation for two reasons:
> >>> >> > 1) I think I'll have some capacity starting early next year to take care of the C++ Puffin stuff we already talked about above, and also from the Impala community we could add some additional auxiliary functionality for the V3 positional deletes later on.
> >>> >> > 2) I learned that a part of the community is interested in having a C++ implementation of the Iceberg lib in general for their C++ engine. cc @Gang Wu
> >>> >> >
> >>> >> > There seemed to be general support from the community to start up such a sub-project, so I'm reaching out now to ask for some guidance so that we can get going. @Fokko Driesprong You have much experience in this area, do you think you have the bandwidth to handhold us with the process and the steps? Meanwhile, I'll take a look at the Python repo to get a feel for it myself.
> >>> >> >
> >>> >> > Regards,
> >>> >> > Gabor
> >>> >> >
> >>> >> > On Wed, Sep 11, 2024 at 5:32 PM Anton Okolnychyi <ao...@gmail.com> wrote:
> >>> >> >>
> >>> >> >> If C++ engines prefer not to depend on Iceberg Rust, I actually don't see a problem with having a separate C++ project even if it only contains Puffin readers/writers. The important part is to avoid multiple C++ writer/reader implementations in different engines.
> >>> >> >>
> >>> >> >> There were concerns with having one repo for multiple languages when we discussed how to maintain the main project. We should re-evaluate if they still apply here.
> >>> >> >>
> >>> >> >> - Anton
> >>> >> >>
> >>> >> >> On Tue, Sep 10, 2024 at 06:08 Gabor Kaszab <ga...@apache.org> wrote:
> >>> >> >>>
> >>> >> >>> Hey All,
> >>> >> >>>
> >>> >> >>> Thanks for all the answers!
> >>> >> >>>
> >>> >> >>> My initial thought here was to have an iceberg-cpp repo for the Puffin reader/writer implementation in C++. But then I thought that there might be only one thing within this cpp repo, that is the Puffin stuff, and once implemented the repo won't have any more PRs coming in, so it might be overkill to introduce a separate repo for this purpose. Unless there is a need from other stakeholders to have a C++ implementation of the Iceberg lib (similarly to Python and Rust), but this would lead to a different conversation.
> >>> >> >>>
> >>> >> >>> Pulling in the Rust implementation into the C++ part of Impala: I don't know, I have to do some research on this. For performance reasons I know Impala prefers to have its own implementation of things (like Parquet reader/writer), so I have to double-check if it's acceptable performance-wise to pull in a Rust implementation of anything. I don't have any experience of doing so, hence the hesitation.
> >>> >> >>>
> >>> >> >>> JB, you mean that we can create a sub-repo like iceberg-puffin, that would hold separate implementations for different languages? For me this could also work. Wondering what others think.
> >>> >> >>>
> >>> >> >>> Regards,
> >>> >> >>> Gabor
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> On Fri, Aug 30, 2024 at 2:08 PM Jean-Baptiste Onofré <jb...@nanthrax.net> wrote:
> >>> >> >>>>
> >>> >> >>>> Hi Gabor
> >>> >> >>>>
> >>> >> >>>> I like the idea, it sounds good to me.
> >>> >> >>>>
> >>> >> >>>> Imho, for "clarity", the best option would be to have a dedicated
> >>> >> >>>> puffin repo with different language bindings (a bit like it's done in
> >>> >> >>>> Arrow).
> >>> >> >>>> I think that adding to iceberg-rust could be an option, but not sure
> >>> >> >>>> it would be obvious for developers.
> >>> >> >>>>
> >>> >> >>>> Regards
> >>> >> >>>> JB
> >>> >> >>>>
> >>> >> >>>> On Thu, Aug 29, 2024 at 2:46 PM Gabor Kaszab <ga...@apache.org> wrote:
> >>> >> >>>> >
> >>> >> >>>> > Hi Iceberg Community,
> >>> >> >>>> >
> >>> >> >>>> > With the V3 position delete proposal it came up that non-Java engines might have to implement a Puffin reader and writer themselves so that they can support the newly proposed position deletes. Impala would most probably require a C++ implementation, and first of all I'm wondering whether there are other engines that require implementations in additional languages.
> >>> >> >>>> >
> >>> >> >>>> > My additional thought is: once we have an implementation in another language, where should that live? I can do the C++ one and that could live within Impala, but I think it could be useful for other engines too, and I have no idea where such an implementation could live TBH.
> >>> >> >>>> >
> >>> >> >>>> > Would be nice to hear opinions on this!
> >>> >> >>>> > Thanks,
> >>> >> >>>> > Gabor
>