+1 for Rust.

On Wed, Jun 22, 2022 at 4:21 AM LuNing Wang <wang4lun...@gmail.com> wrote:

> +1 for Rust
>
> Best Regards,
> LuNing Wang
>
> On Wed, Jun 22, 2022 at 14:15, Nan Zhu <zhunanmcg...@gmail.com> wrote:
>
>> +1 for using rust as the backbone for new language bindings
>>
>> On Sun, Jun 12, 2022 at 23:52 OpenInx <open...@gmail.com> wrote:
>>
>>> Thanks Kyle for sharing your context.
>>>
>>> Recently, I also spent some time practicing my Rust skills. Generally,
>>> I'm +1 for adding a Rust SDK as the native-language support.
>>>
>>>
>>> On Mon, Jun 13, 2022 at 12:51 PM Kyle Bendickson <k...@tabular.io>
>>> wrote:
>>>
>>>> Thanks for starting this discussion.
>>>>
>>>> I know I was the first to mention some of my concerns (which I still
>>>> have and would apply to any new major change), but I also think that this
>>>> is an avenue that should be explored.
>>>>
>>>> Specifically, a native integration would have many benefits for
>>>> read paths (in addition to others). I know that the Rust Avro reader is
>>>> significantly faster, as are native readers for columnar formats.
>>>>
>>>> So while I do have some concerns about making sure we have enough
>>>> people to support this endeavor, I do want to say I think it's a really
>>>> good idea. My apologies if I gave the impression otherwise.
>>>>
>>>> I would personally be interested in contributing to and reviewing for a
>>>> native Rust library (or CPP, but I think Rust is a much more elegant
>>>> language and I'd personally prefer to work in that as it's easier to work
>>>> with across systems than C++ imo though I would defer to others on that).
>>>>
>>>> I would also be happy to offer my help and perspective in moving this
>>>> forward if need be. But I did want to express my practical concerns so that
>>>> we don't have an area of the codebase where there aren't enough people to
>>>> help maintain it etc.
>>>>
>>>> But in general I think this is an exciting opportunity, and results
>>>> have shown time and time again that native readers / writers are much more
>>>> performant.
>>>>
>>>> +1 to using Rust as well (which is a language I know more of than C++
>>>> these days, though I'd have to brush up on both).
>>>>
>>>> Best, Kyle
>>>>
>>>> On Sun, Jun 12, 2022 at 8:20 PM OpenInx <open...@gmail.com> wrote:
>>>>
>>>>> Hi Tao Wu.
>>>>>
>>>>> I think the Apache Iceberg community is quite consistent in wanting to
>>>>> provide an Iceberg SDK for native languages. I am very happy to offer my
>>>>> perspective and help if needed when you try to move this forward.
>>>>>
>>>>> On Mon, Jun 13, 2022 at 11:04 AM Wu Tao <wu...@apache.org> wrote:
>>>>>
>>>>>> Hi, everyone, I'm Tao. I'm currently working on a commercial
>>>>>> streaming system that is written in Rust.
>>>>>>
>>>>>> Actually, I'm planning to implement an Iceberg Rust SDK so that we
>>>>>> can have better integration with the existing Iceberg ecosystem.
>>>>>> Initially I found https://github.com/oliverdaff/iceberg-rs, but it
>>>>>> appears the author hasn't been active lately. So I'm looking to see if
>>>>>> the Iceberg community has any consensus on a Rust/C++ SDK (Rust is
>>>>>> preferable), and if there is, we'd love to contribute. I believe that as
>>>>>> Iceberg grows in popularity, there will eventually be more systems that
>>>>>> want such libraries. There may even be some ongoing work that has not
>>>>>> yet been discussed with the community.
>>>>>>
>>>>>> Additionally, I think the initial Rust/C++ SDK only needs to support
>>>>>> the reader and writer sides of Iceberg, because there are already plenty
>>>>>> of JVM-based query engines out there taking charge of data maintenance.
>>>>>> We don't have to rewrite every corner of Iceberg in Rust. That means
>>>>>> less engineering work.
>>>>>>
>>>>>> On 2022/06/08 10:16:05 OpenInx wrote:
>>>>>> > As a cloud-native table format standard for the big-data ecosystem, I
>>>>>> > believe supporting multiple languages is the correct direction so that
>>>>>> > different languages can connect to the Apache Iceberg table format.
>>>>>> >
>>>>>> > But I can also see Kyle's point about lacking enough resources
>>>>>> > (developers and reviewers) to accomplish this goal. In my mind, Python,
>>>>>> > Golang, C++, and Rust can all be regarded as native language support.
>>>>>> > We may just need to support the Rust SDK, and then all of the other
>>>>>> > languages can just wrap the Rust SDK to access the table format.
>>>>>> >
>>>>>> > Anyway, we will need to wait for the REST catalog to be finished before
>>>>>> > we introduce support for another language, because we cannot access the
>>>>>> > Iceberg table by invoking the JVM catalog interfaces.
>>>>>> >
>>>>>> > On Tue, Jun 7, 2022 at 4:41 AM Micah Kornfield <emkornfi...@gmail.com>
>>>>>> > wrote:
>>>>>> >
>>>>>> > >> There’s also the question of how useful this would be in practice
>>>>>> > >> given the complexity of using C++ (or Rust etc) within some of the
>>>>>> > >> major frameworks.
>>>>>> > >>
>>>>>> > >
>>>>>> > > One place this would be useful is for Arrow's Dataset API [1]. An
>>>>>> > > option the Arrow community might be open to is hosting parts of the
>>>>>> > > code there (this is what is done for Apache Parquet C++). This helps
>>>>>> > > shape some of the answers to other questions posed (ORC and Parquet
>>>>>> > > are already in the repo, it provides a Filesystem interface, etc).
>>>>>> > > The project doesn't currently consume Avro, and I think the preferred
>>>>>> > > approach is to make a clean room Avro parser. But I agree this is a
>>>>>> > > non-trivial effort to get underway.
>>>>>> > >
>>>>>> > > Another area to consider is compatibility testing. I think before a
>>>>>> > > third officially supported community library is introduced it would
>>>>>> > > be good to have a compatibility framework in place to make sure
>>>>>> > > implementations are all interpreting the specification correctly. If
>>>>>> > > there isn't already an effort here, I'd like to start contributing
>>>>>> > > something (I will probably have bandwidth sometime in Q3).
>>>>>> > >
>>>>>> > > Thanks,
>>>>>> > > -Micah
>>>>>> > >
>>>>>> > >
>>>>>> > > [1] https://arrow.apache.org/docs/cpp/dataset.html
>>>>>> > >
>>>>>> > > On Sun, Jun 5, 2022 at 11:07 PM Kyle Bendickson <k...@tabular.io>
>>>>>> > > wrote:
>>>>>> > >
>>>>>> > >> Hi caneGuy,
>>>>>> > >>
>>>>>> > >> I personally don’t dislike this idea. I understand the performance
>>>>>> > >> benefits.
>>>>>> > >>
>>>>>> > >> But this would be a huge undertaking for the community. We’d need to
>>>>>> > >> ensure we had sufficient developer support for reviews (likely one of
>>>>>> > >> the biggest issues), as well as a number of other things, particularly
>>>>>> > >> dependencies, package management, etc. We’d also need to scope support
>>>>>> > >> down to specific OS / compilers etc.
>>>>>> > >>
>>>>>> > >> We’d also need to be sure we had adequate developer support from a
>>>>>> > >> wide enough range of the community to support the project long term.
>>>>>> > >> One issue in open source is that developers will work on something
>>>>>> > >> tangential to their project in another repository, but nobody is
>>>>>> > >> available to maintain it.
>>>>>> > >>
>>>>>> > >> There’s also the question of how useful this would be in practice
>>>>>> > >> given the complexity of using C++ (or Rust etc) within some of the
>>>>>> > >> major frameworks.
>>>>>> > >>
>>>>>> > >> Again, I’m not opposed to the idea but just trying to be realistic
>>>>>> > >> about what such an undertaking involves. It would need full community
>>>>>> > >> support (or at least support from enough community members to be
>>>>>> > >> sustainable).
>>>>>> > >>
>>>>>> > >> If you wanted to make a design doc, the milestones tab in the Iceberg
>>>>>> > >> project has some that you might use as reference.
>>>>>> > >>
>>>>>> > >> *I highly suggest you come to the next community sync and bring this
>>>>>> > >> up to the community then.*
>>>>>> > >>
>>>>>> > >> If you’re not already on the invite list for the monthly community
>>>>>> > >> sync, you can get on it by joining the Google group. You’ll receive
>>>>>> > >> invites when they go out:
>>>>>> > >> https://groups.google.com/g/iceberg-sync
>>>>>> > >>
>>>>>> > >> Looking forward to seeing you at the next community sync.
>>>>>> > >>
>>>>>> > >> A design document and/or any prior art would be very helpful as the
>>>>>> > >> community sync does discuss many topics (possibly there is existing
>>>>>> > >> C++ support in StarRocks for Iceberg V1?).
>>>>>> > >>
>>>>>> > >> Thank you,
>>>>>> > >> Kyle Bendickson
>>>>>> > >> GitHub: kbendick
>>>>>> > >>
>>>>>> > >> On Sun, Jun 5, 2022 at 10:44 PM Sam Redai <s...@tabular.io> wrote:
>>>>>> > >>
>>>>>> > >>> Currently there is no existing effort to develop a C++ package. That
>>>>>> > >>> being said, I think it would be awesome to have one! If anyone is
>>>>>> > >>> willing to start that development effort, I can help with some of
>>>>>> > >>> the groundwork to kickstart it.
>>>>>> > >>>
>>>>>> > >>> I would say the first step would be for someone to prepare a
>>>>>> > >>> high-level proposal.
>>>>>> > >>>
>>>>>> > >>> -Sam
>>>>>> > >>>
>>>>>> > >>> On Sun, Jun 5, 2022 at 11:02 PM 周康 <zhoukang199...@gmail.com>
>>>>>> > >>> wrote:
>>>>>> > >>>
>>>>>> > >>>> Hi team
>>>>>> > >>>> I am a dev from the StarRocks community, and we already support the
>>>>>> > >>>> Iceberg v1 format.
>>>>>> > >>>> We are also planning to support the v2 format. If there were a C++
>>>>>> > >>>> package, it would be very convenient for our implementation.
>>>>>> > >>>> At the same time, it would also make it faster for other C++
>>>>>> > >>>> computing engines to support the v2 format.
>>>>>> > >>>>
>>>>>> > >>>> Are there plans to provide a C++ SDK?
>>>>>> > >>>> --
>>>>>> > >>>> caneGuy
>>>>>> > >>>>
>>>>>> > >>> --
>>>>>> > >>>
>>>>>> > >>> Sam Redai <s...@tabular.io>
>>>>>> > >>>
>>>>>> > >>> Developer Advocate  |  Tabular <https://tabular.io/>
>>>>>> > >>>
>>>>>> > >>> c (267) 226-8606
>>>>>> > >>>
>>>>>> > >>
>>>>>> >
>>>>>>
>>>>>
>>>>
>>>> --
>>>>
>>>> Kyle Bendickson
>>>>
>>>> OSS Developer  |  Tabular <https://tabular.io/>
>>>>
>>>> k...@tabular.io
>>>>
>>>

-- 
Josh Howard
