I like the vision of building multi-language SDKs on top of a Rust kernel.
Apache OpenDAL <https://opendal.apache.org/> is another success story of
this approach.

On Fri, Aug 16, 2024 at 11:35 AM Zheng Hu <open...@gmail.com> wrote:

> From my understanding, the most abstract approach to implementing a
> multi-language SDK is to build the other language SDKs (Python, Ruby, Go,
> etc.) on top of the Iceberg-Rust SDK. In this case, we can focus our
> community resources on the native Rust kernel, and all of the other
> language bindings will get the same progress and support.
>
> This may place higher demands on iceberg-rust:
> 1. The community needs more people who understand both Iceberg and Rust;
> 2. The API of iceberg-rust needs to be more abstract and universal, so
> that we can more easily export it as a native API and use it in other
> language bindings.
>
> Anyway, I personally think that abstracting away the multi-language
> problem and focusing on iceberg-rust is the right direction. This is what
> we see lancedb[1] doing: its development is focused on the Rust kernel, on
> top of which it has successfully built a multi-language ecosystem of Python
> and Node.js bindings. In the future, I think it would also be reasonable
> and easy to expand to a big data ecosystem like Java.
>
> 1. https://github.com/lancedb/lancedb
>
> On Thu, Aug 8, 2024 at 7:09 AM Chris Atkins <chri...@buildkite.com.invalid>
> wrote:
>
>> > Do you know how big the Ruby data community is? I think the most
>> important part is that it gets some traction and will continue to be
>> maintained.
>>
>> It's a great question Fokko! I'd say that the data community in Ruby is
>> nascent, but definitely exists. There are some prolific folks like
>> Andrew Kane (the fellow who created pgvector)
>> https://ankane.org/opensource who have released a lot of data related
>> gems, and there are a healthy set of bindings for Apache Arrow, Avro and
>> friends.
>>
>> In my experience, data technology in Ruby often shows up for very
>> particular use-cases within a larger Ruby on Rails application. An
>> example would be using the DuckDB, Arrow or Polars binding gems to do data
>> export or reverse-ETL with Parquet files in object storage, where parts of
>> the process work with the core domain objects. Another use-case is
>> user- or administrator-facing reporting features. My own use-case is wanting
>> to (eventually) perform some reads/writes against some Iceberg tables directly
>> from our Rails monolith without needing to call out to Trino.
>>
>> Thanks,
>>
>> Chris Atkins
>> Principal Engineer
>> Buildkite
>>
>> On Tue, 6 Aug 2024 at 17:14, Fokko Driesprong <fo...@apache.org> wrote:
>>
>>> Hi Chris,
>>>
>>> Thanks for raising this. Do you know how big the Ruby data community is?
>>> I think the most important part is that it gets some traction and will
>>> continue to be maintained.
>>>
>>> I fully agree that building on top of iceberg-rust makes a lot of sense,
>>> since with PyIceberg we're also running into limitations when it comes to
>>> performance and parallelism.
>>>
>>> Kind regards,
>>> Fokko
>>>
>>> On Mon, Aug 5, 2024 at 14:14, Xuanwo <xua...@apache.org> wrote:
>>>
>>>> Hi, Chris
>>>>
>>>> I love this idea. One of the main reasons I started working on
>>>> iceberg-rust is the potential that a Rust-powered Iceberg core can
>>>> offer.
>>>>
>>>> I'm not an experienced ruby developer, but I'm willing to help with
>>>> some CI setup or docs since I have some experience in the opendal community
>>>> with ruby bindings.
>>>>
>>>> On Mon, Aug 5, 2024, at 20:03, Renjie Liu wrote:
>>>>
>>>> Hi, Chris:
>>>>
>>>> Thanks for raising this. Generally I'm +1 on building Ruby bindings
>>>> on top of the Rust implementation, which would help introduce Iceberg
>>>> into the Ruby ecosystem.
>>>>
>>>> On Mon, Aug 5, 2024 at 7:30 PM Chris Atkins
>>>> <chri...@buildkite.com.invalid> wrote:
>>>>
>>>> Hi there,
>>>>
>>>> I'm following up on a discussion
>>>> <https://apache-iceberg.slack.com/archives/C05HTENMJG4/p1722750831522969>
>>>> from the #rust channel on the Iceberg community Slack, so I'm starting a
>>>> thread here too.
>>>>
>>>> After seeing Xuanwo's and Song's recent proposals around leveraging
>>>> iceberg-rust to power parts of PyIceberg, I was thinking it could be
>>>> valuable to follow a similar pattern to build out Ruby bindings for
>>>> Iceberg. Being able to stand on the shoulders of iceberg-rust could really
>>>> help build out a robust Ruby interface, and also offer some opportunities
>>>> for interop with things like datafusion and opendal.
>>>>
>>>> Recently in the Ruby ecosystem, writing native extensions in Rust has
>>>> become more popular, and tools like rb-sys and magnus provide a lot of the
>>>> required infrastructure. A good example is ruby-polars, which provides an
>>>> interface that is idiomatic Ruby but retains good symmetry with the APIs
>>>> exposed by py-polars. I wonder if we could eventually aim for a similar
>>>> type of symmetry between PyIceberg and a Ruby gem?
>>>>
>>>> Is there much interest in this? I've started playing around with some
>>>> of the basics, and started out with a plain native Ruby implementation of
>>>> some of the basic metadata APIs, but quickly realised that building on
>>>> iceberg-rust could be more productive than writing it all from scratch.
>>>>
>>>> *References*
>>>>
>>>> https://lists.apache.org/thread/5570vbdkrk7mdswt4jqy45lv7y58pz4b
>>>> https://lists.apache.org/thread/33c0nkc3k6646lvro1lv22pvhwlp50ss
>>>> https://github.com/apache/iceberg-rust/pull/518
>>>>
>>>> *Prior Art in Ruby*
>>>>
>>>> https://github.com/matsadler/magnus
>>>> https://github.com/oxidize-rb/rb-sys
>>>> https://github.com/ankane/ruby-polars
>>>> https://github.com/apache/opendal/tree/main/bindings/ruby
>>>>
>>>> Thanks,
>>>> Chris Atkins
>>>>
>>>> Xuanwo
>>>>
>>>> https://xuanwo.io/
>>>>
>>>>
