This is a little off topic, but I noticed David's and Sotou's replies today (13/08/25) when I checked lists.apache.org. However, I haven't received them via email yet (no, they are in neither my spam folder nor my bin). Additionally, my original email took a few days to appear on Pony Mail. Is there a way to debug this so that I can give a properly quoted response from my email client?

-- bp

On Fri, 8 Aug, 2025, 9:41 pm Benjamin Philip, <benjamin.philip...@gmail.com> wrote:

> Hi,
>
> I am working on an Erlang implementation for Apache Arrow, and I am
> interested in submitting it to the Apache Foundation as an official
> implementation for Erlang and Elixir, once it is ready.
>
> If you haven't heard of Erlang[1], you can read more about it here[2].
> It's most famous for powering telecom switches, RabbitMQ, and instant
> messaging apps like WhatsApp, Discord, and ejabberd. The important thing to
> keep in mind is it's used in highly parallel and distributed environments.
> It runs on a runtime called the BEAM/Erlang Virtual Machine (analogous to
> Java and the JVM). Being a functional language[3], all values are immutable
> and there is no SIMD support.
>
> Initial work[4] was started 2 years ago for compliance with some new
> OpenTelemetry specifications. However, my focus so far has only been
> (de)serialization and not operating on/manipulating Arrow Arrays since that
> was the only requirement in OpenTelemetry.
>
> The trouble with Erlang is that natively producing and decoding binaries
> in pure Erlang is more effective than through a C FFI. This has also been
> the case with plaintext formats like JSON and XML, and with parsing markup
> like HTML and Markdown. This has meant that we've had to write an Erlang
> Arrow implementation from the ground up. The lack of an Erlang FlatBuffers
> implementation (for IPC), SIMD support in the Erlang Virtual Machine (for
> efficient operations) and mutability (for zero-copy access; all values in
> Erlang are immutable) make a complete Arrow implementation in Erlang
> especially challenging.
>
> An alternative could be to handle serializations in Erlang and operations
> with the C bindings. We could also start with a minimal implementation with
> bindings to nanoarrow and deprecate that in favour of the Erlang one later.
>
> Upstreaming a fully compliant Erlang implementation could potentially be a
> multi-year project. This might also include writing an Erlang FlatBuffers
> implementation. This will also be an additional implementation for the
> Arrow team to maintain, though I would be happy to aid in developing and
> maintaining it. What are the steps to get this going?
>
> How are implementations outside the monorepo tested? Is there any guide
> for setting up integration testing and benchmarking in third-party
> implementations? So far I've had to roll my own minimal tooling for what
> archery supports, and I would prefer if I could integrate with
> archery instead.
>
> Additionally, the initial work for this project was sponsored by the
> Erlang Ecosystem Foundation[5]. Would this be an issue when transferring
> stewardship to the ASF?
>
> [1]: https://en.wikipedia.org/wiki/Erlang_(programming_language)
> [2]: https://www.erlang.org/
> [3]: https://en.wikipedia.org/wiki/Functional_programming
> [4]: https://github.com/Benjamin-Philip/serde_arrow
> [5]: https://erlef.org/
>
> -- bp
>