> Would appreciate it if you can give some pointers to how to
> start playing with that code.
I have a (somewhat) minimal example here:
https://gist.github.com/westonpace/e555a3b1c269c31de7176d34f47a2fb0
The PR I mentioned earlier
(https://github.com/apache/arrow/pull/12033) has more examples (tha
Weston - Thanks for the pointer. The C++ streaming engine you pointed out
is a lot like what I have in mind. Will take a close look at that. Would
appreciate it if you can give some pointers to how to start playing with
that code.
Hou - Glad to hear that the DataFusion community has similar ideas.
For DataFusion (the Rust engine that Weston mentioned), the community
is about to start building a PoC for a streaming engine. The discussion
is happening at
https://github.com/apache/arrow-datafusion/issues/1544.
On Tue, Jan 11, 2022 at 3:29 PM Weston Pace wrote:
>
> First, note that there are dif
First, note that there are different computation engines in different
languages. The Rust implementation has DataFusion[1], for example.
For the rest of this email, I will speak in more detail specifically
about the C++ computation engine (which I am more familiar with) that
is in place today. The
Hi,
This is a somewhat lengthy email with thoughts on a streaming
computation engine for Arrow datasets, on which I would like to hear
feedback from Arrow devs.
The main use cases that we are thinking of for the streaming engine involve
time series data, i.e., data that arrives in time order (e.g. daily US st