Thanks for bringing this up, Andy.

I'm unemployed/on recovery leave, so I've had some surplus time to work on
Rust.

There are a lot of features I've wanted to work on, some of which I've
spent time attempting but struggled with. A few of them block additional
work that I could contribute.

In 0.13.0 and the release thereafter, I'd like to see:

Date/time support. I've spent a lot of time trying to implement this, but I
get the feeling that my Rust isn't good enough yet to pull this together.

More IO support.
I'm working on a JSON reader, and after that I want to work on JSON and
CSV writers (continuing where you left off).
With date/time support in place, I can also work on date/time parsing so
we can have those types in CSV and JSON.
Parquet support isn't on my radar at the moment. JSON and CSV are more
commonly used, so I'm hoping that with concrete support for these, more
people using Rust can choose to integrate Arrow. That could bring us more
hands to help.
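
To give a flavour of the date/time parsing piece: Arrow's Date32 is days
since the Unix epoch, so a CSV/JSON reader mostly needs a string-to-days
conversion. A minimal sketch using only std (the function names are mine,
not anything in the crate; the calendar conversion is Howard Hinnant's
well-known days_from_civil algorithm):

```rust
// Hypothetical sketch: turn an ISO-8601 date string into an
// Arrow-style Date32 value (days since the Unix epoch).
fn days_from_civil(y: i64, m: u32, d: u32) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = (if y >= 0 { y } else { y - 399 }) / 400;
    let yoe = y - era * 400;                         // [0, 399]
    let mp = ((m + 9) % 12) as i64;                  // Mar=0 .. Feb=11
    let doy = (153 * mp + 2) / 5 + (d as i64 - 1);   // [0, 365]
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // [0, 146096]
    era * 146097 + doe - 719468
}

// Parse "YYYY-MM-DD"; returns None on malformed input.
fn parse_date32(s: &str) -> Option<i32> {
    let mut parts = s.split('-');
    let y: i64 = parts.next()?.parse().ok()?;
    let m: u32 = parts.next()?.parse().ok()?;
    let d: u32 = parts.next()?.parse().ok()?;
    if !(1..=12).contains(&m) || !(1..=31).contains(&d) {
        return None;
    }
    Some(days_from_civil(y, m, d) as i32)
}

fn main() {
    assert_eq!(parse_date32("1970-01-01"), Some(0));
    assert_eq!(parse_date32("2019-02-12"), Some(17939));
    assert_eq!(parse_date32("not-a-date"), None);
    println!("date parsing ok");
}
```

The real version would of course plug into the reader's schema handling
and cover times and timestamps too; this is just the core conversion.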

Array slicing (https://issues.apache.org/jira/browse/ARROW-3954). I tried
working on it but failed. Related to this would be array chunking.
I need these to be able to operate on "Tables" the way the C++, Python
and other implementations do. I've got ChunkedArray, Column and Table
roughly implemented in my fork, but without zero-copy slicing, I can't
upstream them.
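
For context, the zero-copy slicing I'm after is roughly what the C++
implementation does: a slice shares the parent's buffer and only adjusts
an offset and length. A minimal sketch of the idea in plain Rust (these
are hypothetical types, not the crate's actual API):

```rust
use std::sync::Arc;

// Hypothetical minimal Int32 array: a shared buffer plus a view window.
struct Int32Array {
    data: Arc<Vec<i32>>, // shared between slices, never copied
    offset: usize,
    len: usize,
}

impl Int32Array {
    fn new(values: Vec<i32>) -> Self {
        let len = values.len();
        Int32Array { data: Arc::new(values), offset: 0, len }
    }

    // Zero-copy slice: clones the Arc (a refcount bump), not the data.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        Int32Array {
            data: Arc::clone(&self.data),
            offset: self.offset + offset,
            len,
        }
    }

    fn value(&self, i: usize) -> i32 {
        assert!(i < self.len);
        self.data[self.offset + i]
    }
}

fn main() {
    let a = Int32Array::new(vec![1, 2, 3, 4, 5]);
    let s = a.slice(1, 3); // views [2, 3, 4] without copying
    assert_eq!(s.value(0), 2);
    assert_eq!(s.value(2), 4);
    // both arrays still point at the same allocation
    assert_eq!(Arc::strong_count(&a.data), 2);
    println!("slice ok");
}
```

With that in place, a ChunkedArray is just a Vec of such slices, which is
why slicing blocks the Table work.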

I've made good progress on scalar and array operations. I have trig
functions, some string operators and other functions that one can run on
a Spark-esque dataframe.
These will fit in well with DataFusion's SQL operations, but from a
decision-making perspective, I think it would help if we put our heads
together and think about the direction we want to take on compute.
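
To show the shape most of these kernels take (the names here are mine,
for illustration, not what's in my fork): a null-aware unary op that maps
a function over an array and propagates nulls, which is all a trig kernel
really is:

```rust
// Hypothetical null-aware unary compute kernel: apply a function
// element-wise over an array, passing nulls (None) through unchanged.
fn unary_op<F>(input: &[Option<f64>], op: F) -> Vec<Option<f64>>
where
    F: Fn(f64) -> f64,
{
    input.iter().map(|v| v.map(&op)).collect()
}

fn main() {
    let values = vec![Some(0.0_f64), None, Some(2.0)];
    // a "sin" kernel is just unary_op specialised with f64::sin
    let sines = unary_op(&values, f64::sin);
    assert_eq!(sines[0], Some(0.0));
    assert_eq!(sines[1], None);
    // abs, sqrt and the string operators follow the same pattern
    let doubled = unary_op(&values, |x| x * 2.0);
    assert_eq!(doubled[2], Some(4.0));
    println!("kernels ok");
}
```

The open question for compute direction is mostly where these kernels
should live and how they relate to DataFusion's expressions, rather than
how each one is written.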

SIMD is great, and once Paddy has hashed out how it works, more of us
will be able to contribute SIMD-compatible compute operators.

Thanks,
Neville

On Tue, 12 Feb 2019 at 18:12, Andy Grove <andygrov...@gmail.com> wrote:

> I was curious what our Rust committers and contributors are excited about
> for 0.13.0.
>
> The feature I would most like to see is that ability for DataFusion to run
> SQL against Parquet files again, as that would give me an excuse for a PoC
> in my day job using Arrow.
>
> I know there were some efforts underway to build arrow array readers for
> Parquet and it would make sense for me to help there.
>
> I would also like to start building out some benchmarks.
>
> I think the SIMD work is exciting too.
>
> I'd like to hear thoughts from everyone else though since we're all coming
> at this from different perspectives.
>
> Thanks,
>
> Andy.
>
