Thanks Uwe for pointing out the Iceberg effort - will take a look. It is
good to have a "standard" Parquet-to-Arrow reader implementation live in
the Arrow project though, so that in future different projects can just
refer to this instead of implementing their own.
Chao
On Wed, Sep 4, 2019 at 10
Hello,
You may want to interact with the Apache Iceberg community here. They are
currently working on something similar:
https://lists.apache.org/thread.html/3bb4f89a0b37f474cf67915f91326fa845afa597bdd2463c98a2c8b9@%3Cdev.iceberg.apache.org%3E
I'm not involved in this, just reading both mailing lists and t
Bumping this.
We may have an upcoming use case for this as well. Want to know if anyone
is actively working on this? I also heard that Dremio has internally
implemented a performant Parquet to Arrow reader. Is there any plan to open
source it? That could save us a lot of work.
Thanks,
Chao
On Fr
Hi:
I'm working on the Rust part and expect to finish it soon. I'm
also interested in the Java version because we are trying to embed Arrow in
Spark to implement vectorized processing. Maybe we can work together.
On Mon, Aug 5, 2019 at 1:50 PM Micah Kornfield wrote:
> Hi Anoop,
> I think a contribut
Hi Anoop,
I think a contribution would be welcome. There was a recent discussion
thread on what would be expected from new "readers" for Arrow data in Java
[1]. I think it's worth reading through, but my recollections of the
highlights are:
1. A short design sketch in the JIRA that will track the
Thanks for the response Micah. I could implement this and contribute to
Arrow Java. To help me get started, are there any pointers on how the C++
or Rust implementations currently read Parquet into Arrow? Are they reading
Parquet row-by-row and building Arrow batches or are there better ways of
imp
Hi Anoop,
There isn't currently anything in the Arrow Java library that does this.
It is something that I think we want to add at some point. Dremio [1] has
some Parquet related code, but I haven't looked at it to understand how
easy it is to use as a standalone library and whether it supports pr