I think it is a great idea to release the Python bindings.
In terms of binary / source releases, one approach that could also work is:
1. sign / vote on a source release of DataFusion as a whole
2. build and push the binaries based on that approved source (much like the
various Linux distributions do)
Sounds good to me. I'd recommend that you document the release process,
whenever it is agreed upon, on
https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide
(or document it somewhere and link to it there).
Neal
On Wed, Jul 21, 2021 at 8:35 AM Andrew Lamb wrote:
> I think i
Hi all,
Our biweekly sync call is today at 12:00 noon Eastern time.
For today's call, please use this Google Meet URL (different from the
usual one):
https://meet.google.com/ebp-tczo-xjn
All are welcome to join. Notes will be shared with the mailing list
afterward.
Thanks,
Ian
Hi Micah,
Thank you for this wonderful description. You've solved my problem exactly.
Responses inline:
> > "ReadBatchSpaced() in a loop is faster than reading an entire record
> > batch."
>
> Could you elaborate on this? What code path were you using for reading
> record batches that was slower?
Hello Apache Arrow Team,
I am looking at ways my company can create an SDK that can share Apache Arrow
data while preserving table pivots. I was looking at how Pandas and Perspective
do it and it seems like
For row_pivots
Pandas just sorts the data into a flat arrow structure
Perspective actu
If dictionary-encoded data is specifically a concern, we've added new
experimental APIs, which should be in the next release, that allow
retrieving dictionary data as indices + dictionaries
(ReadBatchWithDictionary) instead of denormalizing them as ReadBatch does.
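
As a rough illustration of the difference between the two representations, here
is a sketch in plain Python. It is not the Parquet C++ API; the function names
dictionary_encode and denormalize are made up for this example and only show
what "indices + dictionary" versus "denormalized values" means conceptually.

```python
def dictionary_encode(values):
    """Return (indices, dictionary) for a list of hashable values.

    Hypothetical helper for illustration only; not part of Arrow or Parquet.
    """
    dictionary = []   # distinct values, in order of first appearance
    positions = {}    # value -> its index in `dictionary`
    indices = []      # one small integer per input value
    for v in values:
        if v not in positions:
            positions[v] = len(dictionary)
            dictionary.append(v)
        indices.append(positions[v])
    return indices, dictionary


def denormalize(indices, dictionary):
    """Expand indices back into the full list of values."""
    return [dictionary[i] for i in indices]


values = ["a", "b", "a", "c", "b", "a"]
indices, dictionary = dictionary_encode(values)
print(indices)                           # [0, 1, 0, 2, 1, 0]
print(dictionary)                        # ['a', 'b', 'c']
print(denormalize(indices, dictionary))  # same as `values`
```

Reading the indices and the dictionary separately avoids materializing every
repeated value, which is the cost that denormalizing incurs.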
-Micah
On Wed, Jul 21, 2021 at