Hi Paul,
Thank you for the email. I think this is interesting.
Arrow (Java API) currently doesn't have the capability of automatically
limiting the memory size of record batches. In Spark we have a similar need
to limit the size of record batches and have talked about implementing some
kind of siz
Hi All,
Bryan Cutler has started a PR to fix Java checkstyle warnings (thank you,
Bryan!). In my experience, style is hard to get consensus on due to
personal preference, so I wonder if we can pick a well-known style guide
(say google style: https://google.github.io/styleguide/javaguide.htm
Thank you both for the explanation, it makes sense.
Another piece of feedback I have is around flight.proto - some of the
messages (such as FlightDescriptor and FlightPutInstruction) are not very
clear to me - it would be helpful to get some more explanation for them,
here or on the PR.
Thanks!
Li
Antoine Pitrou created ARROW-3125:
---
Summary: [Python] Update ASV instructions
Key: ARROW-3125
URL: https://issues.apache.org/jira/browse/ARROW-3125
Project: Apache Arrow
Issue Type: Bug
Thanks for bringing this discussion up, Li. I think we can use an existing
style guide as a starting point, but ultimately we as a community should
decide how best to fit it to the project. I believe we already have the
Google checkstyle as our Java rules configuration file, but right off the
bat
Hi all,
Question: If I have a set of small (10-1000 rows) RecordBatches on
disk or in memory, how can I (efficiently) concatenate/rechunk them
into larger RecordBatches (so that each column is output as a
contiguous array when written to a new Arrow buffer)?
Context: With such small RecordBatches
Wes McKinney created ARROW-3126:
---
Summary: [Python] Add buffering option to pyarrow.open_stream to
enable larger read ahead window for high latency file systems
Key: ARROW-3126
URL: https://issues.apache.org/jira/br
hi Jacob,
We have https://issues.apache.org/jira/browse/ARROW-549 about
concatenating arrays. Someone needs to write the code and tests, and
then we can easily add an API to "consolidate" table columns.
If you have small record batches, could you read the entire file into
memory before parsing it
This seems like it could be a useful addition. In general, our experience
with writing Arrow structures is that the optimal path is columnar
interaction rather than row-wise. That being said, most people start out
by interacting with Arrow row-wise first, and having an interface
like this c
Simon Mo created ARROW-3127:
---
Summary: [C++] Add Tutorial about Sending Tensor from C++ to Python
Key: ARROW-3127
URL: https://issues.apache.org/jira/browse/ARROW-3127
Project: Apache Arrow
Issue T
Hi Jacques,
Thanks much for the note. I wonder, when reading data into, or out of, Arrow,
are not the interfaces often row-wise? For example, it is somewhat difficult to
read a CSV file column-wise. Similarly, when serving a BI tool (for tables or
charts), data must be presented row-wise. (JDBC
Kouhei Sutou created ARROW-3128:
---
Summary: [C++] Support system shared zlib
Key: ARROW-3128
URL: https://issues.apache.org/jira/browse/ARROW-3128
Project: Apache Arrow
Issue Type: Improvement
Kouhei Sutou created ARROW-3129:
---
Summary: [Packaging] Stop to use deprecated BuildRoot and Group in
.rpm
Key: ARROW-3129
URL: https://issues.apache.org/jira/browse/ARROW-3129
Project: Apache Arrow