One use-case for ChunkedArray that comes to my mind is external sort for
large vectors.
Best,
Liya Fan
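
The external-sort use case mentioned above can be sketched roughly as follows. This is a minimal illustration in plain Python, not Arrow code: the function names `sort_in_chunks` and `merge_sorted_chunks` are hypothetical, and in-memory lists stand in for chunks that a real external sort would spill to disk.

```python
import heapq
from typing import Iterable, Iterator, List


def sort_in_chunks(values: Iterable[int], chunk_size: int) -> List[List[int]]:
    """Split the input into chunks of at most chunk_size and sort each one.

    Hypothetical helper: each sorted chunk plays the role of one chunk of a
    chunked array (in a real external sort it would be written to disk).
    """
    chunks: List[List[int]] = []
    current: List[int] = []
    for value in values:
        current.append(value)
        if len(current) == chunk_size:
            chunks.append(sorted(current))
            current = []
    if current:
        # Keep the final, possibly shorter, chunk.
        chunks.append(sorted(current))
    return chunks


def merge_sorted_chunks(chunks: List[List[int]]) -> Iterator[int]:
    """Lazily k-way merge the already sorted chunks into one sorted stream."""
    return heapq.merge(*chunks)


if __name__ == "__main__":
    chunks = sort_in_chunks([5, 3, 9, 1, 7, 2, 8], chunk_size=3)
    print(list(merge_sorted_chunks(chunks)))
```

The point of the sketch is that only one chunk has to be sorted at a time, and the merge step reads the chunks sequentially, which is why a chunked representation fits this workload.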
On Fri, Nov 15, 2019 at 2:14 PM Micah Kornfield
wrote:
> >
> > Maybe Java can add the concept of Tables and ChunkedArrays sometime in
> the
> > future.
>
>
> Is there a concrete use-case here?
Micah Kornfield created ARROW-7175:
--
Summary: [Website] Add a security page to track when
vulnerabilities are patched
Key: ARROW-7175
URL: https://issues.apache.org/jira/browse/ARROW-7175
Project: Ap
This sounds like a reasonable design to me. One question I had about
SchemaUnificationOptions: will those only be applicable to Arrow schemas, or
does it make sense to extend them to other use-cases (like DataSet APIs)?
Cheers,
Micah
On Fri, Nov 8, 2019 at 10:27 AM Zhuo Peng wrote:
> Hi,
>
> http
Thanks for taking these on.
On Fri, Nov 8, 2019 at 12:20 PM David Li wrote:
> I took a look at #5630 (ARROW-6662) and #5751 (ARROW-7019).
>
> Best,
> David
>
> On 11/7/19, Micah Kornfield wrote:
> > There are a few open PRs that I think could either use a first or second
> > set of eyes:
> >
>
>
> Maybe Java can add the concept of Tables and ChunkedArrays sometime in the
> future.
Is there a concrete use-case here? It might pay to open up some JIRAs.
I'm still not 100% clear on the rationale for the way VectorSchemaRoot is
designed and how that would relate to Table/ChunkedArrays (or
#1. If there isn't a JIRA, I would guess no one is working on it (note that I
would expect at least the initial work to be in a Parquet JIRA item, and
this is probably a discussion for that mailing list).
#2. There are some open PRs to expose the Parquet reader through JNI to Java
[1]
#3. It's possible Dremi
Micah Kornfield created ARROW-7174:
--
Summary: [Python] Expose dictionary size parameter in python.
Key: ARROW-7174
URL: https://issues.apache.org/jira/browse/ARROW-7174
Project: Apache Arrow
Bryan Cutler created ARROW-7173:
---
Summary: Add test to verify Map field names can be arbitrary
Key: ARROW-7173
URL: https://issues.apache.org/jira/browse/ARROW-7173
Project: Apache Arrow
Issue
I think there are potentially other places in the Arrow code base where
"optional" could be useful (e.g. a row-reader-like class for Arrow
Tables). It looks like there is at least one header-only optional library
[1] that is C++17 forward compatible. I think I would lean towards
vendoring that or an
Hello,
I would like to add support for handling optional fields to the
parquet::StreamReader and parquet::StreamWriter classes which I recently
contributed (thank you!).
Ideally I would do this by using std::optional like this:
parquet::StreamWriter writer{ parquet::ParquetFileWriter::Op
I am not an expert on this, but it seems you can specify `*_ROOT` arguments
to cmake, like
https://github.com/apache/arrow/blob/master/ci/PKGBUILD#L90-L91
Maybe that does what you need?
Neal
On Thu, Nov 14, 2019 at 12:45 PM Tahsin Hassan
wrote:
> Hi all,
>
> I am trying to build out arrow 0.1
Hi all,
I am trying to build out arrow 0.15.1. The dependencies for arrow, e.g. thrift,
double-conversion are in a local source folder and we need to build the
dependencies from that location.
I read up on
https://github.com/apache/arrow/blob/master/docs/source/developers/cpp.rst#offline-builds
Hi, I read from https://github.com/apache/arrow/issues/4216 that it's possible
to disable Gandiva, Plasma, or other components that you do not require.
I'm trying to deploy an AWS Lambda with pandas and pyarrow but I get the error
Unzipped size must be smaller than 262144000 bytes
How can I disa
Ben Kietzman created ARROW-7172:
---
Summary: [C++][Dataset] Improve format of Expression::ToString
Key: ARROW-7172
URL: https://issues.apache.org/jira/browse/ARROW-7172
Project: Apache Arrow
Issu
Yosuke Shiro created ARROW-7171:
---
Summary: [Ruby] Pass Array for Arrow::Table#filter
Key: ARROW-7171
URL: https://issues.apache.org/jira/browse/ARROW-7171
Project: Apache Arrow
Issue Type: New
I made a PR for this issue at https://github.com/apache/arrow/pull/5835.
Would love some more detail about what was intended by the initial issue
and what would be a better way.
On Tue, Nov 12, 2019 at 11:25 AM Joris Van den Bossche <
jorisvandenboss...@gmail.com> wrote:
> Sorry for the delay in
Ok, anything else to discuss? Otherwise I'll plan on a new vote with the
original language + an explicit call-out that dictionary replacement isn't
supported for the file format in the PR
On Thursday, November 14, 2019, Antoine Pitrou wrote:
>
> Right. The dictionaries can be found from the fi
ValueCount includes both null and non-null values. Perhaps a better name
for the method would have been setSize or setLength.
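
To make the point about the value count concrete, here is a toy model in plain Python. It is only a sketch and not the Arrow Java API: the class `ToyVector` and its methods are hypothetical, and the assumption that the count stays at 0 until it is set explicitly is taken from the description elsewhere in this thread.

```python
from typing import List, Optional


class ToyVector:
    """Hypothetical stand-in for a nullable vector with a manual value count."""

    def __init__(self, capacity: int) -> None:
        # None marks a slot that was never written, i.e. a null.
        self._values: List[Optional[int]] = [None] * capacity
        self._value_count = 0

    def set(self, index: int, value: int) -> None:
        # Writing a slot does not change the value count.
        self._values[index] = value

    def set_value_count(self, count: int) -> None:
        # The count is the logical length: nulls and non-nulls together.
        self._value_count = count

    def get_value_count(self) -> int:
        return self._value_count

    def get_null_count(self) -> int:
        # Count the unset slots within the logical length only.
        return sum(1 for v in self._values[: self._value_count] if v is None)


if __name__ == "__main__":
    vec = ToyVector(capacity=8)
    vec.set(0, 10)
    vec.set(2, 30)
    print(vec.get_value_count())  # count before it is set explicitly
    vec.set_value_count(4)
    print(vec.get_value_count(), vec.get_null_count())
```

In this model the count behaves like a length rather than a tally of written values, which is why a name such as setSize or setLength would describe it more accurately.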
On Thursday, November 14, 2019, azim afroozeh wrote:
> Thanks for your answer. I have one more question. In this test function for
> example (
> https://github.com/apache
Thanks for your answer. I have one more question. In this test function for
example (
https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/TestValueVector.java#L1524)
:
there is a for loop which tries to fill in some values but not all values.
It leaves som
Antoine Pitrou created ARROW-7170:
-
Summary: [C++] Bundled ORC fails linking
Key: ARROW-7170
URL: https://issues.apache.org/jira/browse/ARROW-7170
Project: Apache Arrow
Issue Type: Bug
Antoine Pitrou created ARROW-7169:
-
Summary: [C++] Vendor uriparser library
Key: ARROW-7169
URL: https://issues.apache.org/jira/browse/ARROW-7169
Project: Apache Arrow
Issue Type: Wish
Arrow Build Report for Job nightly-2019-11-14-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-11-14-0
Failed Tasks:
- homebrew-cpp:
URL:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-11-14-0-travis-homebrew-cpp
- test-conda-python-3
Thomas Buhrmann created ARROW-7168:
--
Summary: pa.array() doesn't respect provided dictionary type with
all NaNs
Key: ARROW-7168
URL: https://issues.apache.org/jira/browse/ARROW-7168
Project: Apache A
Dear all,
The problem arises from the discussion in a PR:
https://github.com/apache/arrow/pull/5544#discussion_r338394941.
We are trying to come up with proper semantics for comparing values in
UnionVectors.
According to the current logic in the code base, two values from two
UnionVectors are com
Hi Azim,
According to the current API, after filling in some values, you have to set
the value count manually (through the setValueCount method).
Otherwise, the value count remains 0.
Best,
Liya Fan
On Thu, Nov 14, 2019 at 6:33 PM azim afroozeh wrote:
> Thanks for your answer. So the valueCou
Thanks for your answer. So the valueCount shows the number of data filled
in the vector.
Then I would like to ask why the valueCount is 0 after setting some
values. For example: (
https://github.com/apache/arrow/blob/3fbbcdaf77a9e354b6bd07ec1fd1dac005a505c9/java/vector/src/test/java/org/apache
Right. The dictionaries can be found from the file footer, so it seems ok.
Thank you
Regards
Antoine.
On 14/11/2019 at 07:11, Micah Kornfield wrote:
> I'll add for:
>
> If so, how does this play with the fact that there potentially are delta
>> dictionaries in the "stream"?
>
> That in
Joris Van den Bossche created ARROW-7167:
Summary: [CI][Python] Add nightly tests for older pandas versions
to Github Actions
Key: ARROW-7167
URL: https://issues.apache.org/jira/browse/ARROW-7167