That sounds like a good solution. Having the zero-copy behavior depend on
whether you have only one column of a certain type might lead to
surprising results. To avoid yet another keyword, only doing it when
split_blocks=True sounds good to me (in practice, that's also when it will
happen
Micah Kornfield created ARROW-7600:
--
Summary: [C++][Parquet] Add a basic disabled unit test to
exercise nesting functionality
Key: ARROW-7600
URL: https://issues.apache.org/jira/browse/ARROW-7600
Pr
Thanks, PR opened https://github.com/apache/arrow/pull/6216, please help merge
once the build turns green.
--
From:Micah Kornfield
Send Time:2020-01-17 (Friday) 14:53
To:Ji Liu
Cc:dev
Subject:Re: [CI] Java build broken on master
OK,
OK, I've opened https://issues.apache.org/jira/browse/ARROW-7599 to track.
On Thu, Jan 16, 2020 at 10:49 PM Ji Liu wrote:
> I was fixing, and will open a PR later.
>
> Thanks,
> Ji Liu
>
> --
> From:Micah Kornfield
> S
Micah Kornfield created ARROW-7599:
--
Summary: [Java] Fix build break due to change in RangeEqualsVisitor
Key: ARROW-7599
URL: https://issues.apache.org/jira/browse/ARROW-7599
Project: Apache Arrow
I was fixing, and will open a PR later.
Thanks,
Ji Liu
--
From:Micah Kornfield
Send Time:2020-01-17 (Friday) 14:48
To:dev
Subject:[CI] Java build broken on master
This was due to an unexpected conflict between two patche
This was due to an unexpected conflict between two patches I just merged.
I'm going to see if I can fix this quickly, otherwise I will rollback.
Hey Antoine,
Thanks a lot also from my side.
The build is likely currently succeeding due to the Fuzzing work done by
fuzzit. We had loads of crashes in the beginning and fixed tons of edge cases,
especially around null pointer handling.
I also have some code locally for a Parquet fuzzing s
I, too, couldn't find anything that says this would break backwards
compatibility for the binary format. But it probably pays to open an issue
with the flatbuffer team just to be safe.
Two points:
1. I'd like to make sure we are conservative in choosing "definitely
required"
2. Before committing
Rockwell Shabani created ARROW-7598:
---
Summary: Unable to install pyarrow
Key: ARROW-7598
URL: https://issues.apache.org/jira/browse/ARROW-7598
Project: Apache Arrow
Issue Type: Bug
Wes McKinney created ARROW-7597:
---
Summary: [C++] Improvements to CMake configuration console summary
Key: ARROW-7597
URL: https://issues.apache.org/jira/browse/ARROW-7597
Project: Apache Arrow
I created https://issues.apache.org/jira/browse/ARROW-7596 and made it
a blocker for 0.16.0 so this does not get lost in the shuffle
On Thu, Jan 16, 2020 at 3:43 PM Wes McKinney wrote:
>
> hi Joris,
>
> Thanks for investigating this. It seems there were some unintended
> consequences of the zero-
Wes McKinney created ARROW-7596:
---
Summary: [Python] Only apply zero-copy DataFrame block
optimizations when split_blocks=True
Key: ARROW-7596
URL: https://issues.apache.org/jira/browse/ARROW-7596
Projec
If using "required" does not alter the Flatbuffers binary format (it
doesn't seem that it does, it adds non-null assertions on the write
path and additional checks in the read verifiers, is that accurate?),
then it may be worthwhile to set it on "definitely required" fields to
spare clients from ha
hi Joris,
Thanks for investigating this. It seems there were some unintended
consequences of the zero-copy optimizations from ARROW-3789. Another
way forward might be to "opt in" to this behavior, or to only do the
zero copy optimizations when split_blocks=True. What do you think?
- Wes
On Thu,
Neal Richardson created ARROW-7595:
--
Summary: [R][CI] R appveyor job fails on glob
Key: ARROW-7595
URL: https://issues.apache.org/jira/browse/ARROW-7595
Project: Apache Arrow
Issue Type: Bug
Ben Kietzman created ARROW-7594:
---
Summary: [C++] Implement HTTP and FTP file systems
Key: ARROW-7594
URL: https://issues.apache.org/jira/browse/ARROW-7594
Project: Apache Arrow
Issue Type: New
Joris Van den Bossche created ARROW-7593:
Summary: [CI][Python] Python datasets failing on master / not run
on CI
Key: ARROW-7593
URL: https://issues.apache.org/jira/browse/ARROW-7593
Project:
Antoine Pitrou created ARROW-7592:
-
Summary: [C++] Fix crashes on corrupt IPC input
Key: ARROW-7592
URL: https://issues.apache.org/jira/browse/ARROW-7592
Project: Apache Arrow
Issue Type: Bug
Arrow Build Report for Job nightly-2020-01-16-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-16-0
Failed Tasks:
- centos-8:
URL:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-16-0-azure-centos-8
- gandiva-jar-osx:
URL:
htt
Hello,
In Flatbuffers, all fields are optional by default. It means that the
reader can get NULL (in C++) for a missing field. In turn, this means
that message validation (at least in C++) should check all child table
fields for non-NULL. Not only is this burdensome, but it's easy to miss
som
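
The validation burden described in the message above can be sketched in
plain Python. This is only an illustration under stated assumptions: the
Schema and Field classes and the validate_schema helper are hypothetical
stand-ins, not the Arrow IPC schema and not the Flatbuffers API. Every
field defaults to None to mimic a reader that may receive NULL for anything
the writer left out.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical stand-ins for generated table classes (assumption, not the
# real Arrow or Flatbuffers types). Any field may be None because the
# writer was free to omit it.
@dataclass
class Field:
    name: Optional[str] = None
    type_name: Optional[str] = None


@dataclass
class Schema:
    fields: Optional[List[Optional[Field]]] = None


def validate_schema(schema: Optional[Schema]) -> List[str]:
    """Return a list of problems; an empty list means nothing is missing."""
    if schema is None:
        return ["schema is missing"]
    if schema.fields is None:
        return ["schema.fields is missing"]
    problems = []
    for i, field in enumerate(schema.fields):
        # Each child table needs its own check before its members are read.
        if field is None:
            problems.append(f"fields[{i}] is missing")
            continue
        if field.name is None:
            problems.append(f"fields[{i}].name is missing")
        if field.type_name is None:
            problems.append(f"fields[{i}].type_name is missing")
    return problems
```

Every nested table adds another layer of such checks, and forgetting any
one of them leaves a path on which a missing field is dereferenced.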
Joris Van den Bossche created ARROW-7591:
Summary: [Python] DictionaryArray.to_numpy returns dict of parts
instead of numpy array
Key: ARROW-7591
URL: https://issues.apache.org/jira/browse/ARROW-7591
So the Spark integration build started to fail, with the following test
error:
==
ERROR: test_toPandas_batch_order
(pyspark.sql.tests.test_arrow.EncryptionArrowTests)
---
Jiajia Li created ARROW-7590:
Summary: Update .gitignore for thirdparty
Key: ARROW-7590
URL: https://issues.apache.org/jira/browse/ARROW-7590
Project: Apache Arrow
Issue Type: Improvement