KunshangJi created ARROW-8082:
-
Summary: [Plasma]Add JNI list() interface
Key: ARROW-8082
URL: https://issues.apache.org/jira/browse/ARROW-8082
Project: Apache Arrow
Issue Type: Improvement
Siyuan Zhuang created ARROW-8081:
Summary: Fix memory size when using huge pages in plasma; other
code cleanups
Key: ARROW-8081
URL: https://issues.apache.org/jira/browse/ARROW-8081
Project: Apache Ar
Another status update. I've integrated the level generation code with the
parquet writing code [1].
After that PR is merged I'll add bindings in Python to control versions of
the level generation algorithm and plan on moving on to the read side.
Thanks,
Micah
[1] https://github.com/apache/arrow
>
> * Who's going to pay for it? Perhaps Amazon, Google, or Microsoft can
> donate cloud compute credits to the project
Google has offered a donation of GCP credits based on some estimates I made
last year when we were facing Travis CI issues. I'm happy to try to do some
integration work to help m
Frank Du created ARROW-8080:
---
Summary: [C++] Add AVX512 build option
Key: ARROW-8080
URL: https://issues.apache.org/jira/browse/ARROW-8080
Project: Apache Arrow
Issue Type: Improvement
Co
Generally I've found that this isn't an important optimization in the use
cases we see. Memory overhead, especially with our Java shared allocation
scheme, is nominal. Optimizing null checks at the word level is usually much
more impactful, since non-null and null runs are much more common on a
shorter
Arrow Build Report for Job nightly-2020-03-11-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-03-11-0
Failed Tasks:
- centos-8:
URL:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-03-11-0-github-centos-8
- conda-win-vs2015-py36:
UR
On Wed, Mar 11, 2020 at 11:24 AM Evan Chan wrote:
>
> Sure thing.
>
> Computation speed needs to be thought about in context. We might find
> something which takes up half the space to be a little more computationally
> expensive, but in the grand scheme of things is faster to compute as mor
hi folks,
There has periodically been a discussion about employing dedicated
compute resources to serve our testing needs beyond what can be
accomplished in free / public CI services like GitHub Actions,
Appveyor, etc. For example:
* Workloads requiring a CUDA-capable GPU
* Tests requiring a lot
I opened https://issues.apache.org/jira/browse/ARROW-8079 about the
Python question
On Wed, Mar 11, 2020 at 2:53 PM Neal Richardson
wrote:
>
> While the underlying storage may allow duplicate keys, it seems much more
> likely that someone would end up with duplicate keys by accident than by
> des
Wes McKinney created ARROW-8079:
---
Summary: [Python] Implement a wrapper for KeyValueMetadata,
duck-typing dict where relevant
Key: ARROW-8079
URL: https://issues.apache.org/jira/browse/ARROW-8079
Projec
Otávio Vasques created ARROW-8078:
-
Summary: [Python] Missing links in the docs regarding field and
schema DataTypes
Key: ARROW-8078
URL: https://issues.apache.org/jira/browse/ARROW-8078
Project: Apac
While the underlying storage may allow duplicate keys, it seems much more
likely that someone would end up with duplicate keys by accident than by
design. And although it may be up to the implementations to determine or
enforce uniqueness constraints, it might be a good idea to make a
project-level
Wes McKinney created ARROW-8077:
---
Summary: [Python] Add wheel build script and Crossbow
configuration for Windows on Python 3.5
Key: ARROW-8077
URL: https://issues.apache.org/jira/browse/ARROW-8077
Proj
On Wed, Mar 11, 2020 at 2:22 PM Antoine Pitrou wrote:
>
> On Wed, 11 Mar 2020 12:44:26 -0500
> Wes McKinney wrote:
> > On this note, in Python we should probably re-evaluate the data
> > structure returned when accessing the "metadata" field.
>
> I think it's ok for the convenience API to return
On Wed, 11 Mar 2020 12:44:26 -0500
Wes McKinney wrote:
> On this note, in Python we should probably re-evaluate the data
> structure returned when accessing the "metadata" field.
I think it's ok for the convenience API to return a dict, if we also
expose e.g. a "metadata_items" that returns an it
Tomasz Cheda created ARROW-8076:
---
Summary: [C++] arrow::stl::TupleRangeFromTable example includes
wrong signature
Key: ARROW-8076
URL: https://issues.apache.org/jira/browse/ARROW-8076
Project: Apache Ar
On this note, in Python we should probably re-evaluate the data
structure returned when accessing the "metadata" field.
On Wed, Mar 11, 2020 at 12:42 PM Wes McKinney wrote:
>
> In the C++ library at least, uniqueness is never asserted when reading
> and writing the IPC metadata [1] [2]. If you us
In the C++ library at least, uniqueness is never asserted when reading
and writing the IPC metadata [1] [2]. If you use
KeyValueMetadata::FindKey and the keys are non-unique, it will return
the first one it finds. KeyValueMetadata::Merge assumes uniqueness,
and the KeyValueMetadata::ToUnorderedMap
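
To make the duplicate-key concern concrete, here is a minimal Python sketch. It is not the Arrow API: `find_key` and `to_dict` are hypothetical helpers, and metadata is modeled as a plain list of (key, value) pairs. It only illustrates how a first-match lookup and a conversion to a map can disagree once keys repeat.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for key/value metadata stored as an ordered list of
# pairs, which permits duplicate keys. This is NOT pyarrow or Arrow C++ code.
Pairs = List[Tuple[str, str]]


def find_key(pairs: Pairs, key: str) -> int:
    """Return the index of the first pair whose key matches, or -1 if absent."""
    for i, (k, _) in enumerate(pairs):
        if k == key:
            return i
    return -1


def to_dict(pairs: Pairs) -> Dict[str, str]:
    """Collapse the pairs into a dict; in Python the last duplicate wins."""
    return dict(pairs)


pairs = [("unit", "ms"), ("source", "sensor-a"), ("unit", "s")]

first = pairs[find_key(pairs, "unit")][1]   # first match in storage order
collapsed = to_dict(pairs)["unit"]          # value kept after collapsing
print(first, collapsed)
```

With duplicate keys present, the two access paths return different values for the same key, which is the kind of silent inconsistency a uniqueness rule (or an items-style accessor that preserves every pair) would avoid.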
While working on https://issues.apache.org/jira/browse/ARROW-2255
(serialize custom_metadata in the integration tests), we had the following
discussion on GitHub:
https://github.com/apache/arrow/pull/6556#pullrequestreview-372405940
In short, although in Schema.fbs custom_metadata is declared as a
Sure thing.
Computation speed needs to be thought about in context. We might find
something which takes up half the space to be a little more computationally
expensive, but in the grand scheme of things is faster to compute as more of it
can fit in memory, and it saves I/O. I definitely a
Sam Albers created ARROW-8075:
-
Summary: Loading R.utils after arrow breaks some arrow functions
Key: ARROW-8075
URL: https://issues.apache.org/jira/browse/ARROW-8075
Project: Apache Arrow
Issue
> And there is a "nullable" metadata-only flag at the
> Field level. Could the same kinds of optimizations be implemented in
> Java without introducing a "nullable" concept?
Note Liya Fan did suggest pulling the nullable flag from the Field when the
vector is created in item (1) of the proposed ch
Hi Micah,
Thanks a lot for your valuable comments. Please see my comments inline.
> I'm a little concerned that this will change assumptions for at least some
> of the clients using the library (some might always rely on the validity
> buffer being present).
I can understand your concern and I a
Joris Van den Bossche created ARROW-8074:
Summary: [C++][Dataset] Support for file-like objects (buffers) in
FileSystemDataset?
Key: ARROW-8074
URL: https://issues.apache.org/jira/browse/ARROW-8074
Hi,
On 11/03/2020 at 06:31, Micah Kornfield wrote:
>
> I still think we should be careful on what is added to the spec, in
> particular, we should be focused on encodings that can be used to improve
> computational efficiency rather than just smaller size. Also, it is
> important to note that
Hi Micah,
Thanks for your feedback. You opened an issue for Google's Truth [1] and it
was assigned to me; I'll try to use it.
Thanks,
Ji Liu
[1] https://issues.apache.org/jira/browse/ARROW-6931
--
From:Micah Kornfield
Send Ti
Hi Wes and Micah,
Thanks for your valuable suggestion. I will create sub-tasks under this issue
as follow-up work once this one is finished.
Thanks,
Ji Liu
--
From:Micah Kornfield
Send Time: March 11, 2020 (Wednesday) 13:42
To:dev
Cc:Ji Liu