date:20200311

[jira] [Created] (ARROW-8082) [Plasma]Add JNI list() interface

2020-03-11 Thread KunshangJi (Jira)

KunshangJi created ARROW-8082: - Summary: [Plasma]Add JNI list() interface Key: ARROW-8082 URL: https://issues.apache.org/jira/browse/ARROW-8082 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-8081) Fix memory size when using huge pages in plasma; other code cleanups

2020-03-11 Thread Siyuan Zhuang (Jira)

Siyuan Zhuang created ARROW-8081: Summary: Fix memory size when using huge pages in plasma; other code cleanups Key: ARROW-8081 URL: https://issues.apache.org/jira/browse/ARROW-8081 Project: Apache Ar

Re: Coordinating / scheduling C++ Parquet-Arrow nested data work (ARROW-1644 and others)

2020-03-11 Thread Micah Kornfield

Another status update. I've integrated the level generation code with the parquet writing code [1]. After that PR is merged I'll add bindings in Python to control versions of the level generation algorithm and plan on moving on to the read side. Thanks, Micah [1] https://github.com/apache/arrow

Re: [DISCUSS] Leveraging cloud computing resources for Arrow test workloads

2020-03-11 Thread Micah Kornfield

> > * Who's going to pay for it? Perhaps Amazon, Google, or Microsoft can > donate cloud compute credits to the project Google has offered a donation of GCP credits based on some estimates I made last year when we were facing Travis CI issues. I'm happy to try to do some integration work to help m

[jira] [Created] (ARROW-8080) [C++] Add AVX512 build option

2020-03-11 Thread Frank Du (Jira)

Frank Du created ARROW-8080: --- Summary: [C++] Add AVX512 build option Key: ARROW-8080 URL: https://issues.apache.org/jira/browse/ARROW-8080 Project: Apache Arrow Issue Type: Improvement Co

Re: [DISCUSS][Java] Support non-nullable vectors

2020-03-11 Thread Jacques Nadeau

Generally I've found that this isn't an important optimization in the use cases we see. Memory overhead, especially with our Java shared allocation scheme, is nominal. Optimizing null checks at the word level usually is much more impactful since non-null and null runs are much more common on a shorter
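
To make the "null checks at the word level" idea concrete, here is a minimal sketch in Python. It is not Arrow's Java code: the function name `count_nulls`, the representation of the validity bitmap as a list of 64-bit integers, and the convention that a set bit means "valid" are all assumptions chosen for illustration. The point is only that a whole word can be skipped with a single comparison when it is entirely valid or entirely null.

```python
WORD_BITS = 64
ALL_VALID = (1 << WORD_BITS) - 1  # a word whose 64 bits are all set


def count_nulls(validity_words, length):
    """Count null slots in a bitmap held as 64-bit ints (hypothetical layout).

    Bit i of the bitmap is 1 when value i is valid and 0 when it is null.
    """
    nulls = 0
    for w, word in enumerate(validity_words):
        start = w * WORD_BITS
        if start >= length:
            break
        # The last word may be only partially used; mask off the padding bits.
        bits_here = min(WORD_BITS, length - start)
        mask = (1 << bits_here) - 1
        word &= mask
        if word == mask:
            continue  # fast path: every value in this word is valid
        if word == 0:
            nulls += bits_here  # fast path: every value in this word is null
            continue
        # Slow path: mixed word, count the zero bits individually.
        nulls += bits_here - bin(word).count("1")
    return nulls


# 64 valid values, then 64 nulls, then 4 values of which one is null.
total_nulls = count_nulls([ALL_VALID, 0, 0b1011], 64 + 64 + 4)
```

With long runs of valid or null values, most words take one of the two fast paths, which is why this kind of check tends to pay off regardless of whether a vector is declared non-nullable.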

[NIGHTLY] Arrow Build Report for Job nightly-2020-03-11-0

2020-03-11 Thread Crossbow

Arrow Build Report for Job nightly-2020-03-11-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-03-11-0 Failed Tasks: - centos-8: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-03-11-0-github-centos-8 - conda-win-vs2015-py36: UR

Re: Summary of RLE and other compression efforts?

2020-03-11 Thread Wes McKinney

On Wed, Mar 11, 2020 at 11:24 AM Evan Chan wrote: > > Sure thing. > > Computation speed needs to be thought about in context We might find > something which takes up half the space to be a little more computationally > expensive, but in the grand scheme of things is faster to compute as mor

[DISCUSS] Leveraging cloud computing resources for Arrow test workloads

2020-03-11 Thread Wes McKinney

hi folks, There has periodically been a discussion about employing dedicated compute resources to serve our testing needs beyond what can be accomplished in free / public CI services like GitHub Actions, Appveyor, etc. For example: * Workloads requiring a CUDA-capable GPU * Tests requiring a lot

Re: [DISCUSS] Semantics of custom_metadata

2020-03-11 Thread Wes McKinney

I opened https://issues.apache.org/jira/browse/ARROW-8079 about the Python question On Wed, Mar 11, 2020 at 2:53 PM Neal Richardson wrote: > > While the underlying storage may allow duplicate keys, it seems much more > likely that someone would end up with duplicate keys by accident than by > des

[jira] [Created] (ARROW-8079) [Python] Implement a wrapper for KeyValueMetadata, duck-typing dict where relevant

2020-03-11 Thread Wes McKinney (Jira)

Wes McKinney created ARROW-8079: --- Summary: [Python] Implement a wrapper for KeyValueMetadata, duck-typing dict where relevant Key: ARROW-8079 URL: https://issues.apache.org/jira/browse/ARROW-8079 Projec

[jira] [Created] (ARROW-8078) [Python] Missing links in the docs regarding field and schema DataTypes

2020-03-11 Thread Otávio Vasques (Jira)

Otávio Vasques created ARROW-8078: - Summary: [Python] Missing links in the docs regarding field and schema DataTypes Key: ARROW-8078 URL: https://issues.apache.org/jira/browse/ARROW-8078 Project: Apac

Re: [DISCUSS] Semantics of custom_metadata

2020-03-11 Thread Neal Richardson

While the underlying storage may allow duplicate keys, it seems much more likely that someone would end up with duplicate keys by accident than by design. And although it may be up to the implementations to determine or enforce uniqueness constraints, it might be a good idea to make a project-level

[jira] [Created] (ARROW-8077) [Python] Add wheel build script and Crossbow configuration for Windows on Python 3.5

2020-03-11 Thread Wes McKinney (Jira)

Wes McKinney created ARROW-8077: --- Summary: [Python] Add wheel build script and Crossbow configuration for Windows on Python 3.5 Key: ARROW-8077 URL: https://issues.apache.org/jira/browse/ARROW-8077 Proj

Re: [DISCUSS] Semantics of custom_metadata

2020-03-11 Thread Wes McKinney

On Wed, Mar 11, 2020 at 2:22 PM Antoine Pitrou wrote: > > On Wed, 11 Mar 2020 12:44:26 -0500 > Wes McKinney wrote: > > On this note, in Python we should probably re-evaluate the data > > structure returned when accessing the "metadata" field. > > I think it's ok for the convenience API to return

Re: [DISCUSS] Semantics of custom_metadata

2020-03-11 Thread Antoine Pitrou

On Wed, 11 Mar 2020 12:44:26 -0500 Wes McKinney wrote: > On this note, in Python we should probably re-evaluate the data > structure returned when accessing the "metadata" field. I think it's ok for the convenience API to return a dict, if we also expose e.g. a "metadata_items" that returns an it

[jira] [Created] (ARROW-8076) [C++] arrow::stl::TupleRangeFromTable example includes wrong signature

2020-03-11 Thread Tomasz Cheda (Jira)

Tomasz Cheda created ARROW-8076: --- Summary: [C++] arrow::stl::TupleRangeFromTable example includes wrong signature Key: ARROW-8076 URL: https://issues.apache.org/jira/browse/ARROW-8076 Project: Apache Ar

Re: [DISCUSS] Semantics of custom_metadata

2020-03-11 Thread Wes McKinney

On this note, in Python we should probably re-evaluate the data structure returned when accessing the "metadata" field. On Wed, Mar 11, 2020 at 12:42 PM Wes McKinney wrote: > > In the C++ library at least, uniqueness is never asserted when reading > and writing the IPC metadata [1] [2]. If you us

Re: [DISCUSS] Semantics of custom_metadata

2020-03-11 Thread Wes McKinney

In the C++ library at least, uniqueness is never asserted when reading and writing the IPC metadata [1] [2]. If you use KeyValueMetadata::FindKey and the keys are non-unique, it will return the first one it finds. KeyValueMetadata::Merge assumes uniqueness, and the KeyValueMetadata::ToUnorderedMap
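
The first-match behaviour described here can be illustrated with a small Python sketch. This is not the Arrow C++ API: `find_key` is a hypothetical stand-in that mimics "return the first match" over a list of key/value pairs, and the sample keys and values are made up. The conversion to a `dict` shows Python's own behaviour with duplicate keys (the last value wins), which is one reason a plain dict is a lossy view of metadata that permits duplicates.

```python
def find_key(pairs, key):
    """Return the index of the first pair whose key matches, or -1 if absent.

    Hypothetical helper for illustration; not part of any Arrow library.
    """
    for i, (k, _) in enumerate(pairs):
        if k == key:
            return i
    return -1


# Metadata stored as an ordered list of pairs, so duplicates are representable.
pairs = [("unit", "ms"), ("origin", "sensor-a"), ("unit", "s")]

first = find_key(pairs, "unit")   # index of the first "unit" entry
first_value = pairs[first][1]     # value seen by a first-match lookup

as_dict = dict(pairs)             # Python keeps the LAST value for a duplicate key
```

A first-match lookup and a dict conversion therefore disagree on the value of `"unit"` in this example, which is the kind of ambiguity the thread is concerned with.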

[DISCUSS] Semantics of custom_metadata

2020-03-11 Thread Ben Kietzman

While working on https://issues.apache.org/jira/browse/ARROW-2255 (serialize custom_metadata in the integration tests), we had the following discussion on GitHub: https://github.com/apache/arrow/pull/6556#pullrequestreview-372405940 In short, although in Schema.fbs custom_metadata is declared as a

Re: Summary of RLE and other compression efforts?

2020-03-11 Thread Evan Chan

Sure thing. Computation speed needs to be thought about in context We might find something which takes up half the space to be a little more computationally expensive, but in the grand scheme of things is faster to compute as more of it can fit in memory, and it saves I/O. I definitely a

[jira] [Created] (ARROW-8075) Loading R.utils after arrow breaks some arrow functions

2020-03-11 Thread Sam Albers (Jira)

Sam Albers created ARROW-8075: - Summary: Loading R.utils after arrow breaks some arrow functions Key: ARROW-8075 URL: https://issues.apache.org/jira/browse/ARROW-8075 Project: Apache Arrow Issue

Re: [DISCUSS][Java] Support non-nullable vectors

2020-03-11 Thread Brian Hulette

> And there is a "nullable" metadata-only flag at the > Field level. Could the same kinds of optimizations be implemented in > Java without introducing a "nullable" concept? Note Liya Fan did suggest pulling the nullable flag from the Field when the vector is created in item (1) of the proposed ch

Re: [DISCUSS][Java] Support non-nullable vectors

2020-03-11 Thread Fan Liya

Hi Micah, Thanks a lot for your valuable comments. Please see my comments inline. > I'm a little concerned that this will change assumptions for at least some > of the clients using the library (some might always rely on the validity > buffer being present). I can understand your concern and I a

[jira] [Created] (ARROW-8074) [C++][Dataset] Support for file-like objects (buffers) in FileSystemDataset?

2020-03-11 Thread Joris Van den Bossche (Jira)

Joris Van den Bossche created ARROW-8074: Summary: [C++][Dataset] Support for file-like objects (buffers) in FileSystemDataset? Key: ARROW-8074 URL: https://issues.apache.org/jira/browse/ARROW-8074

Re: Summary of RLE and other compression efforts?

2020-03-11 Thread Antoine Pitrou

Hi, On 11/03/2020 at 06:31, Micah Kornfield wrote: > > I still think we should be careful on what is added to the spec, in > particular, we should be focused on encodings that can be used to improve > computational efficiency rather than just smaller size. Also, it is > important to note that
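
As a rough illustration of an encoding that helps computation and not only size, the sketch below run-length encodes a column and then sums it directly on the encoded form. The function names `rle_encode` and `rle_sum` and the `(value, run_length)` pair layout are assumptions made for this example; they do not describe any encoding in the Arrow specification.

```python
def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([v, 1])   # start a new run
    return [(v, n) for v, n in runs]


def rle_sum(runs):
    """Sum the column without decoding: one multiply-add per run."""
    return sum(v * n for v, n in runs)


values = [5, 5, 5, 2, 2, 7]
runs = rle_encode(values)
total = rle_sum(runs)
```

Here the sum touches three runs instead of six values; on data with long runs the aggregate does proportionally less work, in addition to reading fewer bytes.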

Re: [Discuss] [Java] Implement vector diff functionality

2020-03-11 Thread Ji Liu

Hi Micah, Thanks for your feedback, you have opened an issue for Google's Truth[1] and it was assigned to me, I'll try to use it. Thanks, Ji Liu [1] https://issues.apache.org/jira/browse/ARROW-6931 -- From:Micah Kornfield Send Ti

Re: [Java] Port vector validate functionality

2020-03-11 Thread Ji Liu

Hi Wes and Micah, Thanks for your valuable suggestion, I will create sub-tasks under this issue as follow-up works when this one is finished. Thanks, Ji Liu -- From:Micah Kornfield Send Time: March 11, 2020 (Wednesday) 13:42 To:dev Cc:Ji Liu

28 matches
