Ji Liu created ARROW-6912:
-
Summary: [Java] Extract a common base class for avro converter
consumers
Key: ARROW-6912
URL: https://issues.apache.org/jira/browse/ARROW-6912
Project: Apache Arrow
Issue
Liya Fan created ARROW-6911:
---
Summary: [Java] Provide composite comparator
Key: ARROW-6911
URL: https://issues.apache.org/jira/browse/ARROW-6911
Project: Apache Arrow
Issue Type: New Feature
Hi John and Wes,
A few thoughts:
One of the issues that we didn't get into in prior discussions is that the
proposal essentially changes the unit of exchange from RecordBatches to
a segment of a RecordBatch.
I think I brought this up earlier in discussions, an interesting idea that
Trill [1], a
I was definitely considering having control messages without data, and
I thought that could be encoded by a FlightData with only app_metadata
set. I think I understand your position now: FlightData should always
carry (some) data (with optional metadata)?
That makes sense to me, and is consistent
V Luong created ARROW-6910:
--
Summary: pyarrow.parquet.read_table(...) takes up lots of memory
which is not released until program exits
Key: ARROW-6910
URL: https://issues.apache.org/jira/browse/ARROW-6910
P
"that's where the danger lies"
What danger? I have no idea what the specific danger is, assuming that all
reference implementations have test cases that hedge around this.
I contend that it can only be useful and will never be harmful. What are
the counter-examples of concrete harm?
Wes McKinney created ARROW-6909:
---
Summary: [Python] Define PyObjectBuffer with Py_XDECREF logic in
destructor for object array memory
Key: ARROW-6909
URL: https://issues.apache.org/jira/browse/ARROW-6909
On Wed, Oct 16, 2019 at 12:32 PM John Muehlhausen wrote:
>
> I really need to "get into the zone" on some other development today, but I
> want to remind us of something earlier in the thread that gave me the
> impression I wasn't stomping on too many paradigms with this proposal:
>
> Wes: ``So th
hi John,
On Wed, Oct 16, 2019 at 11:59 AM John Muehlhausen wrote:
>
> I'm in Python, I'm a user, and I'm not allowed to import pyarrow because it
> isn't for me.
I think you're misrepresenting what I'm saying.
It's our expectation that users will largely consume pyarrow
indirectly as a depende
Aryan Naraghi created ARROW-6908:
Summary: Add support for Bazel
Key: ARROW-6908
URL: https://issues.apache.org/jira/browse/ARROW-6908
Project: Apache Arrow
Issue Type: New Feature
Danyang created ARROW-6907:
--
Summary: Allow Plasma store to batch notifications to clients
Key: ARROW-6907
URL: https://issues.apache.org/jira/browse/ARROW-6907
Project: Apache Arrow
Issue Type: Imp
I really need to "get into the zone" on some other development today, but I
want to remind us of something earlier in the thread that gave me the
impression I wasn't stomping on too many paradigms with this proposal:
Wes: ``So the "length" field in RecordBatch is already the utilized number
of row
Prudhvi Porandla created ARROW-6906:
---
Summary: Use re2 instead of std::regex in Dataset partitionschemes
implementation
Key: ARROW-6906
URL: https://issues.apache.org/jira/browse/ARROW-6906
Project:
I'm in Python, I'm a user, and I'm not allowed to import pyarrow because it
isn't for me.
There exist some Arrow record batches in plasma. I need to get one slice
of one batch as a pandas dataframe.
What do I do?
There exist some Arrow record batches in a file. I need to get one slice
of one
The OSX builds are failing because Homebrew tries to compile the
dependencies instead of installing the precompiled binaries.
It might be because of the outdated Xcode version we use; perhaps brew has
stopped providing binaries for older Xcode versions.
I've created a tracking jira
https://issues.apache.org/j
Krisztian Szucs created ARROW-6905:
--
Summary: [Packaging][OSX] Nightly builds on MacOS are failing
because of brew compile timeouts
Key: ARROW-6905
URL: https://issues.apache.org/jira/browse/ARROW-6905
Attendees:
Micah Kornfield
Uwe Korn
Bryan Cutler
Rok Mihevc
Prudhvi Porandla
Ursa Labs (Antoine, Ben, François, Joris, Krisztián, Neal, Wes, in the
same room!)
Discussion:
* Cython in conda: Uwe to update
* When to do 0.15.1? There are only 2 open issues left tagged with
0.15.1. Only bug fixes.
Bryan Cutler created ARROW-6904:
---
Summary: [Python] Implement MapArray and MapType
Key: ARROW-6904
URL: https://issues.apache.org/jira/browse/ARROW-6904
Project: Apache Arrow
Issue Type: Improv
On Wed, Oct 16, 2019 at 10:17 AM John Muehlhausen wrote:
>
> "pyarrow is intended as a developer-facing library, not a user-facing one"
>
> Is that really the core issue? I doubt you would want to add this proposed
> logic to pandas even though it is user-facing, because then pandas will
> either
Hey David,
RE: Async: I was trying to match the pattern we use for doget/doput for
async. Yes, I was thinking more of Java, given Java gRPC's always-async pattern.
On the comment around the FlightData, I think it is overloading the message
to use metadata for this. If I want to send a control message indep
"pyarrow is intended as a developer-facing library, not a user-facing one"
Is that really the core issue? I doubt you would want to add this proposed
logic to pandas even though it is user-facing, because then pandas will
either have to re-implement what it means to read a batch (to respect
lengt
Wes McKinney created ARROW-6903:
---
Summary: [Python] Wheels broken after ARROW-6860 changes
Key: ARROW-6903
URL: https://issues.apache.org/jira/browse/ARROW-6903
Project: Apache Arrow
Issue Type
hi John,
> As a practical matter, the reason metadata is not a good solution for me is
> that it requires awareness on the part of the reader. I want (e.g.) a
> researcher in Python to be able to map a file of batches in IPC format
> without needing to worry about the fact that the file was bu
Francois Saint-Jacques created ARROW-6902:
-
Summary: [C++] Add String*/Binary* support for Compare kernels
Key: ARROW-6902
URL: https://issues.apache.org/jira/browse/ARROW-6902
Project: Apache
Perhaps meson is also worth exploring?
Le 15/10/2019 à 23:06, Micah Kornfield a écrit :
Hi Wes,
I agree on both counts: it won't be done in the short term, and it
makes sense to tackle it incrementally. Like I said I don't have much
bandwidth at the moment but might be able to re-arr
Hi all, our biweekly call is coming up in a couple of hours at
https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes
will be sent out to the mailing list afterwards.
Neal
Arrow Build Report for Job nightly-2019-10-16-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-16-0
Failed Tasks:
- wheel-manylinux1-cp27mu:
URL:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-16-0-travis-wheel-manylinux1-cp27mu
Matthew Franglen created ARROW-6901:
---
Summary: [Rust][Parquet] Rust Parquet SerializedFileWriter writes
total_num_rows as zero
Key: ARROW-6901
URL: https://issues.apache.org/jira/browse/ARROW-6901
P
Sayed Mohammad Hossein Torabi created ARROW-6900:
Summary: PyArrow can't serialize pandas IntegerArray
Key: ARROW-6900
URL: https://issues.apache.org/jira/browse/ARROW-6900
Project: Apac
Razvan Chitu created ARROW-6899:
---
Summary: to_pandas() not implemented on
list
Key: ARROW-6899
URL: https://issues.apache.org/jira/browse/ARROW-6899
Project: Apache Arrow
Issue Type: Bug
I'll plan on starting a vote in the next day or two if there are no further
objections/comments.
On Sun, Oct 13, 2019 at 11:06 AM Micah Kornfield wrote:
> I think the only point raised on the PR that is worth discussing is
> assumptions about dictionaries at the beginning of streams.
>
>
Still thinking through the implications here, but to save others from
having to go search, [1] is the PR.
[1] https://github.com/apache/arrow/pull/5663/files
On Tue, Oct 15, 2019 at 1:42 PM John Muehlhausen wrote:
> A proposal with linked PR now exists in ARROW-5916 and Wes commented that
> we s