Wes McKinney created ARROW-827:
--
Summary: [Python] Variety of Parquet improvements to support Dask
integration
Key: ARROW-827
URL: https://issues.apache.org/jira/browse/ARROW-827
Project: Apache Arrow
Robert Nishihara created ARROW-826:
--
Summary: Compilation error on Mac with -DARROW_PYTHON=on
Key: ARROW-826
URL: https://issues.apache.org/jira/browse/ARROW-826
Project: Apache Arrow
Issue
Made a Jira for this issue: https://issues.apache.org/jira/browse/ARROW-822
On Apr 13, 2017 11:31 AM, "Bryan Cutler" wrote:
> Hi Devs,
>
> What is the recommended way to use the pyarrow StreamWriter to write to a
> socket? I've tried the following:
>
> - Use socket directly and get "TypeError: Unable
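For readers following the socket question above, here is a minimal sketch of one general Python technique: wrapping a connected socket in a file-like object with the standard library's `socket.makefile`. It uses only the standard library and does not call pyarrow. Whether the StreamWriter discussed in this thread accepts such an object as a sink is an assumption the thread does not confirm. The helper names `socket_sink`, `send_bytes`, and `demo` are hypothetical, and the payload is placeholder bytes rather than real Arrow stream data.

```python
import socket


def socket_sink(sock):
    """Wrap a connected socket in a buffered, write-only file-like object."""
    return sock.makefile("wb")


def send_bytes(sock, payload):
    """Write payload through the file-like wrapper, then flush it."""
    sink = socket_sink(sock)
    try:
        sink.write(payload)
        sink.flush()
    finally:
        # Closes only the wrapper; the underlying socket stays open.
        sink.close()


def demo():
    """Send placeholder bytes over a local socket pair and read them back."""
    left, right = socket.socketpair()
    try:
        send_bytes(left, b"arrow-stream-bytes")
        # Signal end of data so the reader sees EOF.
        left.shutdown(socket.SHUT_WR)
        received = b""
        while True:
            chunk = right.recv(4096)
            if not chunk:
                break
            received += chunk
        return received
    finally:
        left.close()
        right.close()
```

Any writer that only needs `write` and `flush` on its sink could, in principle, be pointed at the object returned by `socket_sink`.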
If I'm the sole voice on this perspective, I'll concede the point.
I didn't even catch the increase in allowed record batch sizes as part of
ARROW-661 and ARROW-679. :(
I'm of two minds about the ideas there:
- We need more applications so making sure that we have the features
available to supp
It seems like we could address these concerns by adding alternate
write/read APIs that do the dropping (on write) / skipping (on load)
automatically, so it doesn't have to bubble up into application logic.
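A minimal sketch of the idea in the paragraph above, in plain Python rather than any Arrow API: a write helper that drops zero-length batches and a read helper that skips them, so the caller never sees the empty case. The names `write_dropping_empty` and `read_skipping_empty` are hypothetical, and a "batch" here is just a list of rows standing in for a record batch.

```python
from typing import Iterable, Iterator, List, Sequence

# Hypothetical stand-in for a record batch: a sequence of row tuples.
Batch = Sequence[tuple]


def write_dropping_empty(batches: Iterable[Batch], sink: List[Batch]) -> int:
    """Append only non-empty batches to sink; return how many were written."""
    written = 0
    for batch in batches:
        if len(batch) == 0:
            continue  # drop zero-length batches on write
        sink.append(batch)
        written += 1
    return written


def read_skipping_empty(source: Iterable[Batch]) -> Iterator[Batch]:
    """Yield only non-empty batches from source."""
    for batch in source:
        if len(batch) > 0:  # skip zero-length batches on load
            yield batch
```

With wrappers like these, the base write/read path can still carry zero-length batches, while applications that cannot handle them opt in to the filtering variants.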
On Fri, Apr 14, 2017 at 7:56 PM, Wes McKinney wrote:
> > Since Arrow already requires a ba
> Since Arrow already requires a batch to be no larger than 2^16-1 records in
> size, it won't map 1:1 to an arbitrary construct.
This is only true of some Arrow applications (e.g. Drill), which is why I
think this is an application-level concern. In ARROW-661 and ARROW-679, we
modified the metadata
To Jason's comments:
Data and control flow should be separate. Schema (a.k.a. a head-type
message) is already defined separate from a batch of records. I'm all for a
termination message as well from a stream perspective. (I don't think it
makes sense to couple record batch size to termination-- I'
Wes McKinney created ARROW-825:
--
Summary: [Python] Generalize pyarrow.from_pylist to accept any
object implementing the PySequence protocol
Key: ARROW-825
URL: https://issues.apache.org/jira/browse/ARROW-825
Speaking as a relative outsider, having the boundary cases for a transfer
protocol be MORE restrictive than the senders and receivers is asking for
boundary bugs.
In this case, both the senders and receivers think that the boundary is 0
(empty lists, empty data frames, 0 results from a database). H
I'm with Wes on this one. A bunch of systems have constructs that deal with
zero-length collections, lists, iterators, etc. These are established
patterns, and everyone knows they need to handle the empty case. Forcing
applications to create an unnecessary protocol complexity of a special
sentinel
Julien Le Dem created ARROW-824:
---
Summary: Date and Time Vectors should reflect timezone-less
semantics
Key: ARROW-824
URL: https://issues.apache.org/jira/browse/ARROW-824
Project: Apache Arrow
Here is valid pyarrow code that works right now:
import pyarrow as pa
rb = pa.RecordBatch.from_arrays([
pa.from_pylist([1, 2, 3]),
pa.from_pylist(['foo', 'bar', 'baz'])
], names=['one', 'two'])
batches = [rb, rb.slice(0, 0)]
stream = pa.InMemoryOutputStream()
writer = pa.StreamWriter(s
Hey All,
I had a quick comment on ARROW-783 that Wes responded to and I wanted to
elevate the conversation here for a moment.
My suggestion there was that we should disallow zero-length batches.
Wes thought that should be an application level concern. I wanted to see
what others thought.
My gen
I reviewed the currently pending PRs on the java side.
I opened 2 PRs for the open Java JIRAs from the list: ARROW-777, ARROW-720
On Fri, Apr 14, 2017 at 12:55 PM, Julien Le Dem wrote:
> I'm looking through them
>
> On Fri, Apr 14, 2017 at 9:26 AM, Wes McKinney wrote:
>
>> hi all,
>>
>> I'm w
Wes McKinney created ARROW-823:
--
Summary: [Python] Devise a means to serialize arrays of arbitrary
Python objects in Arrow IPC messages
Key: ARROW-823
URL: https://issues.apache.org/jira/browse/ARROW-823
I'm looking through them
On Fri, Apr 14, 2017 at 9:26 AM, Wes McKinney wrote:
> hi all,
>
> I'm working to close out the remaining Python and C++ stuff we wanted
> to get in to 0.3 for the sake of other Python projects that want to
> use Arrow.
>
> There are 8 patches up that touch the Java code
Bryan Cutler created ARROW-822:
--
Summary: [Python] StreamWriter fails to open with socket as sink
Key: ARROW-822
URL: https://issues.apache.org/jira/browse/ARROW-822
Project: Apache Arrow
Issue
hi all,
I'm working to close out the remaining Python and C++ stuff we wanted
to get in to 0.3 for the sake of other Python projects that want to
use Arrow.
There are 8 patches up that touch the Java codebase. If we can get all
these closed out then I think we should be able to cut a release
cand
Phillip Cloud created ARROW-821:
---
Summary: Extra file _table_api.h generated during Python build
process
Key: ARROW-821
URL: https://issues.apache.org/jira/browse/ARROW-821
Project: Apache Arrow