Re: PyArrow and Parquet DELTA_BINARY_PACKED

2018-05-14 Thread Feras Salim
Hi Uwe, I'm quite confused by the findings. I'm attaching a bunch of files corresponding to the version and library generating the files. On the first topic of DELTA_BINARY_PACKED: it seems it's something not well supported on the Java side as well, or my implementation is off, but I just copied ov

Re: Symbol not found: _PyCObject_Type (MacOS El Capitan, Python 3.6)

2018-05-14 Thread Quang Vu
Yes Antoine, that happens when compiling Arrow under an activated conda environment. Thank you for all the info you are helping me with! Quang. On Mon, May 14, 2018 at 3:34 PM Antoine Pitrou wrote: > > To give a bit more insight: you should compile Arrow with your conda > environment activated,

Re: Symbol not found: _PyCObject_Type (MacOS El Capitan, Python 3.6)

2018-05-14 Thread Antoine Pitrou
To give a bit more insight: you should compile Arrow with your conda environment activated, so that it picks the right Python version (3.6.5, in your case). If it's still picking the wrong Python version, that might be a bug. Regards Antoine. On 14/05/2018 at 20:50, Quang Vu wrote: > Thanks

Re: Symbol not found: _PyCObject_Type (MacOS El Capitan, Python 3.6)

2018-05-14 Thread Quang Vu
Thanks Antoine, I will need to learn more about the compiling process that happens on my Mac, to see how that links to Python 2. I am not familiar with that process. But this is a good pointer for my issue. Thank you for your response to my issue! Quang. On Mon, May 14, 2018 at 12:50 PM Antoine

[jira] [Created] (ARROW-2581) [Java] Unify reset() interface for vectors

2018-05-14 Thread Li Jin (JIRA)
Li Jin created ARROW-2581: - Summary: [Java] Unify reset() interface for vectors Key: ARROW-2581 URL: https://issues.apache.org/jira/browse/ARROW-2581 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-2580) [GLib] Fix abs functions for Decimal128

2018-05-14 Thread yosuke shiro (JIRA)
yosuke shiro created ARROW-2580: --- Summary: [GLib] Fix abs functions for Decimal128 Key: ARROW-2580 URL: https://issues.apache.org/jira/browse/ARROW-2580 Project: Apache Arrow Issue Type: Improv

Re: Symbol not found: _PyCObject_Type (MacOS El Capitan, Python 3.6)

2018-05-14 Thread Antoine Pitrou
Hi Quang, It sounds like you have compiled Arrow against a Python 2 install but are now trying to use it with Python 3. This won't work: the same Python version must be used when compiling and when using PyArrow. ("PyCObject" is a Python 2-specific API that doesn't exist anymore in Python 3.) R
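
A minimal sketch of how one might check which interpreter is active before building, so that the build and the later import use the same Python. It uses only the standard library; the helper name `interpreter_summary` is hypothetical and is not part of Arrow or its build scripts.

```python
import sys
import sysconfig


def interpreter_summary():
    """Describe the Python interpreter that is currently running."""
    return {
        # Full version string, e.g. "3.6.5" in the setup described above.
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Major version: 2 vs 3 is the mismatch discussed in this thread.
        "major": sys.version_info.major,
        # Path of the interpreter binary (should live inside the conda env).
        "executable": sys.executable,
        # Header directory a C extension build would normally compile against.
        "include_dir": sysconfig.get_paths()["include"],
    }


if __name__ == "__main__":
    for key, value in interpreter_summary().items():
        print(f"{key}: {value}")
```

Running this once in the shell used for the build and once in the shell used to import PyArrow should report the same version and executable; a difference would point to the kind of mismatch described above.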

Symbol not found: _PyCObject_Type (MacOS El Capitan, Python 3.6)

2018-05-14 Thread Quang Vu
Hi Arrow dev, I am having trouble with installing and setting up my development environment for Arrow. I wonder if anyone is familiar with the issue. My system info: - MacOS 10.11.6 (El Capitan) - conda 4.5.1 - python 3.6.5 - arrow's current commit: 4b8511 Installing Arrow C++ libraries and Parquet

[jira] [Created] (ARROW-2579) Appending to stremable table file format doesnt seem to work

2018-05-14 Thread Rob Ambalu (JIRA)
Rob Ambalu created ARROW-2579: - Summary: Appending to stremable table file format doesnt seem to work Key: ARROW-2579 URL: https://issues.apache.org/jira/browse/ARROW-2579 Project: Apache Arrow

RE: Appending to streaming file format

2018-05-14 Thread Ambalu, Robert
Will do, thx -Original Message- From: Antoine Pitrou [mailto:anto...@python.org] Sent: Monday, May 14, 2018 11:18 AM To: dev@arrow.apache.org Subject: Re: Appending to streaming file format On 14/05/2018 at 17:17, Ambalu, Robert wrote: > Cool, thanks Antoine. So this fixes being able

Re: Appending to streaming file format

2018-05-14 Thread Antoine Pitrou
On 14/05/2018 at 17:17, Ambalu, Robert wrote: > Cool, thanks Antoine. So this fixes being able to append to > FileOutputStream, but it still seems as though appending to an existing > streaming table is not supported, is that correct? I'm not sure about that. I think the best is to open an iss

RE: Appending to streaming file format

2018-05-14 Thread Ambalu, Robert
Cool, thanks Antoine. So this fixes being able to append to FileOutputStream, but it still seems as though appending to an existing streaming table is not supported, is that correct? -Original Message- From: Antoine Pitrou [mailto:anto...@python.org] Sent: Monday, May 14, 2018 11:07 AM To

Re: Appending to streaming file format

2018-05-14 Thread Antoine Pitrou
On 14/05/2018 at 16:37, Ambalu, Robert wrote: > > Also, fyi, I opened a ticket last week that append is broken with the > FileOutputStream (unrelated to this email thread) > https://github.com/apache/arrow/issues/2018 Sorry, I hadn't seen your ticket (if you have found an actual bug, it's p

Appending to streaming file format

2018-05-14 Thread Ambalu, Robert
Hey, as far as I can tell it looks like appending to a streaming file format isn't currently supported, is that right? RecordBatchStreamWriter always writes the schema up front, and it doesn't look like a schema is expected mid-file (assuming I'm doing this append test correctly, this is the err
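
To illustrate the situation described in this message, here is a toy model only: it is not Arrow's IPC wire format and does not use PyArrow. It assumes a stream made of length-prefixed JSON messages, where a writer always emits a schema message first and a reader rejects a schema message anywhere else. All names (`write_stream`, `read_stream`, `_write_message`) are hypothetical.

```python
import io
import json
import struct


def _write_message(sink, message):
    """Write one message as a 4-byte little-endian length plus JSON payload."""
    payload = json.dumps(message).encode("utf-8")
    sink.write(struct.pack("<I", len(payload)))
    sink.write(payload)


def write_stream(sink, schema, batches):
    """Write a toy stream: the schema message first, then one per batch."""
    _write_message(sink, {"type": "schema", "fields": schema})
    for batch in batches:
        _write_message(sink, {"type": "batch", "rows": batch})


def read_stream(source):
    """Read a toy stream, rejecting a schema message after the first one."""
    schema, batches = None, []
    while True:
        header = source.read(4)
        if len(header) < 4:
            break  # end of stream
        (length,) = struct.unpack("<I", header)
        message = json.loads(source.read(length).decode("utf-8"))
        if message["type"] == "schema":
            if schema is not None:
                raise ValueError("unexpected schema message mid-stream")
            schema = message["fields"]
        else:
            batches.append(message["rows"])
    return schema, batches


if __name__ == "__main__":
    buf = io.BytesIO()
    write_stream(buf, ["a"], [[1, 2]])
    buf.seek(0, io.SEEK_END)
    write_stream(buf, ["a"], [[3]])  # naive "append" with a second writer
    buf.seek(0)
    try:
        read_stream(buf)
    except ValueError as exc:
        print("read failed:", exc)
```

In this toy model, appending with a fresh writer fails on read because the second writer repeats the schema. Whether Arrow's own reader behaves this way is exactly the question the message raises.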

PyCon Sprint Room

2018-05-14 Thread Alex Hagerman
A couple of us are set up and working on some Arrow tickets in room 14 at PyCon if anybody else is here and wants to join. Alex

Re: Continuous benchmarking setup

2018-05-14 Thread Brian Hulette
Is anyone aware of a way we could set up similar continuous benchmarks for JS? We wrote some benchmarks earlier this year but currently have no automated way of running them. Brian On 05/11/2018 08:21 PM, Wes McKinney wrote: Thanks Tom and Antoine! Since these benchmarks are literally runni

Arrow 1319 [Python] Add additional HDFS filesystem methods

2018-05-14 Thread Alex Hagerman
Hello, I was reviewing tickets to work on during the sprint days at PyCon and came across 1319. https://issues.apache.org/jira/browse/ARROW-1319 I was going to pick this up and see what I could do with it. I read the history and wanted to check if there have been any changes that might impact t