Re: Snappy illegal instructions errors & pypi build

2020-07-01 Thread Weston Pace
Great. That is what I was hoping for. Thanks. On Wed, Jul 1, 2020 at 3:07 PM Wes McKinney wrote: > The conda packages are used on Windows to build the wheel so the next > release should contain this fix > > > https://github.com/apache/arrow/blob/master/dev/tasks/python-wheels/win-build.bat#L25

Re: Snappy illegal instructions errors & pypi build

2020-07-01 Thread Wes McKinney
The conda packages are used on Windows to build the wheel so the next release should contain this fix https://github.com/apache/arrow/blob/master/dev/tasks/python-wheels/win-build.bat#L25 On Wed, Jul 1, 2020 at 4:47 PM Weston Pace wrote: > > I have a customer that has encountered what I believe

Snappy illegal instructions errors & pypi build

2020-07-01 Thread Weston Pace
I have a customer that has encountered what I believe to be https://issues.apache.org/jira/browse/ARROW-9114. They are running Windows. They receive an illegal instruction exception on pyarrow.parquet.read_table. Their processor (i5-3470) does not support BMI2. The customer is using the PyPI dis
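(A hedged sketch of the reported call path, not code from the original message; "example.parquet" is a placeholder for a Snappy-compressed file.)

    import pyarrow.parquet as pq

    # On the affected machine (i5-3470, no BMI2), this call reportedly dies with
    # an illegal instruction instead of raising a Python exception.
    table = pq.read_table("example.parquet")
    print(table.num_rows)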

Upcoming JS fixes and release timeline

2020-07-01 Thread Paul Taylor
The TypeScript compiler has made breaking changes in recent releases, meaning we can't easily upgrade past 3.5, and projects on 3.6+ can't compile our types. I'm working on upgrading our tsc dependency to 3.9. The fixes could include a few backwards-incompatible API changes, and might not be do

RE: Decimal128 scale limits

2020-07-01 Thread Kazuaki Ishizaki
Hi, According to https://arrow.apache.org/docs/cpp/api/utilities.html, Decimal128 comes from the Apache ORC C++ implementation. Looking at the Hive documentation at https://hive.apache.org/javadocs/r1.2.2/api/index.html?org/apache/hadoop/hive/common/type/Decimal128.html, there is the following state

Re: Decimal128 scale limits

2020-07-01 Thread Jacek Pliszka
Hi! I am aware of at least two different decimal128 things: a) the one we have, where we use 128 bits to store an integer that is later shifted by the scale; 38 is the number of digits of the significand, i.e. the digits fitting in 128 bits (2**128/10**38), and IMHO it is completely unrelated to the scale, which we sto
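(Quick arithmetic behind the 38-digit figure, as a short Python sketch; not from the original message.)

    # A signed 128-bit integer holds magnitudes up to 2**127 - 1, which lies
    # between 10**38 and 10**39, so any 38-digit significand fits, while a
    # 39-digit one may not.
    print(10**38 < 2**127 < 10**39)   # True
    print(len(str(2**127 - 1)))       # 39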

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Joris Van den Bossche
I am personally fine with removing the compute dunder methods again (i.e. Array.__richcmp__), if that resolves the ambiguity. Although they *are* convenient IMO, even for developers (the question might also come up of whether we want to add __add__, __sub__, etc., though). So it could also be an option to say th
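(A sketch of the explicit-compute alternative being weighed against the dunder methods, assuming the current pyarrow.compute module; not code from the thread.)

    import pyarrow as pa
    import pyarrow.compute as pc

    a = pa.array([1, 2, None])
    b = pa.array([10, 20, 30])

    # Named compute calls instead of __add__ / __richcmp__ overloads.
    print(pc.add(a, b))    # [11, 22, null]
    print(pc.equal(a, b))  # [false, false, null]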

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Wes McKinney
I think we need to have a hard separation between "data structure equality" (do these objects contain equivalent data) and "analytical/semantic equality". The latter is more the domain of pyarrow.compute and I am not sure we should be overloading dunder methods with compute functions. I might recom
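(An illustration of the proposed separation, assuming current pyarrow APIs; a sketch, not code from the thread.)

    import pyarrow as pa
    import pyarrow.compute as pc

    a = pa.array([1, 2, None])
    b = pa.array([1, 2, None])

    # Data structure equality: do these objects contain equivalent data?
    print(a.equals(b))     # True

    # Analytical/semantic equality via pyarrow.compute: element-wise, nulls propagate.
    print(pc.equal(a, b))  # [true, true, null]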

Decimal128 scale limits

2020-07-01 Thread Antoine Pitrou
Hello, Are there limits to the value of the scale for either decimal128 or decimal? Can it be negative? Can it be greater than 38 (and/or lower than -38)? It's not clear from looking either at the spec or at the C++ code... Regards Antoine.

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Maarten Breddels
I think that if __eq__ does not return True/False exclusively, __bool__ should raise an exception to avoid these unexpected truthy results. Python users are used to that from NumPy. On Wed, 1 Jul 2020 at 15:40, Joris Van den Bossche <jorisvandenboss...@gmail.com> wrote: > On Wed, 1 Jul 2020 at 09:4
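(The NumPy precedent being referred to, as a short sketch.)

    import numpy as np

    result = np.array([1, 2, 3]) == np.array([1, 2, 4])
    print(result)      # [ True  True False] -- element-wise, not a single bool
    try:
        bool(result)   # NumPy refuses to pick a truth value for a multi-element result
    except ValueError as exc:
        print(exc)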

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Joris Van den Bossche
On Wed, 1 Jul 2020 at 09:46, Antoine Pitrou wrote: > > Hello, > > Recent changes to PyArrow seem to have taken the stance that comparing > null values should return null. Small note that it is not a *very* recent change ( https://github.com/apache/arrow/pull/5330, ARROW-6488

Re: [DISCUSS] Ongoing LZ4 problems with Parquet files

2020-07-01 Thread Antoine Pitrou
I don't have a sense of how conservative Parquet users generally are. Is it worth adding an LZ4_FRAMED compression option to the Parquet format, or would people just not use it? Regards, Antoine. On Tue, 30 Jun 2020 14:33:17 +0200 "Uwe L. Korn" wrote: > I'm also in favor of disabling support f
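(Context sketch, not from the thread: in the Python API the codec is chosen per write, which is where a hypothetical LZ4_FRAMED option would appear alongside the existing choices; file names are placeholders.)

    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.table({"x": [1, 2, 3]})
    pq.write_table(table, "data_snappy.parquet", compression="snappy")
    pq.write_table(table, "data_lz4.parquet", compression="lz4")  # the codec family under discussion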

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Krisztián Szűcs
On Wed, Jul 1, 2020 at 9:46 AM Antoine Pitrou wrote: > > > Hello, > > Recent changes to PyArrow seem to have taken the stance that comparing > null values should return null. This is actually how the previous versions work: https://github.com/apache/arrow/blob/master/python/pyarrow/scalar.pxi#L51
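(A sketch of the two semantics being contrasted, assuming the current pyarrow compute kernels; not code from the thread.)

    import pyarrow as pa
    import pyarrow.compute as pc

    # Plain Python: comparing None with None yields a boolean.
    print(None == None)   # True

    # SQL-style kernels: any comparison involving null yields null, not a boolean.
    print(pc.equal(pa.array([1, None]), pa.array([None, None])))  # [null, null]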

[NIGHTLY] Arrow Build Report for Job nightly-2020-07-01-0

2020-07-01 Thread Crossbow
Arrow Build Report for Job nightly-2020-07-01-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-07-01-0 Failed Tasks: - test-conda-cpp-valgrind: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-07-01-0-github-test-conda-cpp-valgrind

RE: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Mehul Batra
Hey Arrow community, do we have any work going on to produce/consume data from Kafka and process it using Arrow in Python, or any library involved in that? Just like we have a Snowflake connector for Python to read data super fast with the help of Arrow. Thanks, Mehul -Original Message

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Antoine Pitrou
Note: if the snippet below doesn't display right in your e-mail reader, you can read it here: https://gist.github.com/pitrou/6a0ce89ce866bc0c70e33155503d1c47 On 01/07/2020 at 09:46, Antoine Pitrou wrote: > > Hello, > > Recent changes to PyArrow seem to have taken the stance that comparing > n

[Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Antoine Pitrou
Hello, Recent changes to PyArrow seem to have taken the stance that comparing null values should return null. The problem is that this breaks the expectation that comparisons should return booleans, and it percolates into crazy behaviour in other places. Here is an example of such misbehaviour in t
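(A minimal, self-contained illustration of the failure mode described, in plain Python; it is not the exact snippet from the linked gist.)

    # Toy stand-in whose __eq__ returns None ("null") instead of a boolean.
    class NullishEq:
        def __eq__(self, other):
            return None

    x = NullishEq()
    print(x == x)     # None -- not a boolean
    if not (x == x):  # None is falsy, so this branch runs even when comparing x with itself
        print("x appears unequal to itself under truth-testing")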