Wes,

We use Arrow C++ (not PyArrow) exclusively for writing, and PySpark for manipulation and analysis. I'm wondering if there are any plans for Arrow C++ to implement something similar to flavor='spark' in PyArrow.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, October 29, 2019 8:47 AM, Wes McKinney <wesmck...@gmail.com> wrote:

> hi Isaac -- you are more than welcome to submit a PR to cause unsigned
> types to be written as signed integers when using flavor='spark' from
> pyarrow. The simplest thing would be to do the casting of unsigned
> types to signed prior to writing the Parquet file
>
> - Wes
>
> On Tue, Oct 29, 2019 at 9:09 AM Isaac Myers
> isaacmy...@protonmail.com.invalid wrote:
> >
> > Fields with unsigned types written with Arrow C++ can't be read by PySpark,
> > due to Spark's lack of support for unsigned types (per
> > https://issues.apache.org/jira/browse/SPARK-10113). There is already an
> > issue to address the same problem when writing tables with unsigned fields
> > using PyArrow (https://issues.apache.org/jira/browse/ARROW-1988). Are there
> > any plans to address this issue for Arrow C++?
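
For illustration, here is a rough sketch of the "cast unsigned to signed before writing" idea from the quoted reply. It uses only the C++ standard library, not the Arrow API. The helper names (SignedTypeFor, CastUInt32Column, CastUInt64Column) and the widening policy (uint8 to int16, uint16 to int32, uint32 to int64, uint64 to int64 with an overflow check) are assumptions made up for this example; they are not what PyArrow's flavor='spark' is known to do. With Arrow C++ the equivalent step would presumably be a per-column cast applied to the table before handing it to the Parquet writer.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Arrow): name of the signed type an
// unsigned column would be written as under the assumed widening policy.
inline std::string SignedTypeFor(const std::string& type_name) {
  if (type_name == "uint8") return "int16";
  if (type_name == "uint16") return "int32";
  if (type_name == "uint32") return "int64";
  if (type_name == "uint64") return "int64";  // no wider signed type exists
  return type_name;  // anything else is left unchanged
}

// Widening uint32 -> int64 always preserves the value.
inline std::vector<int64_t> CastUInt32Column(const std::vector<uint32_t>& in) {
  return std::vector<int64_t>(in.begin(), in.end());
}

// uint64 -> int64 cannot be widened, so values above INT64_MAX are
// rejected here instead of being silently wrapped to negative numbers.
inline std::vector<int64_t> CastUInt64Column(const std::vector<uint64_t>& in) {
  const uint64_t max_signed =
      static_cast<uint64_t>(std::numeric_limits<int64_t>::max());
  std::vector<int64_t> out;
  out.reserve(in.size());
  for (uint64_t v : in) {
    if (v > max_signed) {
      throw std::overflow_error("uint64 value does not fit in int64");
    }
    out.push_back(static_cast<int64_t>(v));
  }
  return out;
}
```

Throwing on uint64 overflow is only one option; whether to fail, wrap, or write such a column some other way is a design decision this sketch does not settle.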