Hi Joris,

Thanks for investigating this. It seems there were some unintended
consequences of the zero-copy optimizations from ARROW-3789. Another
way forward might be to make this behavior opt-in, or to only do the
zero-copy optimizations when split_blocks=True. What do you think?
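
For reference, here is a minimal sketch of the failure mode and of what
"opt in" could look like. It uses plain NumPy rather than pyarrow, so it
only mirrors the mechanism: the bytes object stands in for an Arrow
buffer, and the to_array helper with its zero_copy flag is hypothetical,
not an existing pyarrow or pandas API.

```python
import numpy as np

# Immutable bytes stand in for an Arrow buffer (an assumption for illustration).
raw = np.arange(4, dtype=np.int64).tobytes()

# Zero-copy view: it shares memory with `raw`, so NumPy marks it read-only.
view = np.frombuffer(raw, dtype=np.int64)

error_message = None
try:
    view[0] = 99  # same kind of in-place write that pandas attempts in Block.set
except ValueError as exc:
    error_message = str(exc)

# Copying yields an array that owns its memory and is writable.
owned = view.copy()
owned[0] = 99


def to_array(buf, zero_copy=False):
    """Hypothetical opt-in: copy by default, share memory only on request."""
    arr = np.frombuffer(buf, dtype=np.int64)
    return arr if zero_copy else arr.copy()


print(error_message)
print(to_array(raw).flags.writeable, to_array(raw, zero_copy=True).flags.writeable)
```

With copying as the default, code that writes columns back in place (as
pyspark does) keeps working, and callers who only read can ask for the
shared-memory result explicitly.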

- Wes

On Thu, Jan 16, 2020 at 3:42 AM Joris Van den Bossche
<jorisvandenboss...@gmail.com> wrote:
>
> The Spark integration build started to fail with the following test
> error:
>
> ======================================================================
> ERROR: test_toPandas_batch_order (pyspark.sql.tests.test_arrow.EncryptionArrowTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/spark/python/pyspark/sql/tests/test_arrow.py", line 422, in test_toPandas_batch_order
>     run_test(*case)
>   File "/spark/python/pyspark/sql/tests/test_arrow.py", line 409, in run_test
>     pdf, pdf_arrow = self._toPandas_arrow_toggle(df)
>   File "/spark/python/pyspark/sql/tests/test_arrow.py", line 152, in _toPandas_arrow_toggle
>     pdf_arrow = df.toPandas()
>   File "/spark/python/pyspark/sql/pandas/conversion.py", line 115, in toPandas
>     return _check_dataframe_localize_timestamps(pdf, timezone)
>   File "/spark/python/pyspark/sql/pandas/types.py", line 180, in _check_dataframe_localize_timestamps
>     pdf[column] = _check_series_localize_timestamps(series, timezone)
>   File "/opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/frame.py", line 3487, in __setitem__
>     self._set_item(key, value)
>   File "/opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/frame.py", line 3565, in _set_item
>     NDFrame._set_item(self, key, value)
>   File "/opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/generic.py", line 3381, in _set_item
>     self._data.set(key, value)
>   File "/opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 1090, in set
>     blk.set(blk_locs, value_getitem(val_locs))
>   File "/opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/internals/blocks.py", line 380, in set
>     self.values[locs] = values
> ValueError: assignment destination is read-only
>
>
> It comes from a test that converts from Spark to Arrow to pandas
> (calling pyarrow.Table.to_pandas here
> <https://github.com/apache/spark/blob/018bdcc53c925072b07956de0600452ad255b9c7/python/pyspark/sql/pandas/conversion.py#L111-L115>),
> and then iterates through all columns of the resulting DataFrame,
> potentially fixing timezones, and writes each column back into the
> DataFrame (here
> <https://github.com/apache/spark/blob/018bdcc53c925072b07956de0600452ad255b9c7/python/pyspark/sql/pandas/types.py#L179-L181>
> ).
>
> Since the error is about a read-only destination, it might be related to
> the zero-copy behaviour of to_pandas, and thus to the refactor
> of the arrow->pandas conversion that landed yesterday
> (https://github.com/apache/arrow/pull/6067, which says it changed to do
> zero-copy for 1-column blocks if possible).
> I am not sure whether something should be fixed in pyarrow for this, but
> the obvious thing pyspark can do is specify that it doesn't want zero-copy.
>
> Joris
>
> On Wed, 15 Jan 2020 at 14:32, Crossbow <cross...@ursalabs.org> wrote:
>
> >
> > Arrow Build Report for Job nightly-2020-01-15-0
> >
> > All tasks:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0
> >
> > Failed Tasks:
> > - gandiva-jar-osx:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-travis-gandiva-jar-osx
> > - test-conda-python-3.7-spark-master:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.7-spark-master
> > - wheel-manylinux2014-cp35m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2014-cp35m
> >
> > Succeeded Tasks:
> > - centos-6:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-centos-6
> > - centos-7:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-centos-7
> > - centos-8:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-centos-8
> > - conda-linux-gcc-py27:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-linux-gcc-py27
> > - conda-linux-gcc-py36:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-linux-gcc-py36
> > - conda-linux-gcc-py37:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-linux-gcc-py37
> > - conda-linux-gcc-py38:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-linux-gcc-py38
> > - conda-osx-clang-py27:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-osx-clang-py27
> > - conda-osx-clang-py36:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-osx-clang-py36
> > - conda-osx-clang-py37:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-osx-clang-py37
> > - conda-osx-clang-py38:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-osx-clang-py38
> > - conda-win-vs2015-py36:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-win-vs2015-py36
> > - conda-win-vs2015-py37:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-win-vs2015-py37
> > - conda-win-vs2015-py38:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-conda-win-vs2015-py38
> > - debian-buster:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-debian-buster
> > - debian-stretch:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-debian-stretch
> > - gandiva-jar-trusty:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-travis-gandiva-jar-trusty
> > - homebrew-cpp:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-travis-homebrew-cpp
> > - macos-r-autobrew:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-travis-macos-r-autobrew
> > - test-conda-cpp:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-cpp
> > - test-conda-python-2.7-pandas-latest:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-2.7-pandas-latest
> > - test-conda-python-2.7:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-2.7
> > - test-conda-python-3.6:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.6
> > - test-conda-python-3.7-dask-latest:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.7-dask-latest
> > - test-conda-python-3.7-hdfs-2.9.2:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.7-hdfs-2.9.2
> > - test-conda-python-3.7-pandas-latest:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.7-pandas-latest
> > - test-conda-python-3.7-pandas-master:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.7-pandas-master
> > - test-conda-python-3.7-turbodbc-latest:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.7-turbodbc-latest
> > - test-conda-python-3.7-turbodbc-master:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.7-turbodbc-master
> > - test-conda-python-3.7:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.7
> > - test-conda-python-3.8-dask-master:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.8-dask-master
> > - test-conda-python-3.8-pandas-latest:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-python-3.8-pandas-latest
> > - test-conda-r-3.6:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-conda-r-3.6
> > - test-debian-10-cpp:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-debian-10-cpp
> > - test-debian-10-go-1.12:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-debian-10-go-1.12
> > - test-debian-10-python-3:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-debian-10-python-3
> > - test-debian-c-glib:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-debian-c-glib
> > - test-debian-ruby:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-debian-ruby
> > - test-fedora-29-cpp:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-fedora-29-cpp
> > - test-fedora-29-python-3:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-fedora-29-python-3
> > - test-r-rhub-debian-gcc-devel:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-test-r-rhub-debian-gcc-devel
> > - test-r-rhub-ubuntu-gcc-release:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-test-r-rhub-ubuntu-gcc-release
> > - test-r-rstudio-r-base-3.6-bionic:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-test-r-rstudio-r-base-3.6-bionic
> > - test-r-rstudio-r-base-3.6-centos6:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-test-r-rstudio-r-base-3.6-centos6
> > - test-r-rstudio-r-base-3.6-opensuse15:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-test-r-rstudio-r-base-3.6-opensuse15
> > - test-r-rstudio-r-base-3.6-opensuse42:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-test-r-rstudio-r-base-3.6-opensuse42
> > - test-ubuntu-16.04-cpp:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-16.04-cpp
> > - test-ubuntu-18.04-cpp-cmake32:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-18.04-cpp-cmake32
> > - test-ubuntu-18.04-cpp-release:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-18.04-cpp-release
> > - test-ubuntu-18.04-cpp-static:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-18.04-cpp-static
> > - test-ubuntu-18.04-cpp:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-18.04-cpp
> > - test-ubuntu-18.04-docs:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-18.04-docs
> > - test-ubuntu-18.04-python-3:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-18.04-python-3
> > - test-ubuntu-18.04-r-sanitizer:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-18.04-r-sanitizer
> > - test-ubuntu-c-glib:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-c-glib
> > - test-ubuntu-fuzzit-fuzzing:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-fuzzit-fuzzing
> > - test-ubuntu-fuzzit-regression:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-fuzzit-regression
> > - test-ubuntu-ruby:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-circle-test-ubuntu-ruby
> > - ubuntu-bionic:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-ubuntu-bionic
> > - ubuntu-disco:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-ubuntu-disco
> > - ubuntu-xenial:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-ubuntu-xenial
> > - wheel-manylinux1-cp27m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux1-cp27m
> > - wheel-manylinux1-cp27mu:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux1-cp27mu
> > - wheel-manylinux1-cp35m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux1-cp35m
> > - wheel-manylinux1-cp36m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux1-cp36m
> > - wheel-manylinux1-cp37m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux1-cp37m
> > - wheel-manylinux1-cp38:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux1-cp38
> > - wheel-manylinux2010-cp27m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2010-cp27m
> > - wheel-manylinux2010-cp27mu:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2010-cp27mu
> > - wheel-manylinux2010-cp35m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2010-cp35m
> > - wheel-manylinux2010-cp36m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2010-cp36m
> > - wheel-manylinux2010-cp37m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2010-cp37m
> > - wheel-manylinux2010-cp38:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2010-cp38
> > - wheel-manylinux2014-cp36m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2014-cp36m
> > - wheel-manylinux2014-cp37m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2014-cp37m
> > - wheel-manylinux2014-cp38:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-azure-wheel-manylinux2014-cp38
> > - wheel-osx-cp27m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-travis-wheel-osx-cp27m
> > - wheel-osx-cp35m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-travis-wheel-osx-cp35m
> > - wheel-osx-cp36m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-travis-wheel-osx-cp36m
> > - wheel-osx-cp37m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-travis-wheel-osx-cp37m
> > - wheel-osx-cp38:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-travis-wheel-osx-cp38
> > - wheel-win-cp36m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-appveyor-wheel-win-cp36m
> > - wheel-win-cp37m:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-appveyor-wheel-win-cp37m
> > - wheel-win-cp38:
> >   URL:
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-15-0-appveyor-wheel-win-cp38
> >
