kevinjqliu opened a new issue, #1217:
URL: https://github.com/apache/datafusion-python/issues/1217

   **Describe the bug**
   
   Found an issue with datafusion 49 when used with pyiceberg and pyiceberg_core (which uses the datafusion Rust crate). [The test `test_datafusion_register_pyiceberg_table`](https://github.com/apache/iceberg-python/blob/19ba343a2e16823e79f0c0359411f8bfd792ea6e/tests/table/test_datafusion.py#L43-L64) fails with datafusion==49 but succeeds with datafusion 48, 47, and 46.
   
   I'll find a better way to reproduce this, but for now here's one way, using the pyiceberg library:
   ```
   make install 
   poetry run pip install datafusion==49
   poetry run pytest tests/table/test_datafusion.py::test_datafusion_register_pyiceberg_table
   ```
   
   For context, there are three libraries involved:
   - datafusion
   - [pyiceberg, which currently uses datafusion==47](https://github.com/apache/iceberg-python/blob/19ba343a2e16823e79f0c0359411f8bfd792ea6e/poetry.lock#L1294-L1309)
   - pyiceberg_core==0.5.1, which pyiceberg pulls in as a dependency for the datafusion TableProvider. [pyiceberg_core==0.5.1 uses datafusion 47](https://github.com/apache/iceberg-rust/blob/80feb07423d8c81b055c03fb577ad0e95a55df62/Cargo.toml#L63)
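   Since the failure depends on which versions of these three libraries end up in the same environment, it may help to record them alongside any reproduction. Below is a minimal sketch using only the standard library; the distribution names (in particular `pyiceberg-core`) and the helper name `installed_versions` are assumptions for illustration, not something taken from the failing test.

   ```python
   from importlib import metadata

   # Distribution names as they are assumed to appear on PyPI.
   PACKAGES = ("datafusion", "pyiceberg", "pyiceberg-core")


   def installed_versions(packages=PACKAGES):
       """Map each distribution name to its installed version, or None if absent."""
       versions = {}
       for name in packages:
           try:
               versions[name] = metadata.version(name)
           except metadata.PackageNotFoundError:
               versions[name] = None
       return versions


   if __name__ == "__main__":
       for name, version in installed_versions().items():
           print(f"{name}: {version or 'not installed'}")
   ```

   Running this inside the poetry environment (for example via `poetry run python`) before and after `pip install datafusion==49` would show which Python-side versions changed. It does not show which datafusion Rust crate version pyiceberg_core was compiled against.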
   
   
   **Additional context**
   
   Here's the stack trace:
   ```
   ➜  iceberg-python git:(main) ✗ poetry run pytest tests/table/test_datafusion.py::test_datafusion_register_pyiceberg_table
   ========================================================= test session starts ==========================================================
   platform darwin -- Python 3.12.11, pytest-7.4.4, pluggy-1.6.0
   rootdir: /Users/kevinliu/repos/iceberg-python
   configfile: pyproject.toml
   plugins: checkdocs-2.13.0, anyio-4.10.0, mock-3.14.1, lazy-fixture-0.6.3, requests-mock-1.12.1
   collected 1 item
   
   tests/table/test_datafusion.py Fatal Python error: Bus error
   
   Thread 0x000000016cf3b000 (most recent call first):
     File 
"/Users/kevinliu/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/thread.py",
 line 90 in _worker
     File 
"/Users/kevinliu/.pyenv/versions/3.12.11/lib/python3.12/threading.py", line 
1012 in run
     File 
"/Users/kevinliu/.pyenv/versions/3.12.11/lib/python3.12/threading.py", line 
1075 in _bootstrap_inner
     File 
"/Users/kevinliu/.pyenv/versions/3.12.11/lib/python3.12/threading.py", line 
1032 in _bootstrap
   
   Thread 0x000000016bf2f000 (most recent call first):
     File 
"/Users/kevinliu/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/thread.py",
 line 90 in _worker
     File 
"/Users/kevinliu/.pyenv/versions/3.12.11/lib/python3.12/threading.py", line 
1012 in run
     File 
"/Users/kevinliu/.pyenv/versions/3.12.11/lib/python3.12/threading.py", line 
1075 in _bootstrap_inner
     File 
"/Users/kevinliu/.pyenv/versions/3.12.11/lib/python3.12/threading.py", line 
1032 in _bootstrap
   
   Current thread 0x00000001ef2620c0 (most recent call first):
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/datafusion/dataframe.py",
 line 1019 in to_arrow_table
     File 
"/Users/kevinliu/repos/iceberg-python/tests/table/test_datafusion.py", line 57 
in test_datafusion_register_pyiceberg_table
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/python.py",
 line 194 in pytest_pyfunc_call
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_callers.py",
 line 121 in _multicall
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_manager.py",
 line 120 in _hookexec
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_hooks.py",
 line 512 in __call__
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/python.py",
 line 1792 in runtest
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/runner.py",
 line 169 in pytest_runtest_call
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_callers.py",
 line 121 in _multicall
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_manager.py",
 line 120 in _hookexec
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_hooks.py",
 line 512 in __call__
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/runner.py",
 line 262 in <lambda>
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/runner.py",
 line 341 in from_call
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/runner.py",
 line 261 in call_runtest_hook
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/runner.py",
 line 222 in call_and_report
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/runner.py",
 line 133 in runtestprotocol
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/runner.py",
 line 114 in pytest_runtest_protocol
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_callers.py",
 line 121 in _multicall
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_manager.py",
 line 120 in _hookexec
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_hooks.py",
 line 512 in __call__
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/main.py",
 line 350 in pytest_runtestloop
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_callers.py",
 line 121 in _multicall
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_manager.py",
 line 120 in _hookexec
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_hooks.py",
 line 512 in __call__
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/main.py",
 line 325 in _main
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/main.py",
 line 271 in wrap_session
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/main.py",
 line 318 in pytest_cmdline_main
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_callers.py",
 line 121 in _multicall
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_manager.py",
 line 120 in _hookexec
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pluggy/_hooks.py",
 line 512 in __call__
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/config/__init__.py",
 line 169 in main
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/_pytest/config/__init__.py",
 line 192 in console_main
     File 
"/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/bin/pytest",
 line 8 in <module>
   
   Extension modules: zstandard.backend_c, charset_normalizer.md, 
requests.packages.charset_normalizer.md, requests.packages.chardet.md, 
yaml._yaml, markupsafe._speedups, mmh3, numpy._core._multiarray_umath, 
numpy.linalg._umath_linalg, pyarrow.lib, sqlalchemy.cyextension.collections, 
sqlalchemy.cyextension.immutabledict, sqlalchemy.cyextension.processors, 
sqlalchemy.cyextension.resultproxy, sqlalchemy.cyextension.util, 
numpy.random._common, numpy.random.bit_generator, 
numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, 
numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, 
numpy.random._generator, pandas._libs.tslibs.ccalendar, 
pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, 
pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, 
pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, 
pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, 
pandas._libs.tslibs.timestamps, pandas._libs.properties, 
pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, 
pandas._libs.tslibs.parsing, 
pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, 
pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, 
pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, 
pandas._libs.interval, pandas._libs.lib, pyarrow._compute, pandas._libs.ops, 
pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, 
pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, 
pandas._libs.index, pandas._libs.writers, pandas._libs.join, 
pandas._libs.window.aggregations, pandas._libs.window.indexers, 
pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, 
pandas._libs.parsers, pandas._libs.testing, pyarrow._acero, pyarrow._fs, 
pyarrow._csv, pyarrow._json, pyarrow._substrait, pyarrow._dataset, 
pyarrow._dataset_orc, pyarrow._parquet, pyarrow._parquet_encryption, 
pyarrow._dataset_parquet_encryption, pyarrow._dataset_parquet, 
pyarrow._azurefs, pyarrow._hdfs, pyarrow._gcsfs, pyarrow._s3fs, pyroaring 
(total: 81)
   [1]    95409 bus error  poetry run pytest 
   ```

