Hi, PyArrow throws an exception when reading a Parquet file generated by version 2.0 of the Parquet writer in Hive:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/pyarrow/parquet.py", line 1732, in read_table
    use_pandas_metadata=use_pandas_metadata)
  File "/usr/local/lib/python3.7/site-packages/pyarrow/parquet.py", line 1610, in read
    use_threads=use_threads
  File "pyarrow/_dataset.pyx", line 458, in pyarrow._dataset.Dataset.to_table
  File "pyarrow/_dataset.pyx", line 2889, in pyarrow._dataset.Scanner.to_table
  File "pyarrow/error.pxi", line 141, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 112, in pyarrow.lib.check_status
OSError: Not yet implemented: Unsupported encoding.

I notice that several encodings are listed as unsupported in:
https://arrow.apache.org/docs/cpp/parquet.html#encodings

Is there any plan to support these encodings in the near future? If not, I would like to try implementing them myself. Any advice would be appreciated!

Best regards,
Shan Huang