After working through 3 pages of the oldest open issues, I recommend that these 14 be considered for closure.

Because they are obsolete or have been resolved:
https://github.com/apache/datafusion/issues/146 (Add optional rust features for functions in library to keep dependencies down)
https://github.com/apache/datafusion/issues/693 (Using string datatype from python raises "Exception: The type 13 is not valid")
https://github.com/apache/datafusion/issues/382 (Move filter_push_down::split_members to be reused outside of DataFusion)
https://github.com/apache/datafusion/issues/1060 (Add support of HDFS as remote object store)
https://github.com/apache/datafusion/issues/1007 ([Python]: Expose Dataframe.schema() to Python binding)
https://github.com/apache/datafusion/issues/955 (Evaluate pyo3 abi3 wheel limitations for datafusion python binding)
https://github.com/apache/datafusion/issues/949 (Creating dataframe with Recordbatch using pyarrow.Table.to_batches gives "type16 not valid error" when schema includes date32[day] type)
https://github.com/apache/datafusion/issues/839 (Refactor the hash_aggregate)

Because they relate to arrow-rs, datafusion-python, or Ballista, which are now external git repos:
https://github.com/apache/datafusion/issues/111 (from JIRA: Tracking issue for big endian platforms)
https://github.com/apache/datafusion/issues/196 (from JIRA: Review the contract between DataFusion and Arrow)
https://github.com/apache/datafusion/issues/491 ([Python] Support pathlib.Path arguments for ExecutionContext.register_* methods)
https://github.com/apache/datafusion/issues/472 ([Ballista] Improve task and job metadata)

Should be closed due to existing comments:
https://github.com/apache/datafusion/issues/179 (Move SortExec partition check to constructor)

Should be closed because answers to questions were not provided, or there is no reproducer:
https://github.com/apache/datafusion/issues/563 (switch to using AsyncBencher for datafusion benches?)

Regards,
David