It's time again to provide another board report. As a reminder, the primary reason for these reports is for the board to make sure the project is ok. The actual project details are more likely helpful for other members of the project than the board.
With that being said, please feel free to provide your comments on the ticket[1], google doc[2] or reply to this email and I will incorporate them.

I plan to submit this to the board on December 11, 2024

Thanks,
Andrew

[1]: https://github.com/apache/datafusion/issues/10157
[2]: https://docs.google.com/document/d/1b_C8uwMJVSrw9N1Oc8_fzFdpT0YExaRiuXJ8ulAXaYs

------

## Description:
The mission of Apache DataFusion is the creation and maintenance of software
related to an extensible query engine.

## Project Status:
Current project status: New + Ongoing (high activity)
Issues for the board: None

## Membership Data:
Apache DataFusion was founded 2024-04-16 (8 months ago)
There are currently 40 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:2.

Community changes, past quarter:
- No new PMC members. Last addition was Jay Zhan on 2024-08-11.
- Jax Liu was added as committer on 2024-10-18
- Ifeanyi Ubah was added as committer on 2024-11-04
- Michael Ward was added as committer on 2024-09-13
- Tim Saucer was added as committer on 2024-09-07

## Project Activity:

We have completed adopting [sqlparser crate] into the project and made our first release.

[sqlparser crate]: https://github.com/apache/datafusion-sqlparser-rs

### DataFusion core
https://github.com/apache/datafusion

We continue the monthly release cadence. The [42.0.0 release] and [43.0.0 release] had 73 and 96 unique contributors, respectively. We continue to [discuss the roadmap] in the open, and have gathered a collection of [DataFusion related articles] onto our page.

We recently finished [significant performance improvements] as well as long-standing projects to migrate documentation to code and to use the same API for all user-defined window functions. We also added FFI bindings to make it easier to use multiple versions of DataFusion.
As more people build systems using DataFusion, we will likely begin to focus more on keeping the core stable, as it is [sometimes painful] to update to new DataFusion versions.

[42.0.0 release]: https://github.com/apache/datafusion/blob/main/dev/changelog/42.0.0.md
[43.0.0 release]: https://github.com/apache/datafusion/blob/main/dev/changelog/43.0.0.md
[roadmap ticket]: https://github.com/apache/datafusion/issues/11442
[discuss the roadmap]: https://github.com/apache/datafusion/issues/13274
[DataFusion related articles]: https://datafusion.apache.org/user-guide/concepts-readings-events.html
[significant performance improvements]: https://datafusion.apache.org/blog/2024/11/18/datafusion-fastest-single-node-parquet-clickbench/
[sometimes painful]: https://github.com/apache/datafusion/issues/13525

### Sub project: DataFusion Python
https://github.com/apache/datafusion-python

TODO UPDATE

Sep 2024 Update:
The DataFusion Python project has recently received significant contributions that make the project more "Pythonic", and it now has regular activity from maintainers. Tim Saucer, who focuses more heavily on datafusion-python, has been added as a committer.

### Sub project: DataFusion Comet
https://github.com/apache/datafusion-comet

TODO UPDATE

Sep 2024 Update:
The Comet project is very active and recently released its initial 0.1.0 source release.
Blog post: https://datafusion.apache.org/blog/2024/07/20/datafusion-comet-0.1.0/

### Sub project: DataFusion Ballista

TODO UPDATE

Sep 2024 Update:
https://github.com/apache/datafusion-ballista
https://github.com/apache/datafusion-ballista-python

The Ballista subproject is not very actively maintained, but there have been some recent contributions to upgrade it to more recent versions of the core DataFusion project.

### Sub project: Sqlparser
https://github.com/apache/datafusion-sqlparser-rs

The sqlparser project became part of DataFusion this quarter.
In addition to ongoing additions of SQL dialect support, we made our first release as part of the Apache DataFusion project, and have started introducing spans (source locations), a long requested feature.

### Recent Releases

* COMET-0.4.0 was released on 2024-11-18.
* SQLPARSER-0.52.0 was released on 2024-11-11.
* 43.0.0 was released on 2024-11-08.
* 42.2.0 was released on 2024-11-04.
* 42.1.0 was released on 2024-10-20.
* PYTHON-42.0.0 was released on 2024-10-11.
* COMET-0.3.0 was released on 2024-09-28.
* 42.0.0 was released on 2024-09-17.

## Community Health:

It is still hard to keep track of everything going on, which is a good thing.

While it is always a struggle to get enough code review capacity, the committers keep things going and the community helps each other out with reviews. We continue to actively grow our committer and PMC ranks.

We have upcoming meetups scheduled for Chicago, Boston, and Amsterdam.