It is time again for our monthly ASF board report.

This month marks the last of the monthly reports required of new top level
projects, so this activity will happen less frequently after this month.

Please provide your comments on the ticket[1], the Google doc[2], or by
replying to this email.

I plan to submit this to the board on July 10.

Thanks,
Andrew

[1]: https://github.com/apache/datafusion/issues/10282
[2]: https://docs.google.com/document/d/1lV-cFZGHCSrTiaLW1gyEMDKW-9nf47UW8xK19QCqbVk/edit



----


## Description:
The mission of Apache DataFusion is the creation and maintenance of software
related to an extensible query engine.

## Project Status:
Current project status: New + Ongoing (high activity)
Issues for the board: None


## Membership Data:
Apache DataFusion was founded 2024-04-16 (3 months ago)
There are currently 33 committers and 13 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:4.

Community changes, past month:
- Mehmet Ozan Kabak was added to the PMC on 2024-06-12
- Ruihang Xia was added to the PMC on 2024-06-12
- Lewis Zhang was added as committer on 2024-06-14



## Project Activity:

The project continues to be quite active with many PRs and issues opened and
closed per day.

We started working on a project blog [1] (previously we used the arrow
blog) and hope to have our first blog post as an independent project later
this month.

There was a well-attended face-to-face meetup in San Francisco, CA, USA in
June [2]. We have one planned for Hangzhou, China in July [3]. There appears
to be significant interest in these meetups, and more are planned.

[1]: https://datafusion.apache.org/blog/
[2]: https://github.com/apache/datafusion/discussions/10800
[3]: https://github.com/apache/datafusion/discussions/10341#discussioncomment-9738748

### DataFusion core
https://github.com/apache/datafusion

We released version 39.0.0, continuing our schedule of monthly releases.

Recently we have been working on more flexible use of Parquet files,
including indexing and extracting statistics. We are also working with the
community to make extending SQL planning[2] and file format support[3]
easier, fixing bugs found with a SQL fuzzer[4], and improving performance
with StringView[5].

It has been nice to see several good examples of cross contributor/company
collaboration such as [6] and [7].

We have also been giving external presentations[1].

[1]: https://github.com/apache/datafusion/issues/10969
[2]: https://github.com/apache/datafusion/issues/10534
[3]: https://github.com/apache/datafusion/pull/11060
[4]: https://github.com/apache/datafusion/issues/11030
[5]: https://github.com/apache/datafusion/issues/10918
[6]: https://github.com/apache/datafusion/pull/11203
[7]: https://github.com/apache/datafusion/issues/10534

### Sub project: DataFusion Python
https://github.com/apache/datafusion-python

TODO


### Sub project: DataFusion Comet
https://github.com/apache/datafusion-comet

TODO


### Sub project: DataFusion Ballista
https://github.com/apache/datafusion-ballista
https://github.com/apache/datafusion-ballista-python

The Ballista subproject is not currently actively maintained.

### Recent Releases
* PYTHON-38.0.1 was released on 2024-05-30.
* PYTHON-37.1.0 was released on 2024-05-13.
* 38.0.0 was released on 2024-05-10.

## Community Health:

Most of our communications still happen through GitHub.

We are also discussing [1] adopting a library that makes it easier to write
UDF code.

[1]: https://github.com/apache/datafusion/discussions/11192

TODO update

* dev@datafusion.apache.org had a big increase in traffic in the past
  quarter (71 emails compared to 0)
* git...@datafusion.apache.org had a big increase in traffic in the past
  quarter (7685 emails compared to 0)
