I filed an issue in the datafusion repo as well, since not everyone is on the mailing list.
https://github.com/apache/arrow-datafusion/issues/5601

On Tue, Mar 14, 2023 at 9:36 AM Andy Grove <andygrov...@gmail.com> wrote:

> We have been using github-changelog-generator [1] to generate changelogs
> for the Rust projects for some time now. It has served us well but is no
> longer workable, at least for DataFusion. This tool seems to pull down the
> entire project history using the GitHub API and we had to artificially slow
> it down to avoid hitting API rate limits, and it is now unusable due to the
> number of issues and PRs in this repo.
>
> This weekend, I built a much simpler changelog generator in Python [2],
> that I am now using for the projects that I am the release manager for
> (datafusion, datafusion-python, ballista). It has almost the same
> functionality that we were getting from the previous generator, but takes
> less than a minute to run, compared to 30+ minutes for the old generator.
> It only hits the GitHub API for information about commits and pull requests
> in the release being documented, rather than accessing the entire project
> history.
>
> I followed the same approach of using GitHub labels to categorize PRs
> (enhancements, bug fixes, docs, etc) but this requires a small amount of
> manual effort to add those labels and re-generate the changelog.
>
> I noticed that some contributors are already prefixing PR titles with
> "feat:", "feature:", "fix:", "docs:", etc. I plan on updating the changelog
> generator to recognize these prefixes as well, to help automate my job.
>
> I wonder if it is worth formalizing these "semantic titles" more, and
> maybe having CI enforce them. It would improve the quality of our
> changelogs and reduce the burden on release managers.
>
> I would appreciate any feedback on this idea.
>
> Thanks,
>
> Andy.
>
>
> [1] https://github.com/github-changelog-generator/github-changelog-generator
> [2] https://github.com/andygrove/changelog-genie
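
To make the "semantic titles" idea in the quoted message a bit more concrete, here is a minimal sketch of how a PR title prefix could be mapped to a changelog section. This is not taken from changelog-genie: the function names, the section names, and the prefix-to-section mapping are all hypothetical, and the only prefixes handled are the four mentioned above.

```python
import re

# Hypothetical mapping from PR title prefix to changelog section.
# The section names are illustrative, not those of any existing generator.
PREFIX_SECTIONS = {
    "feat": "Implemented enhancements",
    "feature": "Implemented enhancements",
    "fix": "Fixed bugs",
    "docs": "Documentation updates",
}

# A leading word followed by a colon, e.g. "fix: handle empty batches".
_TITLE_RE = re.compile(r"^\s*([A-Za-z]+)\s*:\s*(.+)$")


def categorize(title, default="Other"):
    """Return (section, title_without_prefix) for a PR title.

    Titles without a recognized prefix fall into `default` unchanged.
    """
    match = _TITLE_RE.match(title)
    if match:
        section = PREFIX_SECTIONS.get(match.group(1).lower())
        if section is not None:
            return section, match.group(2).strip()
    return default, title.strip()


def is_semantic_title(title):
    """Check a CI job could run: True if the title has a recognized prefix."""
    match = _TITLE_RE.match(title)
    return bool(match) and match.group(1).lower() in PREFIX_SECTIONS
```

A CI check along these lines would only need the PR title as input, so it would not add to the number of GitHub API calls made when generating the changelog. Whether unrecognized prefixes should fail the check or just fall back to label-based categorization is an open question.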