I think we should allow multiple PRs for any issue in order to reduce friction for contributors.

I remembered this thread while looking at the issue for Decimal32/64 support [1] which I think is a good example of where filing separate issues for every patch doesn't add value and adds friction. Some patches associated with [1] were missed in the automatic changelog generation for Arrow 19.0.0.

Thanks for bringing this discussion up.

[1] https://github.com/apache/arrow/issues/43956

On Wed, Sep 11, 2024 at 8:32 AM Joris Van den Bossche <jorisvandenboss...@gmail.com> wrote:
>
> Hi all,
>
> This is a discussion specifically for the GitHub development workflow
> we use in the monorepo, i.e. https://github.com/apache/arrow/
>
> We have the unwritten(?) (but implicitly implied by our tooling) rule
> that we always should have one issue for one PR to close that issue.
> I would like to discuss expanding that to explicitly allow making
> multiple PRs that link to the same issue.
>
> For clarity, I don't want to discuss the usefulness of actually having
> an issue linked to a PR (we could discuss expanding the scope of our
> "minor" PRs, but that's for a separate discussion I would say).
> But in practice, you regularly want to split up the work related to
> the same topic into multiple PRs (to have smaller PRs, to ease
> reviewing, etc). At the moment, to follow our workflow, that requires
> creating a bunch of dummy child issues just to have a unique issue
> number to reference in each PR. While in practice they could all
> reference the same issue number. This keeps the relevant information
> more centralized in that one issue, and avoids the noise of a flood of
> dummy issues in our issue list.
>
> Practical example: currently I am planning to work on adding type
> annotations to the pyarrow library. I will probably split up that work
> in a PR per module, but they can all reference a single parent issue
> instead of also creating an issue about "adding type annotations in
> module xxx" for each PR.
>
> ---
>
> I think this is perfectly possible with our current tooling, if we
> want, with the following notes:
>
> - The current merge script will ask you to update (i.e. close) the
> issue, and at that point if you know this is a parent issue you should
> say "no" (or afterwards reopen the issue).
> (we could also discuss whether we actually need this merge script, but
> let's keep that for another thread? ;))
>
> - The release notes generation currently relies on listing issues, and
> not PRs. That means if you want the issue listed, it should be closed
> (and tagged with that milestone) by the time of the release (if it is
> ongoing work, you can at that point create a new issue for all PRs
> going into the next release).
>
> - If a PR needs to be backported, that also depends on its connection
> to and the milestone of the issue. Thus, for PRs that need to be
> backported, you should always open a unique issue and it should not
> reference an issue tracking multiple PRs.
>
> Thoughts? Concerns with allowing this?
>
> Best,
> Joris