I'm with Chesnay on this issue.
Stale PRs, i.e., a PR where a contributor becomes inactive, are one of our
smallest issues, IMO.

There are more reasons for the high number of PRs.
* Lack of timely reviews.
* Not eagerly closing PRs that have no or very little chance of being
merged. Common reasons are
  1) lack of interest in the feature by committers,
  2) too extensive changes and hence time consuming reviews, or
  3) bad quality.

For 1), there are lots of older JIRA issues, that have low priority but
which are picked up by contributors. In the contribution guidelines we ask
contributors to let us know when they want to work on an issue. So far our
attitude has been, yes sure go ahead. I've seen very little attempts of
warning somebody to work on issues that won't be easily merged.
Once a PR has been opened, we should also be honest and let contributors
know that it has no chance or might take a while to get reviewed.
For 2) this is typically not so much of an issue if the feature is
interesting. However, if 1) and 2) meet, chances to get a change in drop
even more.

A common "strategy" to deal with PRs that fall into 1), 2), or 3) is to not
look at them or giving a shallow review.
Of course, contributors become unresponsive if we don't look at their PRs
for weeks or months. But that's not their fault.
Instead, I think we should be honest and communicate the chances of a PR
more clearly.

Browsing over the list of open PRs, I feel that most older PRs fall into
the category of low-priority (or even unwanted) features.
Bug fixes or features that the committers care about usually make it into
the code base.
In case a contributor becomes inactive, committers often take over an push
a contribution over the line.

So, I do not support an auto-close mechanism.
I think a better way to address the issue is better communication, more
eagerly closing PRs with no chance, and putting more reviewing effort.
IMO, we should only eagerly close PRs that have no chance of being merged.
PRs with low-prio features might be picked up later (for Flink 1.5, I
merged a contribution from PR #1990 after it was requested a few times by
users).

However, I think we could do a pass over the oldest PRs and check if we can
close a bunch.
There are quite a few contributions (many for flink-ml) that I don't see a
chance for getting merged.

Best, Fabian


-

2018-05-15 9:13 GMT+02:00 Chesnay Schepler <ches...@apache.org>:

> -1
>
> For clarification (since the original mail indicates otherwise), the
> number of pull requests that this would affect is fairly small.
> Only about 25-30% of all open PRs are blocked on the contributor, the
> remaining ones are actually blocked on the review.
> Thus is reject the premise that one has to search through that many PRs to
> find something to review, there is plenty.
>
> I believe it to be highly unfair for us to close PRs due to inactivity,
> when the primary cause has been /our /own inactivity.
> If a PR is opened and the first comment comes in 3 months late, then I
> don't blame the contributor for not responding.
> IMO we owe it to the contributor to evaluate their PR, and if necessary
> bring it to a merge-able state (to a certain extend).
>
> There's also the fact that closing these PRs outright would waste a lot of
> good contributions.
>
> Finally, this solution is overkill for what we want to achieve. If we want
> to make it easier to find PRs to review all we need is
> GitBox integration and tagging or PRs. That's it. We could have a /fully
> /tagged PR list /tomorrow/, if we really wanted to.
>
>
> On 15.05.2018 05:10, Ted Yu wrote:
>
>> bq. this pull request requires a review, please simply write any comment.
>>
>> Shouldn't the wording of such comment be known before hand ?
>>
>> Otherwise pull request waiting for committers' review may be
>> mis-classified.
>>
>> Cheers
>>
>> On Mon, May 14, 2018 at 7:59 PM, blues zheng <kisim...@163.com> wrote:
>>
>> +1 for the proposal.
>>>
>>>
>>> Best,
>>> blues
>>> On 05/14/2018 20:58, Ufuk Celebi wrote:
>>> Hey Piotr,
>>>
>>> thanks for bringing this up. I really like this proposal and also saw
>>> it work successfully at other projects. So +1 from my side.
>>>
>>> - I like the approach with a notification one week before
>>> automatically closing the PR
>>> - I think a bot will the best option as these kinds of things are
>>> usually followed enthusiastically in the beginning but eventually
>>> loose traction
>>>
>>> We can enable better integration with GitHub by using ASF GitBox
>>> (https://gitbox.apache.org/setup/) but we should discuss that in a
>>> separate thread.
>>>
>>> – Ufuk
>>>
>>> On Mon, May 14, 2018 at 12:04 PM, Piotr Nowojski
>>> <pi...@data-artisans.com> wrote:
>>>
>>>> Hey,
>>>>
>>>> We have lots of open pull requests and quite some of them are
>>>>
>>> stale/abandoned/inactive. Often such old PRs are impossible to merge due
>>> to
>>> conflicts and it’s easier to just abandon and rewrite them. Especially
>>> there are some PRs which original contributor created long time ago,
>>> someone else wrote some comments/review and… that’s about it. Original
>>> contributor never shown up again to respond to the comments. Regardless
>>> of
>>> the reason such PRs are clogging the GitHub, making it difficult to keep
>>> track of things and making it almost impossible to find a little bit old
>>> (for example 3+ months) PRs that are still valid and waiting for reviews.
>>> To do something like that, one would have to dig through tens or hundreds
>>> of abandoned PRs.
>>>
>>>> What I would like to propose is to agree on some inactivity dead line,
>>>>
>>> lets say 3 months. After crossing such deadline, PRs should be
>>> marked/commented as “stale”, with information like:
>>>
>>>> “This pull request has been marked as stale due to 3 months of
>>>>
>>> inactivity. It will be closed in 1 week if no further activity occurs. If
>>> you think that’s incorrect or this pull request requires a review, please
>>> simply write any comment.”
>>>
>>>> Either we could just agree on such policy and enforce it manually (maybe
>>>>
>>> with some simple tooling, like a simple script to list inactive PRs -
>>> seems
>>> like couple of lines in python by using PyGithub) or we could think about
>>> automating this action. There are some bots that do exactly this (like
>>> this
>>> one: https://github.com/probot/stale <https://github.com/probot/stale>
>>> ),
>>> but probably they would need to be adopted to limitations of our Apache
>>> repository (we can not add labels and we can not close the PRs via
>>> GitHub).
>>>
>>>> What do you think about it?
>>>>
>>>> Piotrek
>>>>
>>>
>
>

Reply via email to