I like Till's proposal to notify the participants on the PR to PTAL. But I would also suggest to auto-close when no action is taken, with a friendly reminder that PRs can be reopened anytime.
The current situation with 350 open PRs may send a signal to contributors that it may actually be too much hassle to get a change committed in Flink. Since the count keeps on rising, this is also not a past problem. Pruning inactive PRs may help, that will also give a more accurate picture of the lack of review bandwidth, if that is the root cause. Thanks, Thomas On Tue, May 15, 2018 at 8:24 AM, Ted Yu <yuzhih...@gmail.com> wrote: > How does the bot decide whether the PR is waiting for reviews or is being > abandoned by contributor ? > > How about letting the bot count the number of times contributor pings > committer(s) for review ? > When unanswered ping count crosses some threshold, say 3, the bot publishes > the JIRA and PR somewhere. > > Cheers > > On Tue, May 15, 2018 at 8:19 AM, Till Rohrmann <trohrm...@apache.org> > wrote: > > > I'm a bit torn here because I can see the pros and cons for both sides. > > > > Maybe a compromise could be to not have a closing but a monitoring bot > > which notifies us about inactive PRs. This could then trigger an > > investigation of the underlying problem and ultimately lead to a > conscious > > decision to close or keep the PR open. As such it is not strictly > necessary > > to have such a bot but it would at least remind us to make a decision > about > > older PRs with no activity. > > > > Cheers, > > Till > > > > On Tue, May 15, 2018 at 3:26 PM, Chesnay Schepler <ches...@apache.org> > > wrote: > > > > > /So far I did it twice for older PRs. In both cases I didn’t get any > > > response and I even forgot in which PRs I had asked this question, so > > now I > > > can not even close them :S/ > > > > > > To be honest this sounds more like an issue with how your organize your > > > work. No amount of closing PRs can fix that. > > > With GitBox we can assign reviewers to a PR, but I'm not sure whether > it > > > only allows committers to assign people. > > > Bookmarks or text files might help as well./ > > > / > > > > > > /Regarding only 30% blocked on contributor. I wonder what would be this > > > number if we tried to ask in the rest of old PRs “Hey, are you still > > > interested in reviewing/merging this?”. If old PR is waiting for a > > > reviewer for X months, it doesn’t mean that’s not abandoned. Even if it > > > doesn’t, reducing our overhead by those 30% of the PRs is something./ > > > > > > No doubt the number would be higher if we were to go back, but as i > > > explained earlier that is not a reason to close it. If a PR is > abandoned > > > because we messed up we should still /try /to get it in. > > > > > > /This is kind of whole point of what I was proposing. If the PR author > is > > > still there, and can respond/bump/interrupt the closing timeout, that’s > > > great. Gives us even more sense of urgency to review it./ > > > > > > Unfortunately knowing that it is more urgent is irrelevant, as we > > > currently don't have the manpower to review them. Reviving them now > would > > > serve no purpose. The alternative is to wait until more people become > > > active reviewers. > > > > > > /As a last resort, closing PR after timeout is not the end of the > world. > > > It always can be reopened./ > > > > > > Let's be realistic here, it will not be reopened. > > > > > > > > > On 15.05.2018 14:21, Piotr Nowojski wrote: > > > > > >> I agree that we have other, even more important, problems with > reviewing > > >> PR and community, but that shouldn’t block us from trying to clean up > > >> things a little bit and minimise the effort needed for reviewing PRs. > > Now > > >> before reviewing/picking older PRs I had to ask this “Hey, are you > still > > >> interested in merging this?” manually and wait for the response. If it > > >> doesn’t come, I have to remember to go back and close PR, which I of > > course > > >> forget to do. Bah, so far I did it twice for older PRs. In both cases > I > > >> didn’t get any response and I even forgot in which PRs I had asked > this > > >> question, so now I can not even close them :S Wasted effort and wasted > > time > > >> on context switching for me and for everyone else that will have to > > scroll > > >> pass or look on those PR to check their status. > > >> > > >> Keep in mind that I am not proposing to close the PR automatically > > >> straight on after 3 months of inactivity. Only after asking a question > > >> whether original contributor is still there and he is interested in > the > > PR > > >> to be reviewed. > > >> > > >> for Flink 1.5, I merged a contribution from PR #1990 after it was > > >>> requested a few times by users. > > >>> > > >> This is kind of whole point of what I was proposing. If the PR author > is > > >> still there, and can respond/bump/interrupt the closing timeout, > that’s > > >> great. Gives us even more sense of urgency to review it. On the other > > hand > > >> if there is no response (no interest from the author for whatever the > > >> reasons) and nobody so far has picked this PR to review/merge, what’s > > the > > >> point of keeping such PR open? As a last resort, closing PR after > > timeout > > >> is not the end of the world. It always can be reopened. > > >> > > >> Regarding only 30% blocked on contributor. I wonder what would be this > > >> number if we tried to ask in the rest of old PRs “Hey, are you still > > >> interested in reviewing/merging this?”. If old PR is waiting for a > > reviewer > > >> for X months, it doesn’t mean that’s not abandoned. Even if it > doesn’t, > > >> reducing our overhead by those 30% of the PRs is something. > > >> > > >> Piotrek > > >> > > >> On 15 May 2018, at 10:10, Fabian Hueske <fhue...@gmail.com> wrote: > > >>> > > >>> I'm with Chesnay on this issue. > > >>> Stale PRs, i.e., a PR where a contributor becomes inactive, are one > of > > >>> our > > >>> smallest issues, IMO. > > >>> > > >>> There are more reasons for the high number of PRs. > > >>> * Lack of timely reviews. > > >>> * Not eagerly closing PRs that have no or very little chance of being > > >>> merged. Common reasons are > > >>> 1) lack of interest in the feature by committers, > > >>> 2) too extensive changes and hence time consuming reviews, or > > >>> 3) bad quality. > > >>> > > >>> For 1), there are lots of older JIRA issues, that have low priority > but > > >>> which are picked up by contributors. In the contribution guidelines > we > > >>> ask > > >>> contributors to let us know when they want to work on an issue. So > far > > >>> our > > >>> attitude has been, yes sure go ahead. I've seen very little attempts > of > > >>> warning somebody to work on issues that won't be easily merged. > > >>> Once a PR has been opened, we should also be honest and let > > contributors > > >>> know that it has no chance or might take a while to get reviewed. > > >>> For 2) this is typically not so much of an issue if the feature is > > >>> interesting. However, if 1) and 2) meet, chances to get a change in > > drop > > >>> even more. > > >>> > > >>> A common "strategy" to deal with PRs that fall into 1), 2), or 3) is > to > > >>> not > > >>> look at them or giving a shallow review. > > >>> Of course, contributors become unresponsive if we don't look at their > > PRs > > >>> for weeks or months. But that's not their fault. > > >>> Instead, I think we should be honest and communicate the chances of a > > PR > > >>> more clearly. > > >>> > > >>> Browsing over the list of open PRs, I feel that most older PRs fall > > into > > >>> the category of low-priority (or even unwanted) features. > > >>> Bug fixes or features that the committers care about usually make it > > into > > >>> the code base. > > >>> In case a contributor becomes inactive, committers often take over an > > >>> push > > >>> a contribution over the line. > > >>> > > >>> So, I do not support an auto-close mechanism. > > >>> I think a better way to address the issue is better communication, > more > > >>> eagerly closing PRs with no chance, and putting more reviewing > effort. > > >>> IMO, we should only eagerly close PRs that have no chance of being > > >>> merged. > > >>> PRs with low-prio features might be picked up later (for Flink 1.5, I > > >>> merged a contribution from PR #1990 after it was requested a few > times > > by > > >>> users). > > >>> > > >>> However, I think we could do a pass over the oldest PRs and check if > we > > >>> can > > >>> close a bunch. > > >>> There are quite a few contributions (many for flink-ml) that I don't > > see > > >>> a > > >>> chance for getting merged. > > >>> > > >>> Best, Fabian > > >>> > > >>> > > >>> - > > >>> > > >>> 2018-05-15 9:13 GMT+02:00 Chesnay Schepler <ches...@apache.org>: > > >>> > > >>> -1 > > >>>> > > >>>> For clarification (since the original mail indicates otherwise), the > > >>>> number of pull requests that this would affect is fairly small. > > >>>> Only about 25-30% of all open PRs are blocked on the contributor, > the > > >>>> remaining ones are actually blocked on the review. > > >>>> Thus is reject the premise that one has to search through that many > > PRs > > >>>> to > > >>>> find something to review, there is plenty. > > >>>> > > >>>> I believe it to be highly unfair for us to close PRs due to > > inactivity, > > >>>> when the primary cause has been /our /own inactivity. > > >>>> If a PR is opened and the first comment comes in 3 months late, > then I > > >>>> don't blame the contributor for not responding. > > >>>> IMO we owe it to the contributor to evaluate their PR, and if > > necessary > > >>>> bring it to a merge-able state (to a certain extend). > > >>>> > > >>>> There's also the fact that closing these PRs outright would waste a > > lot > > >>>> of > > >>>> good contributions. > > >>>> > > >>>> Finally, this solution is overkill for what we want to achieve. If > we > > >>>> want > > >>>> to make it easier to find PRs to review all we need is > > >>>> GitBox integration and tagging or PRs. That's it. We could have a > > /fully > > >>>> /tagged PR list /tomorrow/, if we really wanted to. > > >>>> > > >>>> > > >>>> On 15.05.2018 05:10, Ted Yu wrote: > > >>>> > > >>>> bq. this pull request requires a review, please simply write any > > >>>>> comment. > > >>>>> > > >>>>> Shouldn't the wording of such comment be known before hand ? > > >>>>> > > >>>>> Otherwise pull request waiting for committers' review may be > > >>>>> mis-classified. > > >>>>> > > >>>>> Cheers > > >>>>> > > >>>>> On Mon, May 14, 2018 at 7:59 PM, blues zheng <kisim...@163.com> > > wrote: > > >>>>> > > >>>>> +1 for the proposal. > > >>>>> > > >>>>>> > > >>>>>> Best, > > >>>>>> blues > > >>>>>> On 05/14/2018 20:58, Ufuk Celebi wrote: > > >>>>>> Hey Piotr, > > >>>>>> > > >>>>>> thanks for bringing this up. I really like this proposal and also > > saw > > >>>>>> it work successfully at other projects. So +1 from my side. > > >>>>>> > > >>>>>> - I like the approach with a notification one week before > > >>>>>> automatically closing the PR > > >>>>>> - I think a bot will the best option as these kinds of things are > > >>>>>> usually followed enthusiastically in the beginning but eventually > > >>>>>> loose traction > > >>>>>> > > >>>>>> We can enable better integration with GitHub by using ASF GitBox > > >>>>>> (https://gitbox.apache.org/setup/) but we should discuss that in > a > > >>>>>> separate thread. > > >>>>>> > > >>>>>> – Ufuk > > >>>>>> > > >>>>>> On Mon, May 14, 2018 at 12:04 PM, Piotr Nowojski > > >>>>>> <pi...@data-artisans.com> wrote: > > >>>>>> > > >>>>>> Hey, > > >>>>>>> > > >>>>>>> We have lots of open pull requests and quite some of them are > > >>>>>>> > > >>>>>>> stale/abandoned/inactive. Often such old PRs are impossible to > > merge > > >>>>>> due > > >>>>>> to > > >>>>>> conflicts and it’s easier to just abandon and rewrite them. > > Especially > > >>>>>> there are some PRs which original contributor created long time > ago, > > >>>>>> someone else wrote some comments/review and… that’s about it. > > Original > > >>>>>> contributor never shown up again to respond to the comments. > > >>>>>> Regardless > > >>>>>> of > > >>>>>> the reason such PRs are clogging the GitHub, making it difficult > to > > >>>>>> keep > > >>>>>> track of things and making it almost impossible to find a little > bit > > >>>>>> old > > >>>>>> (for example 3+ months) PRs that are still valid and waiting for > > >>>>>> reviews. > > >>>>>> To do something like that, one would have to dig through tens or > > >>>>>> hundreds > > >>>>>> of abandoned PRs. > > >>>>>> > > >>>>>> What I would like to propose is to agree on some inactivity dead > > line, > > >>>>>>> > > >>>>>>> lets say 3 months. After crossing such deadline, PRs should be > > >>>>>> marked/commented as “stale”, with information like: > > >>>>>> > > >>>>>> “This pull request has been marked as stale due to 3 months of > > >>>>>>> > > >>>>>>> inactivity. It will be closed in 1 week if no further activity > > >>>>>> occurs. If > > >>>>>> you think that’s incorrect or this pull request requires a review, > > >>>>>> please > > >>>>>> simply write any comment.” > > >>>>>> > > >>>>>> Either we could just agree on such policy and enforce it manually > > >>>>>>> (maybe > > >>>>>>> > > >>>>>>> with some simple tooling, like a simple script to list inactive > > PRs - > > >>>>>> seems > > >>>>>> like couple of lines in python by using PyGithub) or we could > think > > >>>>>> about > > >>>>>> automating this action. There are some bots that do exactly this > > (like > > >>>>>> this > > >>>>>> one: https://github.com/probot/stale <https://github.com/probot/ > > stale > > >>>>>> > > > >>>>>> ), > > >>>>>> but probably they would need to be adopted to limitations of our > > >>>>>> Apache > > >>>>>> repository (we can not add labels and we can not close the PRs via > > >>>>>> GitHub). > > >>>>>> > > >>>>>> What do you think about it? > > >>>>>>> > > >>>>>>> Piotrek > > >>>>>>> > > >>>>>>> > > >>>> > > >> > > > > > >