Hi Rich,
Thanks for raising this. I just wanted to say that you’ve definitely
been heard, at least by me. I agree that stale or abandoned PRs are a
problem. I don’t have a concrete solution myself; I mostly just review
PRs whenever I find time, which is admittedly quite limited.

> In that specific case, is there a comment, tag, whatever, that I can add to 
> that PR to indicate “this PR has been verified to no longer be relevant, 
> please close”? I obviously cannot close it myself, since I rightfully lack 
> the necessary karma to do so.

At the moment, we don’t have a dedicated tag or comment process for
this. I can create a GitHub label (e.g.,
stale-no-longer-relevant) that contributors can use to mark such PRs.
Those with the necessary permissions, including myself, could then
periodically review and close them. We might even consider a monthly
or bi-monthly sweep, depending on bandwidth and agreement. However,
I’m not sure whether non-committers can apply labels to PRs; I assume
not. If that’s the case, we’ll need to explore alternatives, though
nothing immediate comes to mind.
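
To make the sweep idea a bit more concrete, here is a rough sketch of
how the candidates for one pass could be picked. It is only an
illustration: the PullRequest class is a made-up stand-in rather than
a real GitHub API type, the label name is just the example from above,
and the 30-day idle window is my own assumption (it would give the PR
author a chance to object before anything is closed).

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Example label name from this thread; the label does not exist yet.
CLOSE_LABEL = "stale-no-longer-relevant"


@dataclass
class PullRequest:
    """Hypothetical stand-in for PR metadata, not a real GitHub API type."""
    number: int
    updated_at: datetime
    labels: set[str] = field(default_factory=set)


def sweep_candidates(prs, now, min_idle=timedelta(days=30)):
    """Return the numbers of PRs that carry CLOSE_LABEL and have had no
    activity for at least min_idle, least recently updated first."""
    hits = [
        pr for pr in prs
        if CLOSE_LABEL in pr.labels and now - pr.updated_at >= min_idle
    ]
    return [pr.number for pr in sorted(hits, key=lambda pr: pr.updated_at)]


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    now = datetime(2025, 7, 24, tzinfo=timezone.utc)
    sample = [
        PullRequest(101, now - timedelta(days=400), {CLOSE_LABEL}),
        PullRequest(102, now - timedelta(days=5), {CLOSE_LABEL}),
        PullRequest(103, now - timedelta(days=900)),
    ]
    print(sweep_candidates(sample, now))
```

A committer would still look at each candidate before closing it; the
sketch only narrows down the list.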

Also, as a general policy, we don’t commit PRs without a corresponding
JIRA ticket. That’s now explicitly mentioned in our GitHub PR template
as well. So if a PR lacks a JIRA ticket, it’s unlikely to be merged.
There are also quite a few PRs where a committer has already mentioned
that the change can’t be accepted, but we often leave those open,
perhaps to avoid signaling that the discussion is permanently closed,
or giving the impression that the committer is trying to establish a
precedent or that their decision is final.

> As an aside, I would note that the comments from hadoop-yetus are both hugely 
> informative, and also seriously unhelpful when they are repeated multiple 
> times, but it’s possible (likely?) that this has been addressed in the 10 
> years since then. I’m still working on the most ancient of the open PRs.

Yes, agreed — when a PR has multiple commits, Yetus can flood the
comment section. While we don't plan to disable it (since the test
history is often useful for identifying flaky tests or regressions), I
do think there might be ways to manage its verbosity better. GitHub
does allow comments to be hidden manually, and there may be options
via GitHub Actions or filters to do that automatically — something
worth exploring.
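
As a starting point for that exploration, here is a minimal sketch of
the selection step only: given the comments on a PR, it picks every
hadoop-yetus comment except the most recent one. The Comment class is
a hypothetical stand-in, not a real GitHub API type. Actually hiding
the selected comments would need a separate call to GitHub (I believe
the GraphQL API has a minimizeComment mutation for this, but that
needs to be verified) and is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime

BOT_LOGIN = "hadoop-yetus"  # bot account named in this thread


@dataclass(frozen=True)
class Comment:
    """Hypothetical stand-in for a PR comment, not a real GitHub API type."""
    comment_id: str
    author: str
    created_at: datetime


def superseded_bot_comments(comments, bot_login=BOT_LOGIN):
    """Return the IDs of all comments by bot_login except the newest one,
    ordered from oldest to newest."""
    bot_comments = sorted(
        (c for c in comments if c.author == bot_login),
        key=lambda c: c.created_at,
    )
    # Slicing off the last element keeps the latest report visible;
    # an empty or single-element list yields no IDs.
    return [c.comment_id for c in bot_comments[:-1]]
```

Keeping the latest report visible means the current test status stays
easy to find, while older reports remain available (just collapsed)
for anyone chasing a flaky test.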

Let me see if I can carve out some time over the weekend to help clean
up a few PRs. Lastly, I just want to say thank you, on behalf of the
community, for stepping in and helping out. Your effort and initiative
are truly appreciated.

Best,
Ayush

On Wed, 23 Jul 2025 at 19:44, Rich Bowen <rbo...@rcbowen.com> wrote:
>
> Hi, folks,
>
> I’ve been digging into the open PR queue over the past couple of days. To be 
> blunt, 1100 open PRs is hugely intimidating to someone trying to figure out 
> where to get started, and it’s also a “broken window” that makes a beginner 
> think that their contributions are likely to be overlooked. I’m trying to 
> figure out some practical ways that I can help address this.
>
> I see that there was a discussion - 
> https://lists.apache.org/thread/6g3n4wo3b3tpq2qxyyth3y8m9z4mcj8p - way back 
> in July 2021, about what to do with stale PRs (there were 400 at the time), 
> and I think the consensus then was that auto-closing abandoned PRs was a very 
> unfriendly thing to do. I have some sympathy with that position. But, I don’t 
> see any concrete followups on that in the years since.
>
> I wonder if I can take a case study and ask what I can practically do in this 
> situation, and then possibly generalize to other cases, so that I can help 
> chip away at this.
>
> Looking at https://github.com/apache/hadoop/pull/63 it’s a trivial PR to 
> address a typo. There are some responses requesting improvements, but 
> ultimately the PR is abandoned. Meanwhile, the problem got solved elsewhere, 
> so the PR is no longer relevant. In that specific case, is there a comment, 
> tag, whatever, that I can add to that PR to indicate “this PR has been 
> verified to no longer be relevant, please close”? I obviously cannot close it 
> myself, since I rightfully lack the necessary karma to do so.
>
> This is just called out as an example. I have 84 other PRs that fall into 
> that specific category (i.e., a typo fix that was ultimately abandoned with no 
> action) that I’d like to help with. I’m not looking to create work for 
> someone else, but looking for how I can help, and also document that process 
> for others to help work down the backlog.
>
> As an aside, I would note that the comments from hadoop-yetus are both hugely 
> informative, and also seriously unhelpful when they are repeated multiple 
> times, but it’s possible (likely?) that this has been addressed in the 10 
> years since then. I’m still working on the most ancient of the open PRs.
>
> Thanks!
>
> —Rich
>
>
> Stats, for those who enjoy stats:
>
> • **Total Open PRs:** 1,167
> • **PRs abandoned for 2+ years:** 684 (58.6% of all open PRs)
> • **PRs abandoned for 3+ years:** 477 (40.9% of all open PRs)
> • **PRs abandoned for 5+ years:** 128 (11.0% of all open PRs)
>
> (“Abandoned” is defined as no comment/update of any kind in that period.)
>
>
>
