Re: [Discussion] Set further policies for triaging issues

Jarek Potiuk Sun, 12 Feb 2023 12:29:50 -0800

Maybe I am not getting the full extent of the proposal and maybe I
"hijacked" it a bit, but my comment was really related to (4) - new
issues. My mistake. Let me comment on those.


1) 2) 3)  -> This is good as a cleanup. We can do this as a manually
run bulk process to add such comments now for all such old issues and
run them periodically (every few months) if needed. That will be
simpler. I am fine with that. We can even initially run it manually
and if we find it useful we can turn it into a bot. But I think we
should not have to do it again and this is what I mostly commented on.

> In item (4) I suggested adding a needs-triage label to make sure that any 
> issue we get will have at least 1 committer/PMC/Triage member eyes on it.

Setting the label does not mean that someone will have eyes on it.
It's a good starting point though. I think measuring and incentivising
responsiveness for new issues is key. And if we make sure that we
respond to issues in a timely manner, the current stale-bot is enough.
It will be closing only issues/PRs which are "pending response". All
the other issues will be either acknowledged by a maintainer and at
least vaguely planned to work on, or converted to a discussion, or
fixed or marked as "good first issue" for anyone to pick up.
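
As a rough illustration of what measuring responsiveness could look like, here is a minimal sketch in Python. The Issue record, its field names and the response_stats helper are hypothetical and only for illustration; in practice the timestamps would have to be collected from the issue tracker. The one-week threshold mirrors the "more than a week" without a response mentioned later in this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional, Tuple


@dataclass
class Issue:
    # Hypothetical record; these fields are assumptions, not a real API.
    number: int
    opened_at: datetime
    # None means that no maintainer has responded yet.
    first_maintainer_response_at: Optional[datetime]


def response_stats(
    issues: List[Issue],
    now: datetime,
    threshold: timedelta = timedelta(days=7),
) -> Tuple[Optional[timedelta], List[int]]:
    """Return the median time to first response and the overdue issue numbers."""
    # Time to first response, for issues that did get one.
    answered = [
        i.first_maintainer_response_at - i.opened_at
        for i in issues
        if i.first_maintainer_response_at is not None
    ]
    # Issues that are still unanswered after the threshold has passed.
    overdue = [
        i.number
        for i in issues
        if i.first_maintainer_response_at is None
        and now - i.opened_at > threshold
    ]
    median_response = median(answered) if answered else None
    return median_response, overdue
```

The list of overdue issue numbers is the part that could feed a periodic summary, while the median gives a single number to track over time.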

J.



On Sun, Feb 12, 2023 at 8:37 PM Elad Kalif <elad...@apache.org> wrote:
>
> I'm not sure if the scenario you are worried about can happen?
>
> In item (4) I suggested adding a needs-triage label to make sure that any 
> issue we get will have at least 1 committer/PMC/Triage member eyes on it.
> In this step the issue can be accepted, in which case this label is replaced 
> by (reported_version), or rejected and closed/converted to a discussion.
> If a year passes and no one has done anything with the issue, the automation 
> will simply ask the user to let us know if the issue still happens on a newer 
> Airflow version.
> The issue may have already been solved and we just didn't notice. If the 
> user doesn't comment within a defined time frame, the automation will close 
> the issue (if someone later says we were wrong, we can always reopen it).
> This is basically what happens today, just as a manual process.
>
> - What happens if the user replies that it's reproducible?
> We will replace the previous reported_version with a new one (for example: 
> reported_version:2.0 -> reported_version:2.5). This will bump the issue to 
> the latest bug lists.
>
>
> On Sun, Feb 12, 2023 at 5:48 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> TL;DR; I think we should first solve the issue of improving our
>> "responsiveness" as committers. I believe once we solve it, the
>> stale bot closing issues will be a useful and non-offensive tool.
>>
>> (Sorry for a long email, but I have been thinking a lot about it and
>> I had many observations that came from responding and triaging our
>> issues and PRs and discussions - this took likely 30%/40% of my time
>> over the last few months).
>>
>> Yes. I also have some serious doubts about closing issues "blindly" by
>> one criterion only - time of inactivity. I think this is just wrong
>> and we should not do it.
>>
>> I agree with Ash that it is really infuriating when I have opened an
>> issue, then 3 months pass and it is closed due to
>> inactivity. This is simply offensive - no matter if it's a bug or
>> issue or PR. But I think the problem is that we have to deal
>> better with the issues as maintainers, not that the stale bot is a
>> problem.
>>
>> But this is only one side of the story - If an issue is first marked
>> stale and then closed while it is "pending response" from the user -
>> I have absolutely no problem with that. If the user who opened it is
>> asked for extra information and has not found the time to provide it -
>> there is absolutely no reason it should take the mental space of the
>> maintainers and we should close it automatically. We can always
>> re-open it if the user comes back with more information.
>>
>>
>> But coming back to closing issues without reaction from anyone. If
>> this kind of closing happens, we should be ashamed of it, and it means
>> something else. It means that we as maintainers have done a bad job
>> in triaging this issue. This is really an indication that no-one -
>> neither regular contributors (which happens) nor maintainers (who
>> should look at it if no contributor does) found the time to read,
>> analyse and respond to an issue. No matter what response it will be.
>> ANY response from a maintainer (won't fix, asking for more information,
>> asking the community to provide more evidence if
>> the issue is impossible to diagnose, converting to a discussion) is
>> better than silence. Way, way better.
>>
>> From my point of view, I think the real problem we have is that we
>> often have issues open for weeks or months without ANY interaction -
>> or there is no interaction after the user provided some kind of
>> response, additional information etc. Every now and then I do a
>> "streak" where I try to provide A response to EVERY issue and PRs
>> opened and not responded to for the last few weeks. And there are a
>> number of those for issues or PRs that are even 3-4 weeks without any
>> answer.
>>
>> And I am as guilty as everyone else here, but I have a feeling that
>> we collectively as maintainers should spend quite a good chunk of our time
>> triaging and responding to issues in due time. I think if we end up
>> with a situation where a user raises an issue or PR or provides
>> feedback/a new iteration etc. and there is absolutely no response for
>> more than a week - this is an indication we have a huge problem. And
>> the worst types of those are where someone "requests changes", those
>> changes are applied and the user pings the reviewer and there are
>> weeks of no response (even to multiple pings). Those happen rarely in
>> our project and I think they are even a bit disrespectful to the users
>> who had "done their part".
>>
>> And I believe (I have no stats, just gut feeling) that we have that to
>> some extent - for features, bugs, PRs, discussions.
>>
>> If this happens a lot, then I think this is equally offensive (or
>> even more so) as closing stale issues. I think out of many stats,
>> "average response time" to an issue is absolutely the most important to
>> see how good the community is at handling issues and PRs. This should
>> apply both to new issues and PRs, and to issues that have been
>> opened and not responded to after the user provided a response back.
>>
>> Now - many of those are not "intentional" and involve absolutely no "bad will"
>> - and it is mostly because we do not realise that we have a problem.
>>
>> We are all humans and have our daily issues and jobs and a lot of what
>> we do for our issues is done in our free time. But maybe we can
>> automate and improve that part - which in turn will make our stale bot
>> far "nicer" as it will only have to deal with the case where the
>> "user" has not provided necessary input and the maintainers looked and
>> responded to it.
>>
>> I do not have a very concrete proposal, but some vague ideas how this
>> could be improved:
>>
>> * Maybe we should start with building some simple stats and seeing our
>> responsiveness and find out if we really have a problem there. I am
>> sure there must be some tools for that and we might write ours if
>> needed - I remember we discussed similar issues in the past
>>
>> * Then maybe we can figure out a way to share the burden of reviews
>> between more committers somehow. For example identify issues and PRs
>> that have not been responded to or followed up on for some time and find
>> some way to incentivise and involve committers to provide feedback on
>> those
>>
>> * The stats could help us to understand if we are falling behind and
>> maybe we could have some weekly summary of stats that would help us
>> with understanding if we should do something and step up our efforts in
>> triaging
>>
>> I think - if we do that then the only thing that Stale bot will be
>> doing is closing issues and PRs which have not received input from
>> the user. Which is perfectly fine IMHO.
>>
>> On Sun, Feb 12, 2023 at 10:56 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>> >
>> > Got it, yes that makes sense to me!
>> >
>> > On 12 February 2023 09:36:44 GMT, Elad Kalif <elad...@apache.org> wrote:
>> >>
>> >> Thanks for the comments. My replies are in blue for all points raised.
>> >>
>> >> > We have currently more than 700 issues and many of them have had no 
>> >> > activity for a year. What will we do with those issues?
>> >>
>> >> Half of the open issues are feature requests and thus will not be 
>> >> impacted. The thing I'm trying to resolve here is to know if the old 
>> >> issue is still reproducible on main/latest version. If so, the issue will 
>> >> be tagged appropriately and will be kept open. If the author does not 
>> >> respond, we can assume the issue is no longer relevant and close it.
>> >>
>> >> > Why close only stale issues and not stale PRs?
>> >>
>> >> We already have that. The stale bot works for PRs (excluding ones with 
>> >> the pinned label).
>> >>
>> >> > There is nothing I find more infuriating and demoralising when dealing 
>> >> > with an open source project (and big ones like Kubernetes are the worst 
>> >> > offenders at this) than finding that a bug or feature request is closed 
>> >> > simply due to lack of traction.
>> >>
>> >> I understand and share your concerns. First, this suggestion is just 
>> >> about bugs, not about features. The automation calls for action from the 
>> >> author to recheck the issue.
>> >> This is something I'm doing today manually by going issue by issue and 
>> >> commenting the exact same thing: "Does this issue happen in the latest 
>> >> Airflow version?" The auto-close part is something that happens today 
>> >> when we add the pending-response label. My goal is to make sure that the 
>> >> list of open bugs we have is relevant. I'm not against larger intervals 
>> >> should we decide on them. To clarify, I'm not suggesting closing bug 
>> >> reports because of a lack of traction; I'm suggesting closing reports 
>> >> that are not on recent versions of Airflow. In practice I don't see 
>> >> people trying to reproduce bugs reported on 2.0 in latest main - this 
>> >> simply doesn't happen, so by having this process we are asking the author 
>> >> to recheck their report. If the issue is still reproducible, then by 
>> >> letting us know that and by having the proper labels it might get more 
>> >> traction.
>> >>
>> >>
>> >> On Sun, Feb 12, 2023 at 10:15 AM Ash Berlin-Taylor <a...@apache.org> 
>> >> wrote:
>> >>>
>> >>> I feel very strongly against automated closing of _issues_.
>> >>>
>> >>> There is nothing I find more infuriating and demoralising when dealing 
>> >>> with an open source project (and big ones like Kubernetes are the worst 
>> >>> offenders at this) than finding that a bug or feature request is closed 
>> >>> simply due to lack of traction.
>> >>>
>> >>> I might be okay with a very long time (such as stale after 1 year and 
>> >>> close another year after that.)
>> >>>
>> >>> Ash
>> >>>
>> >>> On 12 February 2023 02:00:00 GMT, Pankaj Singh 
>> >>> <ags.pankaj1...@gmail.com> wrote:
>> >>>>
>> >>>> Hi Elad,
>> >>>>
>> >>>> Thanks for bringing up this topic.
>> >>>>
>> >>>> I also feel we should have some automation to close stale issues.
>> >>>>
>> >>>> A few questions I have:
>> >>>> - We have currently more than 700 issues and many of them have had no 
>> >>>> activity for a year. What will we do with those issues?
>> >>>> - Why close only stale issues and not stale PRs?
>> >>>>
>> >>>>
>> >>>> On Sun, Feb 12, 2023 at 1:23 AM Elad Kalif <elad...@apache.org> wrote:
>> >>>>>
>> >>>>> Hi everyone,
>> >>>>>
>> >>>>> It's been a while since we talked about the issue triage process. 
>> >>>>> Currently our process involves a lot of manual work of pinging issue 
>> >>>>> authors and I'm looking to automate some of it.
>> >>>>>
>> >>>>> Here are my suggestions:
>> >>>>>
>> >>>>> 1. Add a new bot automation to detect core bug issues (kind:bug, 
>> >>>>> area:code) that are over 1 year old without any activity. The bot will 
>> >>>>> add a comment asking the user to check the issue against the latest 
>> >>>>> Airflow version and assign a "pending-response" label. If the user 
>> >>>>> does not respond, the issue will be marked stale and will be closed by 
>> >>>>> our current stale bot automation. I suggest 1 year here because in 1 
>> >>>>> year we usually have 3 feature releases + many bug-fix releases which 
>> >>>>> contain a lot of fixes. We don't normally go back to check bugs on 
>> >>>>> older versions unless they are reported as reproducible on the latest 
>> >>>>> version. There can be 2 outcomes of this:
>> >>>>>
>> >>>>> - The author will comment and say it is reproducible. In that case we 
>> >>>>> will assign the updated affected_version label and the issue will be 
>> >>>>> bumped up.
>> >>>>> - The author will not comment. In that case we can assume the problem 
>> >>>>> is fixed/not relevant and the issue will be closed.
>> >>>>>
>> >>>>> 2. similar to (1) for providers with labels (kind:bug, area:provider) 
>> >>>>> and with a shortened time period of 6 months as providers release 
>> >>>>> frequently.
>> >>>>>
>> >>>>> 3. similar to (1) for airflow-client-python and airflow-client-go with 
>> >>>>> no labels and period of 6 months as well.
>> >>>>>
>> >>>>> 4. On another front, we sometimes miss the triage of new issues. My 
>> >>>>> suggestion is that any new issue opened will automatically have a 
>> >>>>> needs-triage label (this is a practice several other projects use). That 
>> >>>>> way we can easily filter the list of issues that need first review. 
>> >>>>> When triaging the issue we will remove the label and assign proper 
>> >>>>> ones (good first issue, area, kind, etc..)
>> >>>>>
>> >>>>> What do others think?
>> >>>>>
>> >>>>> Elad
>> >>>>>
>> >>>>>
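
The selection rules in items (1)-(3) of the proposal quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the bot that was eventually built: the function names, the repository name "airflow" in the usage example, the day counts used to approximate "1 year" and "6 months", and the choice to skip issues that already carry the pending-response label are all assumptions. The label names are taken as written in the proposal.

```python
from datetime import datetime, timedelta
from typing import Optional, Set

# Approximations of the periods named in the proposal (assumed day counts).
ONE_YEAR = timedelta(days=365)
SIX_MONTHS = timedelta(days=182)

# Item (3): client repositories, checked without any label filter.
CLIENT_REPOS = {"airflow-client-python", "airflow-client-go"}


def inactivity_threshold(repo: str, labels: Set[str]) -> Optional[timedelta]:
    """Return the inactivity period for an issue, or None if it is out of scope."""
    if repo in CLIENT_REPOS:
        return SIX_MONTHS  # item (3)
    if "kind:bug" not in labels:
        return None  # feature requests are not affected
    if "area:provider" in labels:
        return SIX_MONTHS  # item (2)
    if "area:code" in labels:
        return ONE_YEAR  # item (1)
    return None


def needs_recheck_ping(
    repo: str, labels: Set[str], last_activity: datetime, now: datetime
) -> bool:
    """Decide whether the bot should ask the author to recheck the issue."""
    threshold = inactivity_threshold(repo, labels)
    # Assumption: an issue already waiting for a response is not pinged again.
    if threshold is None or "pending-response" in labels:
        return False
    return now - last_activity > threshold
```

For example, needs_recheck_ping("airflow", {"kind:bug", "area:code"}, last_activity, now) is true only once more than 365 days have passed since the last activity. Closing is left to the existing stale bot, as described in the proposal.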
