+1 in favor of some sort of JIRA cleanup.

My only request is that we attach some sort of 'bulk-closed' label to
issues that we close via JIRA filter batch operations (and resolve the
issues as "Timed Out" / "Cannot Reproduce", not "Fixed"). Using a label
makes it easier to audit what was closed, simplifying the process of
identifying and re-opening valid issues caught in our dragnet.


On Wed, May 15, 2019 at 7:19 AM Sean Owen <sro...@gmail.com> wrote:

> I gave up looking through JIRAs a long time ago, so, big respect for
> continuing to try to triage them. I am afraid we're missing a few
> important bug reports in the torrent, but most JIRAs are not
> well-formed, just questions, stale, or simply things that won't be
> added. I do think it's important to reflect that reality, and so I'm
> always in favor of more aggressively closing JIRAs. I think it is
> more standard practice, in projects like TensorFlow/Keras, pandas,
> etc., to just automatically drop issues that don't see activity for N
> days. We won't do that, but on the other hand we are probably far too
> lax in closing them.
>
> Remember that JIRAs stay searchable and can be reopened, so it's not
> like we lose much information.
>
> I'd close anything that hasn't had activity in 2 years (?), as a start.
> I like the idea of closing things that only affect an EOL release,
> but, many items aren't marked, so may need to cast the net wider.
>
> I think only then does it make sense to look at bothering to reproduce
> or evaluate the 1000s that will still remain.
>
> On Wed, May 15, 2019 at 4:25 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
> >
> > Hi all,
> >
> > I would like to propose resolving all JIRAs that affect EOL releases
> > (2.2 and below), as well as those with no affected version specified.
> > I was rather against this approach and considered it a last resort when
> > we discussed it roughly 3 years ago. Now I think we should go ahead
> > with it. See below.
> >
> > I have been taking care of this almost every day for those 3 years.
> > The number of JIRAs keeps increasing and never goes down. It is now
> > going over 2500 JIRAs. Did you know? In JIRA, we can only page through
> > up to 1000 items, so currently we are even having difficulty going
> > through every JIRA. We have to manually filter them and check each one.
> > The number is growing beyond a manageable size.
> >
> > I am not suggesting this without actually having tried anything. This is
> > what we have tried, as far as I can see:
> >
> >   1. Roughly 3 years ago, Sean tried to gather committers and even
> >     non-committers to sort out this number. At that time, we were only
> >     able to keep the number as it was. After we lost this momentum, it
> >     started increasing again.
> >   2. I scanned _all_ the previous JIRAs more than twice, roughly once
> >     a year, and resolved them. The rest are mostly obsolete but lack
> >     enough information to investigate further.
> >   3. I strictly stick to "Contributing to JIRA Maintenance"
> >     https://spark.apache.org/contributing.html and resolve JIRAs.
> >   4. Encouraging other people to comment on JIRAs or actively resolve them.
> >
> > One thing I realised is that the increasing number of committers
> > hardly helps with this (although it might help if somebody active
> > in JIRA becomes a committer).
> >
> > One important thing I should note is that it is now quite difficult to
> > reproduce and test issues found in EOL releases. We have to git clone,
> > check out, build and test, then see whether the issue still exists
> > upstream, and fix it. This is non-trivial overhead.
> >
> > Therefore, I would like to propose resolving _all_ the JIRAs that
> > target EOL releases (2.2 and below).
> > Please let me know if anyone has any concerns or objections.
> >
> > Thanks.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
