I gave up looking through JIRAs a long time ago, so big respect for continuing to triage them. I am afraid we're missing a few important bug reports in the torrent, but most JIRAs are not well-formed, are just questions, are stale, or are simply things that won't be added. I do think it's important to reflect that reality, so I'm always in favor of closing JIRAs more aggressively. I think it is fairly standard practice in projects like TensorFlow/Keras, pandas, etc. to automatically close issues that see no activity for N days. We won't do that, but on the other hand we are probably far too lax about closing them.
Remember that JIRAs stay searchable and can be reopened, so it's not like we lose much information. I'd close anything that hasn't had activity in 2 years (?), as a start. I like the idea of closing things that only affect an EOL release, but many items aren't marked, so we may need to cast the net wider. I think only then does it make sense to bother reproducing or evaluating the 1000s that will still remain.

On Wed, May 15, 2019 at 4:25 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>
> Hi all,
>
> I would like to propose resolving all JIRAs that affect EOL releases - 2.2
> and below - and those with the affected version not specified. I was rather
> against this approach and considered it a last resort when we discussed it
> roughly 3 years ago. Now I think we should go ahead with this. See below.
>
> I have been taking care of this for a long time, almost every day for
> those 3 years. The number of JIRAs keeps increasing and never goes down.
> It is now going over 2500 JIRAs.
> Did you know? In JIRA, we can only go through page by page up to 1000
> items. So currently we even have difficulty going through every JIRA. We
> have to manually filter and check each one.
> The number is going beyond a manageable size.
>
> I am not suggesting this without actually having tried anything. This is
> what we have tried, within my visibility:
>
> 1. Roughly 3 years ago, Sean tried to gather committers and even
>    non-committers to sort out this number. At that time, we were only
>    able to keep this number as is. After we lost this momentum, it
>    started increasing again.
> 2. I scanned _all_ the previous JIRAs at least two times, roughly once a
>    year, and resolved them. The rest of them are mostly obsolete but do
>    not have enough information to investigate further.
> 3. I strictly stick to "Contributing to JIRA Maintenance"
>    https://spark.apache.org/contributing.html and resolve JIRAs.
> 4. Encouraging other people to comment on JIRAs or actively resolve them.
>
> One thing I realised is that the increasing number of committers
> virtually doesn't help with this much (although it might be helpful if
> somebody active in JIRA becomes a committer).
>
> One important thing I should note is that it's now pretty difficult to
> reproduce and test the issues found in EOL releases. We have to git clone,
> checkout, build and test, and then see if the issue still exists in
> upstream, and fix it. This is non-trivial overhead.
>
> Therefore, I would like to propose resolving _all_ the JIRAs that target
> EOL releases - 2.2 and below.
> Please let me know if anyone has any concerns or objections.
>
> Thanks.