This isn't about one specific JIRA. I was looking at all the recent JIRAs with the "correctness" label, and in my opinion things are definitely being handled inconsistently (https://issues.apache.org/jira/issues/?jql=labels+%3D+correctness). The inconsistencies are in the things I've mentioned above: priority is not set high enough, the description is not clear, and some fixes are backported to 2.2 while others are not. Obviously there could be issues without the "correctness" label as well, since until recently I was also not aware that this label should be applied to this type of issue. We have no real guidelines in this area for developers and committers to follow, so I think defining some would help everyone.
I realize everyone's time is important and everyone has different priorities, but I think this sort of issue is one we as a community should take care of above everything else. If I'm a business using Apache Spark for business-critical things and I find that there are data loss or corruption issues consistently in the releases and it's not our highest priority to fix them, I'm going to be very hesitant to use and stay with Spark.

One specific example of priority is in the 2.4 code freeze/release thread, where it was brought up to release without SPARK-23243. And really we have done a bunch of releases without this, but until recently it wasn't marked as a blocker either. I'll admit that I missed this JIRA when it was filed and only recently became aware of it. I changed the priority on it.

> I share frustration that Somebody should be working on Important Things, but don't think the difference between getting those done and not done is reminding people that Important Things need doing. What's the cause that leads to concrete corrective action?

I'm not really sure what you mean by this. This proposal is to introduce a process for this type of issue so it's at least brought to people's attention. We can't do anything to make people work on certain things. If they aren't raised as important issues, then it's really easy to miss these things. If it's a blocker, we should also not be doing any new releases without a fix for it, which may motivate people to look at it.

I agree it would be good for us to make it more official which branches are being maintained. I think at this point it's still 2.1.x, 2.2.x, and 2.3.x, since we recently did releases of all of these. Since 2.4 will be coming out, we should definitely think about stopping maintenance of 2.1.x. Perhaps we need a table on our release page about this. But this should be a separate thread.
Tom

On Monday, August 13, 2018, 9:03:42 AM CDT, Sean Owen <sro...@gmail.com> wrote:

I doubt the question is whether people want to take such issues seriously -- all else equal, of course everyone does. A JIRA label plus a place in the release notes sounds like a good concrete step that isn't happening consistently now. That's a clear flag that at least one person believes issue X is a blocker.

Is this about specific JIRAs? I think it's more useful to illustrate in the context of specific issues. For example, I haven't been following JIRAs well, and don't know what is being contested here.

I share frustration that Somebody should be working on Important Things, but don't think the difference between getting those done and not done is reminding people that Important Things need doing. What's the cause that leads to concrete corrective action? Do we need more committers? Fewer new features? More conservative releases? Less work on X to work on this?

Lastly, you raise an important question as an aside, one we haven't answered: when does a branch go inactive? I am sure 2.0.x is inactive, de facto, along with all 1.x. I think 2.1.x is inactive too. Should we put any rough guidance in place? Say, a branch is maintained for 12-18 months?

On Mon, Aug 13, 2018 at 8:45 AM Tom Graves <tgraves...@yahoo.com.invalid> wrote:

Hello all,

I've noticed some inconsistencies in the way we are handling data loss/correctness issues. I think we need to take these very seriously, as they could be costing businesses real money and impacting real decisions and business logic. I would like to discuss how we can make sure these are handled consistently and with urgency going forward.

A few things I would like to propose are below. Most of these are up to the developers and committers to ensure happen, so I want to know what everyone thinks and whether people have other ideas.

- Label any correctness/data loss JIRA with "correctness".
- Mark the JIRA as a blocker by default if someone suspects a corruption/loss issue.
- Make sure the description is clear about when it occurs and the impact to the user.
- Ensure it's backported to all active branches.
- See if we can have a separate section in the release notes for these.

Thanks,
Tom Graves