Since we haven't heard any objections to this, the documentation has been
updated (thanks to Sean).
All devs, please make sure to re-read: http://spark.apache.org/contributing.html
Note that the set of labels used in Jira has been documented, and correctness
or data loss issues should be marked as Blocker by default. There is also a
label to mark a jira as needing an entry in the release notes.
Tom
On Tuesday, August 14, 2018, 3:32:27 PM CDT, Imran Rashid
<[email protected]> wrote:
+1 on what we should do.
On Mon, Aug 13, 2018 at 3:06 PM, Tom Graves <[email protected]>
wrote:
> I mean, what are concrete steps beyond saying this is a problem? That's the
> important thing to discuss.
Sorry I'm a bit confused by your statement but also think I agree. I started
this thread for this reason. I pointed out that I thought it was a problem and
also brought up things I thought we could do to help fix it.
Maybe I wasn't clear in the first email: the list of things I had were
proposals on what we do for a jira that is for a correctness/data loss issue.
It's the committers and developers that are involved in this, though, so if
people don't agree or aren't going to do them, then it doesn't work.
Just to restate what I think we should do:
- Label any correctness/data loss jira with "correctness".
- Mark the jira as a blocker by default if someone suspects a corruption/loss
  issue.
- Make sure the description is clear about when it occurs and the impact to
  the user.
- Ensure it's backported to all active branches.
- See if we can have a separate section in the release notes for these.
The last one, I guess, is more of a one-time thing that I can file a jira for.
The first 4 would be done for each jira filed.
I'm proposing we do these things and as such if people agree we would also
document those things in the committers or developers guide and send email to
the list.
Tom
On Monday, August 13, 2018, 11:17:22 AM CDT, Sean Owen
<[email protected]> wrote:
Generally: if someone thinks correctness fix X should be backported further,
I'd say just do it, if it's to an active release branch (see below). Anything
that important has to outweigh most any other concern, like behavior changes.
On Mon, Aug 13, 2018 at 11:08 AM Tom Graves <[email protected]> wrote:
> I'm not really sure what you mean by this. This proposal is to introduce a
> process for this type of issue so it's at least brought to people's
> attention. We can't do anything to make people work on certain things. If
> they aren't raised as important issues then it's really easy to miss these
> things. If it's a blocker we should also not be doing any new releases
> without a fix for it, which may motivate people to look at it.
I mean, what are concrete steps beyond saying this is a problem? That's the
important thing to discuss.
There's a good one here: let's say anything that's likely to be a correctness
or data loss issue should automatically be labeled 'correctness' and set to
Blocker.
That can go into the how-to-contribute manual in the docs and in a note to
dev@.
> I agree it would be good for us to make it more official about which
> branches are being maintained. I think at this point it's still 2.1.x,
> 2.2.x, and 2.3.x since we recently did releases of all of these. Since 2.4
> will be coming out we should definitely think about no longer maintaining
> 2.1.x. Perhaps we need a table on our release page about this. But this
> should be a separate thread.
I propose writing something like this in the 'versioning' doc page, to at least
establish a policy:
Minor release branches will, generally, be maintained with bug fix releases
for a period of 18 months. For example, branch 2.1.x is no longer considered
maintained as of July 2018, 18 months after the release of 2.1.0 in December
2016.
This gives us -- and more importantly users -- some understanding of what to
expect for backporting and fixes.
I am going to revive the thread about adding PMC / committers as it's overdue.
That may not do much, but more hands to do more work ought to free up people
to focus on deeper, harder issues.