I would make the case for interface stability, not just API stability.
Particularly given that we have significantly changed some of our
interfaces, I want to ensure developers/users are not seeing red flags.

Bugs and code-stability issues can be addressed in minor releases as they
are found, but behavioral and/or interface changes would be a much more
invasive issue for our users.

Regards
Mridul
On 18-May-2014 2:19 am, "Matei Zaharia" <matei.zaha...@gmail.com> wrote:

> As others have said, the 1.0 milestone is about API stability, not about
> saying “we’ve eliminated all bugs”. The sooner you declare 1.0, the sooner
> users can confidently build on Spark, knowing that the application they
> build today will still run on Spark 1.9.9 three years from now. This is
> something that I’ve seen done badly (and experienced the effects thereof)
> in other big data projects, such as MapReduce and even YARN. The result is
> that you annoy users, you end up with a fragmented userbase where everyone
> is building against a different version, and you drastically slow down
> development.
>
> With a project as fast-growing as Spark in particular,
> there will be new bugs discovered and reported continuously, especially in
> the non-core components. Look at the graph of the number of contributors
> over time to Spark: https://www.ohloh.net/p/apache-spark (bottom-most
> graph; “commits”
> changed when we started merging each patch as a single commit). This is not
> slowing down, and we need to establish now a culture of treating API
> stability and release numbers at the level expected of a 1.0 project,
> instead of having people come in and randomly change the API.
>
> I’ll also note that the issues marked “blocker” were marked so by their
> reporters, since the reporter can set the priority. I don’t consider stuff
> like parallelize() not partitioning ranges in the same way as other
> collections a blocker — it’s a bug, it would be good to fix it, but it only
> affects a small number of use cases. Of course if we find a real blocker
> (in particular a regression from a previous version, or a feature that’s
> just completely broken), we will delay the release for that, but at some
> point you have to say “okay, this fix will go into the next maintenance
> release”. Maybe we need to write a clear policy for what the issue
> priorities mean.
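>
> (For illustration only, a minimal Scala sketch; it assumes a running
> SparkContext named sc and an arbitrary partition count of 3. glom()
> collects each partition into an array so the two splits can be compared
> side by side:)
>
>     // Compare how a Range and an equivalent materialized Seq are split.
>     val fromRange = sc.parallelize(1 to 10, 3)          // Range input
>     val fromSeq   = sc.parallelize((1 to 10).toList, 3) // plain Seq input
>
>     // glom() exposes each partition as an array of its elements.
>     fromRange.glom().collect().foreach(p => println(p.mkString(",")))
>     fromSeq.glom().collect().foreach(p => println(p.mkString(",")))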
>
> Finally, I believe it’s much better to have a culture where you can make
> releases on a regular schedule, and have the option to make a maintenance
> release in 3-4 days if you find new bugs, than one where you pile up stuff
> into each release. This is what projects much larger than ours, like Linux, do,
> and it’s the only way to avoid indefinite stalling with a large contributor
> base. In the worst case, if you find a new bug that warrants an immediate
> release, it goes into 1.0.1 a week after 1.0.0 (we can vote on 1.0.1 in
> three days with just your bug fix in it). And if you find an API that you’d
> like to improve, just add a new one and maybe deprecate the old one — at
> some point we have to respect our users and let them know that code they
> write today will still run tomorrow.
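>
> (A minimal sketch of that add-and-deprecate pattern in Scala; the class
> and method names are hypothetical, purely for illustration:)
>
>     class WordCounter {
>       // Old API: kept so existing callers still compile, but flagged.
>       @deprecated("use countWords instead", "1.1.0")
>       def count(text: String): Int = countWords(text)
>
>       // New API: same behavior under a clearer name; callers migrate on
>       // their own schedule instead of being broken by the release.
>       def countWords(text: String): Int =
>         text.split("\\s+").count(_.nonEmpty)
>     }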
>
> Matei
>
> On May 17, 2014, at 10:32 AM, Kan Zhang <kzh...@apache.org> wrote:
>
> > +1 on the running commentary here, non-binding of course :-)
> >
> >
> > On Sat, May 17, 2014 at 8:44 AM, Andrew Ash <and...@andrewash.com> wrote:
> >
> >> +1 on the next release feeling more like a 0.10 than a 1.0
> >> On May 17, 2014 4:38 AM, "Mridul Muralidharan" <mri...@gmail.com> wrote:
> >>
> >>> I had echoed similar sentiments a while back when there was a discussion
> >>> around 0.10 vs 1.0 ... I would have preferred 0.10 to stabilize the API
> >>> changes, add missing functionality, and go through a hardening release
> >>> before 1.0.
> >>>
> >>> But the community preferred a 1.0 :-)
> >>>
> >>> Regards,
> >>> Mridul
> >>>
> >>> On 17-May-2014 3:19 pm, "Sean Owen" <so...@cloudera.com> wrote:
> >>>>
> >>>> On this note, non-binding commentary:
> >>>>
> >>>> Releases happen in local minima of change, usually created by
> >>>> internally enforced code freeze. Spark is incredibly busy now due to
> >>>> external factors -- recently a TLP, recently discovered by a large new
> >>>> audience, ease of contribution enabled by GitHub. It's getting the
> >>>> equivalent of a first year of mainstream battle-testing in a month.
> >>>> It's been very
> >>>> hard to freeze anything! I see a number of non-trivial issues being
> >>>> reported, and I don't think it has been possible to triage all of
> >>>> them, even.
> >>>>
> >>>> Given the high rate of change, my instinct would have been to release
> >>>> 0.10.0 now. But won't it always be very busy? I do think the rate of
> >>>> significant issues will slow down.
> >>>>
> >>>> Version ain't nothing but a number, but if it has any meaning it's the
> >>>> semantic versioning meaning. 1.0 imposes extra handicaps around
> >>>> striving to maintain backwards-compatibility. That may end up being
> >>>> bent to fit in important changes that are going to be required in this
> >>>> continuing period of change. Hadoop does this all the time
> >>>> unfortunately and gets away with it, I suppose -- minor version
> >>>> releases are really major. (On the other extreme, HBase is at 0.98 and
> >>>> quite production-ready.)
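> >>>>
> >>>> (To pin down the semantic-versioning contract being invoked, a tiny
> >>>> illustrative Scala sketch, nothing Spark-specific:)
> >>>>
> >>>>     // Semantic versioning: MAJOR.MINOR.PATCH. Only a MAJOR bump may
> >>>>     // break public APIs; MINOR adds features backwards-compatibly;
> >>>>     // PATCH is for fixes only.
> >>>>     case class Version(major: Int, minor: Int, patch: Int) {
> >>>>       // Code built against `other` keeps working on this runtime
> >>>>       // iff the MAJOR parts match and this version is not older.
> >>>>       def compatibleWith(other: Version): Boolean =
> >>>>         major == other.major &&
> >>>>           (minor > other.minor ||
> >>>>             (minor == other.minor && patch >= other.patch))
> >>>>     }
> >>>>
> >>>> (So Version(1, 9, 9).compatibleWith(Version(1, 0, 0)) holds: exactly
> >>>> the "run on Spark 1.9.9" promise made earlier in the thread.)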
> >>>>
> >>>> Just consider this a second vote for focus on fixes and 1.0.x rather
> >>>> than new features and 1.x. I think there are a few steps that could
> >>>> streamline triage of this flood of contributions, and make all of this
> >>>> easier, but that's for another thread.
> >>>>
> >>>>
> >>>> On Fri, May 16, 2014 at 8:50 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
> >>>>> +1, but just barely.  We've got quite a number of outstanding bugs
> >>>>> identified, and many of them have fixes in progress.  I'd hate to see
> >>>>> those efforts get lost in a post-1.0.0 flood of new features targeted
> >>>>> at 1.1.0 -- in other words, I'd like to see 1.0.1 retain a high
> >>>>> priority relative to 1.1.0.
> >>>>>
> >>>>> Looking through the unresolved JIRAs, it doesn't look like any of the
> >>>>> identified bugs are show-stoppers or strictly regressions (although I
> >>>>> will note that one that I have in progress, SPARK-1749, is a bug that
> >>>>> we introduced with recent work -- it's not strictly a regression
> >>>>> because we had equally bad but different behavior when the
> >>>>> DAGScheduler exceptions weren't previously being handled at all vs.
> >>>>> being slightly mis-handled now), so I'm not currently seeing a reason
> >>>>> not to release.
