Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Matt Cheah
I want to specifically highlight and +1 a point that Ryan brought up: A commitment binds us to do this and make a reasonable attempt at finishing on time. If we choose not to commit, or if we choose to commit and don’t make a reasonable attempt, then we need to ask, “what happened?” Is Spark

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Mridul Muralidharan
I am -1 on this vote for pretty much all the reasons that Mark mentioned. A major version change gives us an opportunity to remove deprecated interfaces, stabilize experimental/developer APIs, drop support for outdated functionality/platforms and evolve the project with a vision for foreseeable fu

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Joseph Torres
I'm not worried about rushing. I worry that, without clear parameters for the amount or types of DSv2 delays that are acceptable, we might end up holding back 3.0 indefinitely to meet the deadline when we wouldn't have made that decision de novo. (Or even worse, the PMC eventually feels they must r

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Ryan Blue
The question is, what does it bind? I’m not pushing for a binding statement to do this or delay the 3.0 release because I don’t think that’s a very reasonable thing to do. It may well be that there is a good reason for missing the goal. So “what does it bind?” is an apt question. A commitment bi

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Sean Owen
This is a fine thing to VOTE on. Committers (and community, non-binding) can VOTE on what we like; we just don't do it often where not required because it's a) overkill overhead over simple lazy consensus, and b) it can be hard to say what the binding VOTE binds if it's not a discrete commit or rel

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Joseph Torres
I’m sure we, as a community, will seriously consider any proposal that Spark would benefit if the PMC delays release X to include changes A, B, C. Indeed, every release I remember has had a few iterations of “can we hold the train for a bit because it would be super great to get this PR in”. Many

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Mark Hamstra
I agree that adding new features in a major release is not forbidden, but that is just not the primary goal of a major release. If we reach the point where we are happy with the new public API before some new features are in a satisfactory state to be merged, then I don't want there to be a prior p

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Ryan Blue
Mark, I disagree. Setting common goals is a critical part of getting things done. This doesn't commit the community to push out the release if the goals aren't met, but does mean that we will, as a community, seriously consider it. This is also an acknowledgement that this is the most important fe

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Ryan Blue
Mark, if this goal is adopted, "we" is the Apache Spark community.
On Thu, Feb 28, 2019 at 9:52 AM Mark Hamstra wrote:
> Who is "we" in these statements, such as "we should consider a functional
> DSv2 implementation a blocker for Spark 3.0"? If it means those
> contributing to the DSv2 effort w

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Mark Hamstra
Then I'm -1. Setting new features as blockers of major releases is not proper project management, IMO.
On Thu, Feb 28, 2019 at 10:06 AM Ryan Blue wrote:
> Mark, if this goal is adopted, "we" is the Apache Spark community.
>
> On Thu, Feb 28, 2019 at 9:52 AM Mark Hamstra
> wrote:
> >> Who is "we

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Mark Hamstra
Who is "we" in these statements, such as "we should consider a functional DSv2 implementation a blocker for Spark 3.0"? If it means those contributing to the DSv2 effort want to set their own goals, milestones, etc., then that is fine with me. If you mean that the Apache Spark project should offici

Re: [VOTE] Functional DataSourceV2 in Spark 3.0

2019-02-28 Thread Matt Cheah
+1 (non-binding) Are identifiers and namespaces going to be rolled under one of those six points?
From: Ryan Blue
Reply-To: "rb...@netflix.com"
Date: Thursday, February 28, 2019 at 8:39 AM
To: Spark Dev List
Subject: [VOTE] Functional DataSourceV2 in Spark 3.0
I’d like to call a vote