+1 (non-binding)
On Thu, Feb 28, 2019 at 9:11 AM Matt Cheah wrote:
> +1 (non-binding)
>
>
>
> *From: *Jamison Bennett
> *Date: *Thursday, February 28, 2019 at 8:28 AM
> *To: *Ryan Blue , Spark Dev List
> *Subject: *Re: [VOTE] SPIP: Spark API for Table Metadata
>
>
>
> +1 (non-binding)
>
>
> *J
I want to specifically highlight and +1 a point that Ryan brought up:
A commitment binds us to do this and make a reasonable attempt at finishing on
time. If we choose not to commit, or if we choose to commit and don’t make a
reasonable attempt, then we need to ask, “what happened?” Is Spark
I am -1 on this vote for pretty much all the reasons that Mark mentioned.
A major version change gives us an opportunity to remove deprecated
interfaces, stabilize experimental/developer api, drop support for
outdated functionality/platforms and evolve the project with a vision
for foreseeable fu
I'm not worried about rushing. I worry that, without clear parameters for
the amount or types of DSv2 delays that are acceptable, we might end up
holding back 3.0 indefinitely to meet the deadline when we wouldn't have
made that decision de novo. (Or even worse, the PMC eventually feels they
must r
The question is, what does it bind?
I’m not pushing for a binding statement to do this or delay the 3.0 release
because I don’t think that’s a very reasonable thing to do. It may well be
that there is a good reason for missing the goal.
So “what does it bind?” is an apt question.
A commitment bi
Hi there,
Would you be able to give advice on how best to compare a previous row value in
a structured streaming DF with the current one?
Kind regards,
Raphael
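One possible approach (a sketch only, not from this thread): keep the last value
seen per key in arbitrary per-key state with mapGroupsWithState, so each
micro-batch can be compared against the previously seen value. The Reading bean
and its sensorId/value fields below are hypothetical placeholders.

import java.io.Serializable;

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.api.java.function.MapGroupsWithStateFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.streaming.GroupState;

public class PreviousRowSketch {

  // Hypothetical input record; any Java bean with a key and a value would do.
  public static class Reading implements Serializable {
    private String sensorId;
    private double value;
    public String getSensorId() { return sensorId; }
    public void setSensorId(String sensorId) { this.sensorId = sensorId; }
    public double getValue() { return value; }
    public void setValue(double value) { this.value = value; }
  }

  // Emits "key: previous=... current=..." for each key updated in a micro-batch.
  static Dataset<String> compareWithPrevious(Dataset<Reading> readings) {
    return readings
        // groupByKey is logical: it only declares the grouping key.
        .groupByKey((MapFunction<Reading, String>) Reading::getSensorId, Encoders.STRING())
        // Per-key state carries the previously seen value across micro-batches.
        .mapGroupsWithState(
            (MapGroupsWithStateFunction<String, Reading, Double, String>)
                (key, values, state) -> {
                  double previous = state.exists() ? state.get() : Double.NaN;
                  double current = previous;
                  while (values.hasNext()) {
                    current = values.next().getValue(); // last value in this batch
                  }
                  state.update(current);                // becomes "previous" next time
                  return key + ": previous=" + previous + " current=" + current;
                },
            Encoders.DOUBLE(),
            Encoders.STRING());
  }
}

A query built this way would typically run with the update output mode;
flatMapGroupsWithState gives more control if you need to emit several rows per key.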
This is a fine thing to VOTE on. Committers (and community,
non-binding) can VOTE on what we like; we just don't do it often where
not required because it's a) overkill overhead over simple lazy
consensus, and b) it can be hard to say what the binding VOTE binds if
it's not a discrete commit or rel
I’m sure we, as a community, will seriously consider any proposal that
Spark would benefit if the PMC delays release X to include changes A, B, C.
Indeed, every release I remember has had a few iterations of “can we hold
the train for a bit because it would be super great to get this PR in”.
Many
I agree that adding new features in a major release is not forbidden, but
that is just not the primary goal of a major release. If we reach the point
where we are happy with the new public API before some new features are in
a satisfactory state to be merged, then I don't want there to be a prior
p
Mark, I disagree. Setting common goals is a critical part of getting things
done.
This doesn't commit the community to push out the release if the goals
aren't met, but does mean that we will, as a community, seriously consider
it. This is also an acknowledgement that this is the most important fe
This should be fine. Dataset.groupByKey is a logical operation, not a
physical one (as in Spark wouldn’t always materialize all the groups in
memory).
On Thu, Feb 28, 2019 at 1:46 AM Etienne Chauchot
wrote:
> Hi all,
>
> I'm migrating RDD pipelines to Dataset and I saw that Combine.PerKey is no
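A minimal, hypothetical illustration of that point (not code from this thread):
because groupByKey only declares the grouping, following it with reduceGroups lets
Spark fold the values for each key pairwise instead of collecting whole groups.

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.api.java.function.ReduceFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import scala.Tuple2;

public class GroupByKeySketch {
  // Count occurrences per word; reduceGroups combines the per-key counts pairwise,
  // so the full group never has to be materialized in memory at once.
  static Dataset<Tuple2<String, Long>> countWords(Dataset<String> words) {
    return words
        .groupByKey((MapFunction<String, String>) w -> w, Encoders.STRING())
        .mapValues((MapFunction<String, Long>) w -> 1L, Encoders.LONG())
        .reduceGroups((ReduceFunction<Long>) Long::sum);
  }
}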
Mark, if this goal is adopted, "we" is the Apache Spark community.
On Thu, Feb 28, 2019 at 9:52 AM Mark Hamstra
wrote:
> Who is "we" in these statements, such as "we should consider a functional
> DSv2 implementation a blocker for Spark 3.0"? If it means those
> contributing to the DSv2 effort w
Then I'm -1. Setting new features as blockers of major releases is not
proper project management, IMO.
On Thu, Feb 28, 2019 at 10:06 AM Ryan Blue wrote:
> Mark, if this goal is adopted, "we" is the Apache Spark community.
>
> On Thu, Feb 28, 2019 at 9:52 AM Mark Hamstra
> wrote:
>
>> Who is "we
Who is "we" in these statements, such as "we should consider a functional
DSv2 implementation a blocker for Spark 3.0"? If it means those
contributing to the DSv2 effort want to set their own goals, milestones,
etc., then that is fine with me. If you mean that the Apache Spark project
should offici
+1 (non-binding)
From: Jamison Bennett
Date: Thursday, February 28, 2019 at 8:28 AM
To: Ryan Blue , Spark Dev List
Subject: Re: [VOTE] SPIP: Spark API for Table Metadata
+1 (non-binding)
Jamison Bennett
Cloudera Software Engineer
jamison.benn...@cloudera.com
515 Congress Ave, Suite
+1 (non-binding)
Are identifiers and namespaces going to be rolled under one of those six points?
From: Ryan Blue
Reply-To: "rb...@netflix.com"
Date: Thursday, February 28, 2019 at 8:39 AM
To: Spark Dev List
Subject: [VOTE] Functional DataSourceV2 in Spark 3.0
I’d like to call a vote
Thanks for the discussion, everyone. Since there aren't many objections to
the scope and we are aligned on what this commitment would mean, I've
started a vote thread for it.
rb
On Wed, Feb 27, 2019 at 5:32 PM Wenchen Fan wrote:
> I'm good with the list from Ryan, thanks!
>
> On Thu, Feb 28, 20
I’d like to call a vote for committing to getting DataSourceV2 in a
functional state for Spark 3.0.
For more context, please see the discussion thread, but here is a quick
summary about what this commitment means:
- We think that a “functional DSv2” is an achievable goal for the Spark
3.0 r
+1 (non-binding)
Jamison Bennett
Cloudera Software Engineer
jamison.benn...@cloudera.com
515 Congress Ave, Suite 1212 | Austin, TX | 78701
On Thu, Feb 28, 2019 at 10:20 AM Ryan Blue
wrote:
> +1 (non-binding)
>
> On Wed, Feb 27, 2019 at 8:34 PM Russell Spitzer
> wrote:
>
>> +1 (non-
+1 (non-binding)
On Wed, Feb 27, 2019 at 8:34 PM Russell Spitzer
wrote:
> +1 (non-binding)
>
> On Wed, Feb 27, 2019, 6:28 PM Ryan Blue wrote:
>
>> Hi everyone,
>>
>> In the last DSv2 sync, the consensus was that the table metadata SPIP was
>> ready to bring up for a vote. Now that the multi-cat
Hi all,
I'm migrating RDD pipelines to Dataset and I saw that Combine.PerKey is no longer
there in the Dataset API, so I translated it to:
KeyValueGroupedDataset> groupedDataset =
keyedDataset.groupByKey(KVHelpers.extractKey(),
EncoderHelpers.genericEncoder());
Dataset> combinedDataset =
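A hedged sketch of the general pattern only (the String/Long types and the summing
combiner below are placeholders, not the actual pipeline): a Combine.PerKey can
usually be expressed as groupByKey plus a typed Aggregator passed to agg, which
combines values per key incrementally rather than materializing each group.

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.TypedColumn;
import org.apache.spark.sql.expressions.Aggregator;
import scala.Tuple2;

public class CombinePerKeySketch {

  // Placeholder combine function: sum the Long values for each String key.
  static class SumCombiner extends Aggregator<Tuple2<String, Long>, Long, Long> {
    @Override public Long zero() { return 0L; }
    @Override public Long reduce(Long acc, Tuple2<String, Long> kv) { return acc + kv._2(); }
    @Override public Long merge(Long a, Long b) { return a + b; }
    @Override public Long finish(Long acc) { return acc; }
    @Override public Encoder<Long> bufferEncoder() { return Encoders.LONG(); }
    @Override public Encoder<Long> outputEncoder() { return Encoders.LONG(); }
  }

  // Group by the key and apply the combiner as a typed aggregation.
  static Dataset<Tuple2<String, Long>> combinePerKey(Dataset<Tuple2<String, Long>> keyed) {
    TypedColumn<Tuple2<String, Long>, Long> combined = new SumCombiner().toColumn();
    return keyed
        .groupByKey((MapFunction<Tuple2<String, Long>, String>) kv -> kv._1(), Encoders.STRING())
        .agg(combined); // per-key aggregation; groups are not materialized
  }
}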