To add to this, we can add a stable interface anytime if the original one was
marked as unstable; we wouldn’t have to wait until 4.0. For example, we had a
lot of APIs that were experimental in 2.0 and were then stabilized in later
2.x releases.
Matei
> On Feb 26, 2019, at 5:12 PM, Reynold Xin
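For context, Spark marks this distinction with source-level annotations such
as @Experimental. A minimal sketch of the lifecycle Matei describes, using a
hypothetical API class (the name is invented for illustration):

    import org.apache.spark.annotation.Experimental

    // Hypothetical API: shipped in an x.0 release marked experimental,
    // so its contract may change in any minor release.
    @Experimental
    class ColumnarReader {
      def read(): Unit = ()
    }

    // Stabilizing it later is just a matter of removing the experimental
    // marker in a subsequent x.y release; no major-version bump is needed.

Nothing here is specific to DSv2; it only illustrates why an
unstable-to-stable promotion does not need to wait for 4.0.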
We will have to fix that before we declare DSv2 stable, because
InternalRow is not a stable API. We don’t necessarily need to do it in 3.0.
On Tue, Feb 26, 2019 at 5:10 PM Matt Cheah wrote:
> Will that then require an API break down the line? Do we save that for
> Spark 4?
>
>
>
> -Matt Cheah
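To illustrate why InternalRow is awkward as a public contract: a v2 source
hands rows back in Spark's internal representation, which means encoding
values with catalyst-internal types. A small sketch (the column values and
schema are invented for illustration):

    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
    import org.apache.spark.unsafe.types.UTF8String

    // A source must produce values in Spark's internal encoding directly:
    // strings as UTF8String, timestamps as microseconds since the epoch, etc.
    val row: InternalRow = new GenericInternalRow(Array[Any](
      42L,                          // a bigint column
      UTF8String.fromString("abc")  // a string column, internally encoded
    ))

Because these classes live in catalyst and carry no compatibility guarantees,
any change to them can break external sources; that is why a stable Row
replacement keeps coming up as a prerequisite for calling DSv2 stable.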
Will that then require an API break down the line? Do we save that for Spark 4?
-Matt Cheah
From: Ryan Blue
Reply-To: "rb...@netflix.com"
Date: Tuesday, February 26, 2019 at 4:53 PM
To: Matt Cheah
Cc: Sean Owen , Wenchen Fan , Xiao Li
, Matei Zaharia , Spark Dev
List
Subject: Re: [DI
That's a good question.
While I'd love to have a solution for that, I don't think it is a good idea
to delay DSv2 until we have one. That is going to require a lot of internal
changes and I don't see how we could make the release date if we are
including an InternalRow replacement.
On Tue, Feb 26
Thanks for bumping this, Matt. I think we can have the discussion here to
clarify exactly what we’re committing to and then have a vote thread once
we’re agreed.
Getting back to the DSv2 discussion, I think we have a good handle on what
would be added:
- Plugin system for catalogs
- TableCa
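For those following along, here is a rough sketch of the catalog plugin shape
under discussion; the trait and method names below are illustrative of the
proposal, not the final API:

    import org.apache.spark.sql.types.StructType

    // Catalogs are loaded by name from configuration, roughly:
    //   spark.sql.catalog.<name> = <implementation class>
    trait CatalogPlugin {
      def initialize(name: String, options: Map[String, String]): Unit
      def name: String
    }

    // Table metadata operations exposed by a catalog implementation.
    trait TableCatalog extends CatalogPlugin {
      def loadTable(ident: Seq[String]): Table
      def createTable(ident: Seq[String], schema: StructType,
                      properties: Map[String, String]): Table
      def dropTable(ident: Seq[String]): Boolean
    }

    // Minimal table abstraction a catalog hands back.
    trait Table {
      def name: String
      def schema: StructType
    }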
Reynold made a note earlier about a proper Row API that isn’t InternalRow – is
that still on the table?
-Matt Cheah
From: Ryan Blue
Reply-To: "rb...@netflix.com"
Date: Tuesday, February 26, 2019 at 4:40 PM
To: Matt Cheah
Cc: Sean Owen , Wenchen Fan , Xiao Li
, Matei Zaharia , Spark Dev
What would then be the next steps we'd take to collectively decide on plans and
timelines moving forward? Might I suggest scheduling a conference call with
appropriate PMCs to put our ideas together? Maybe such a discussion can take
place at next week's meeting? Or do we need to have a separate
Yes, I agree that it's a valid concern and that it leads individual
contributors to give up on new ideas or major improvements.
On Tue, 26 Feb 2019 at 15:24, Jungtaek Lim wrote:
> Adding one more: it implicitly leads individual contributors to give up
> on challenging major things and just focus on
Mr. Torres, can you give these a pass, please?
On Tue, Feb 26, 2019 at 4:38 PM Jungtaek Lim wrote:
>
> Hi devs,
>
> sorry to bring this to the mailing list again, but you know, pinging in a
> GitHub PR just doesn't work.
>
> I have long-standing PRs (created last year) in the SS area which already got
> over 1
Adding one more: it implicitly leads individual contributors to give up on
challenging major things and just focus on minor things, which may even help
the project, but not in the long run. We don't have a roadmap put up on the
wall for the whole community to share the load together, so individual
contrib
Thanks, Sean, as always, for sharing your thoughts quickly!
I agree with most of the points, except "they add a lot of code and complexity
relative to benefit", since no one can weigh in on something before at least
taking a quick review. IMHO, if someone thinks so, it's better to speak up (I
know it's hard and being a
Those aren't bad changes, but they add a lot of code and complexity
relative to benefit. I think it's positive that you've gotten people
to spend time reviewing them, quite a lot. I don't know whether they
should be merged. This isn't a 'bug' though; not all changes should be
committed. Simple and
In case there are issues viewing the Google doc, I attached PDF files to the
JIRA.
On Tue, Feb 26, 2019 at 7:41 AM Xingbo Jiang wrote:
> Hi all,
>
> I want to send a revised SPIP on implementing Accelerator(GPU)-aware
> Scheduling. It improves Spark by making it aware of GPUs exposed by cluster
> mana
Hi devs,
sorry to bring this to the mailing list again, but you know, pinging in a
GitHub PR just doesn't work.
I have long-standing PRs (created last year) in the SS area which already got
over 100 comments (so the community and I have already put in a lot of effort)
but no progress from the point of view of being merged un
jenkins is churning through a lot of github updates, and i'm finally seeing
the backlog of pull request builds starting.
i'll keep an eye on things over the afternoon.
On Tue, Feb 26, 2019 at 12:26 PM shane knapp wrote:
> restarted jenkins, staring at logs. will report back when things look
I understand the reason for storing information along with data for
transactional commits, but it mostly makes sense if we store outputs along
with all necessary checkpoint information in a transactional manner. Spark
doesn't store the query checkpoint along with outputs.
I feel this is regarding
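Concretely, the checkpoint location and the output location are configured
independently, so they cannot be committed atomically together. A minimal
sketch, assuming a local session and the built-in rate source (the paths are
made up):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder
      .appName("checkpoint-demo")
      .master("local[*]")
      .getOrCreate()

    val query = spark.readStream
      .format("rate")   // built-in test source
      .load()
      .writeStream
      .format("parquet")
      .option("path", "/data/output")              // outputs land here...
      .option("checkpointLocation", "/chk/state")  // ...state lives elsewhere
      .start()

The two paths can even be on different filesystems, which is the gap being
pointed at here: output commits and checkpoint updates are separate writes,
not one transaction.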
restarted jenkins, staring at logs. will report back when things look good.
On Tue, Feb 26, 2019 at 12:22 PM shane knapp wrote:
> investigating, and this will most likely require a jenkins restart.
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs
yeah, i'm on it.
On Tue, Feb 26, 2019 at 11:39 AM Xiao Li wrote:
> Thanks for reporting it! It sounds like Shane is working on it. I manually
> triggered the test for the PR https://github.com/apache/spark/pull/23894
> .
>
> Cheers,
>
> Xiao
>
>
> Bruce Robbins wrote on Tue, Feb 26, 2019 at 11:33 AM:
>
>> Sor
investigating, and this will most likely require a jenkins restart.
--
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
Thanks for reporting it! It sounds like Shane is working on it. I manually
triggered the test for the PR https://github.com/apache/spark/pull/23894 .
Cheers,
Xiao
Bruce Robbins wrote on Tue, Feb 26, 2019 at 11:33 AM:
> Sorry for stating what is likely obvious, but PR tests don't appear to be
> running. Last
Sorry for stating what is likely obvious, but PR tests don't appear to be
running. The last one started around 2 AM.
Hi everyone,
With 12 +1 votes and no +0 or -1 votes, this SPIP passes. Thanks to
everyone who participated in the discussions and voted!
rb
On Thu, Feb 21, 2019 at 12:14 AM Xiao Li wrote:
> +1 This is in the right direction. The resolution rules and catalog APIs
> need more discussion when we
Hi all,
I want to send a revised SPIP on implementing Accelerator(GPU)-aware
Scheduling. It improves Spark by making it aware of GPUs exposed by cluster
managers, and hence Spark can match GPU resources with user task requests
properly. If you have scenarios that need to run workloads (DL/ML/Signal
Pr
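As a sketch of the mechanism the SPIP proposes (declare resources per
executor, declare per-task requirements, let the scheduler match them),
something along these lines; the configuration keys are illustrative of the
proposal, not final names:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // GPUs the cluster manager exposes on each executor
      .set("spark.executor.resource.gpu.amount", "4")
      // GPUs each task requests; the scheduler packs tasks accordingly
      .set("spark.task.resource.gpu.amount", "1")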
Thank you both for the reply. Chris and I have very similar use cases for
cogroup.
One of the goals for groupby apply + pandas UDF was to avoid things like
collect_list and reshaping data between Spark and Pandas. Cogroup feels
very similar and can be an extension to the groupby apply + pandas UDF
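For reference, the Scala Dataset API already exposes a cogroup over two
grouped datasets; the pandas proposal would mirror this shape, handing each
pair of co-grouped partitions to a pandas function. A minimal Scala
illustration (the case classes are invented):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.master("local[*]").getOrCreate()
    import spark.implicits._

    case class Click(user: Int, url: String)
    case class Purchase(user: Int, amount: Double)

    val clicks = Seq(Click(1, "/a"), Click(2, "/b")).toDS.groupByKey(_.user)
    val buys   = Seq(Purchase(1, 9.99)).toDS.groupByKey(_.user)

    // For each key, both sides' rows arrive together:
    // no collect_list or reshaping needed.
    val stats = clicks.cogroup(buys) { (user, cs, ps) =>
      Iterator((user, cs.size, ps.map(_.amount).sum))
    }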