I think this proposal is a good set of trade-offs, and the idea has existed in the
community for a long time. I especially appreciate how the design
is focused on a minimal useful component, with future optimizations
considered from the point of view of making sure it's flexible, but actual
concrete
+1, great idea.
On Fri, Feb 12, 2021 at 6:40 PM Yuming Wang wrote:
> +1.
>
> On Sat, Feb 13, 2021 at 10:38 AM Takeshi Yamamuro
> wrote:
>
>> +1, too. Thanks, Dongjoon!
>>
>> On 2021/02/13 at 11:07, Xiao Li wrote:
>>
>>
>> +1
>>
>> Happy Lunar New Year!
>>
>> Xiao
>>
>> On Fri, Feb 12, 2021 at 5:33 PM
Git blame is a good way to figure out likely potential reviewers (eg who’s
been working in the area). Another is who filed the JIRA if it’s not you.
On Thu, Feb 18, 2021 at 6:58 AM Enrico Minack
wrote:
> Hi Spark Developers,
>
> I have a fundamental question on the process of contributing to Apa
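The git-blame approach above can be sketched as a small script that ranks candidate reviewers by how often they appear as authors for a path. The names and log output here are hypothetical; in practice you would feed it the output of `git log --format='%an' -- path/to/file`.

```python
from collections import Counter

def suggest_reviewers(author_names, top_n=3):
    """Rank candidate reviewers by how often they appear in the
    author history for a path (one name per line of git log output)."""
    counts = Counter(name.strip() for name in author_names if name.strip())
    return [name for name, _ in counts.most_common(top_n)]

# Hypothetical author history for a file:
log = ["Alice", "Bob", "Alice", "Carol", "Alice", "Bob"]
print(suggest_reviewers(log))  # → ['Alice', 'Bob', 'Carol']
```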
+1 (binding)
On Mon, Mar 8, 2021 at 3:56 PM Ryan Blue wrote:
> Hi everyone, I’d like to start a vote for the FunctionCatalog design
> proposal (SPIP).
>
> The proposal is to add a FunctionCatalog interface that can be used to
> load and list functions for Spark to call. There are interfaces for
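To illustrate the shape of the proposal, here is a toy Python model of a catalog that can load and list named functions. This is only a sketch of the idea; Spark's actual FunctionCatalog is a Java interface in the catalog plugin API, and the names below are illustrative.

```python
class FunctionCatalog:
    """Toy model of a catalog Spark could query to list and load
    user-defined functions by name."""

    def __init__(self):
        self._functions = {}

    def register(self, name, fn):
        self._functions[name] = fn

    def list_functions(self):
        # Spark would call something like this to enumerate functions.
        return sorted(self._functions)

    def load_function(self, name):
        # ...and something like this to resolve a function at analysis time.
        try:
            return self._functions[name]
        except KeyError:
            raise LookupError(f"undefined function: {name}")

catalog = FunctionCatalog()
catalog.register("strlen", lambda s: len(s))
print(catalog.list_functions())                  # ['strlen']
print(catalog.load_function("strlen")("spark"))  # 5
```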
I think having pandas support inside of Spark makes sense. One of my
questions is: who are the major contributors to this effort? Is the
community developing the pandas API layer for Spark interested in being
part of Spark, or do they prefer having their own release cycle?
On Sat, Mar 13, 2021 at 5
+1
On Sun, Mar 28, 2021 at 10:25 PM sarutak wrote:
> +1 (non-binding)
>
> - Kousuke
>
> > +1 (non-binding)
> >
> > On Sun, Mar 28, 2021 at 9:06 PM 郑瑞峰
> > wrote:
> >
> >> +1 (non-binding)
> >>
> >> -- Original message --
> >>
> >> From: "Maxim Gekk" ;
> >> Sent: Monday, March 29, 2021
Thanks Shane for keeping the build infrastructure running for all of
these years :)
I've got some Kubernetes infra on AS399306 down in HE in Fremont, though
it's also perhaps not of the newest variety; so far no disk
failures or anything like that (knock on wood of course). The catch is
it's on a
What about if we just turn off the PV tests for now?
I'd be happy to help with the debugging/upgrading.
On Thu, Apr 15, 2021 at 2:28 AM Rob Vesse wrote:
>
> There’s at least one test (the persistent volumes one) that relies on some
> Minikube functionality because we run integration tests for ou
I verified the virtualenv & pyspark installation on OS X & Linux works
as expected with the minimum version of Python.
I double-checked the Python tagged versions (mostly checking Python
3.8 wasn't listed, since Spark 2.x only goes up to 3.7). It might be
good to include that as a reminder in the re
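The minimum-version check mentioned above can be sketched as a small runtime guard; the `MIN_PYTHON` floor below is an assumption for illustration (the real floor lives in PySpark's setup metadata for each release line).

```python
import sys

MIN_PYTHON = (3, 6)  # hypothetical minimum; check setup.py for the real floor

def check_python(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter meets the minimum version,
    mirroring the kind of guard a package can apply at install/import time."""
    return tuple(version_info[:2]) >= minimum

print(check_python((3, 7, 0)))   # True: 3.7 meets a 3.6 floor
print(check_python((2, 7, 18)))  # False: Python 2 is below the floor
```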
Hi Folks,
I've deployed a new version of K3s locally and I ran into an issue
with the key format not being supported out of the box. We delegate to
fabric8 which has bouncy castle EC as an optional dependency. Adding
it would add ~6mb to the Kube jars. What do folks think?
Cheers,
Holden
P.S.
+1 - pip install with Py 2.7 works (with the understandable warnings
regarding Python 2.7 no longer being maintained).
On Mon, May 10, 2021 at 11:18 AM sarutak wrote:
>
> +1 (non-binding)
>
> - Kousuke
>
> > It looks like the repository is "open" - it doesn't publish until
> > "closed" after all
+1 and thanks for volunteering to be the RM :)
On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro
wrote:
> Thank you, Dongjoon~ sgtm, too.
>
> On Tue, May 18, 2021 at 7:34 AM Cheng Su wrote:
>
>> +1 for a new release, thanks Dongjoon!
>>
>> Cheng Su
>>
>> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"
Hi Folks,
I'm continuing my adventures to make Spark on containers party, and I
was wondering which of the different batch scheduler options folks have
experience with and prefer. I was thinking that, so we can
better support dynamic allocation, it might make sense for us to
support using different s
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable
I took an initial look at the PRs this morning and I’ll go through the
design doc in more detail but I think these features look great. It’s
especially important with the CA regulation changes to make this easier for
folks to implement.
On Thu, Jun 24, 2021 at 4:54 PM Anton Okolnychyi
wrote:
> H
2021 at 8:56 AM Holden Karau wrote:
> That's awesome, I'm just starting to get context around Volcano but maybe
> we can schedule an initial meeting for all of us interested in pursuing
> this to get on the same page.
>
> On Wed, Jun 23, 2021 at 6:54 PM Klaus Ma wrote:
I noticed that the worker decommissioning suite seems to be running
up against the memory limits, so I'm going to try and see if I can get our
memory usage down a bit as well while we wait for a GH response. In the
meantime, I'm assuming that if things pass Jenkins we are OK with merging, yes?
On Wed
Hi Folks,
Many other distributed computing (https://hub.docker.com/r/rayproject/ray
https://hub.docker.com/u/daskdev) and ASF projects (
https://hub.docker.com/u/apache) now publish their images to dockerhub.
We've already got the docker image tooling in place, I think we'd need to
ask the ASF to
ding data from and transferring data to Postgres / Greenplum with
>>> Spark SQL and DataFrames, 10~100x faster.*
>>> *itatchi <https://github.com/yaooqinn/spark-func-extras>* A library that
>>> brings useful functions from various modern database management syste
I don’t think we need a new repo for working on proposed Dockerfiles. You
can take a look at the existing Dockerfiles, file a JIRA, and make a fork,
then raise a PR (eg follow the usual development process).
On Sun, Aug 15, 2021 at 9:51 AM Mich Talebzadeh
wrote:
>
> Maybe this one?
>
>
> https:/
useful stuff for most
>> users/organisations. My suggestion is to create for a given type (spark,
>> spark-py etc):
>>
>>
>>1. One vanilla flavour for everyday use with few useful packages
>>2. One for medium use with most common packages for ETL/ELT
nds this seems like a lot (if I recall
>>>>>>> correctly it was around 400MB for existing images).
>>>>>>>
>>>>>>>
>>>>>>> On 8/17/21 2:24 PM, Mich Talebzadeh wrote:
>>>>>>>
>>>>>>
Hi Y'all,
This just recently came up, but I'm not super sure how we want to handle
this in general: if code was committed under the lazy consensus model and
then a committer or PMC member -1s it post-merge, what do we want to do?
I know we had some previous discussion around -1s, but that was largely
Hi Folks,
I'm wondering what people think about the idea of having the Spark UI
(optionally) act as a proxy to the executors? This could help with exec UI
access in some deployment environments.
Cheers,
Holden :)
--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performan
So I tried turning on the Spark exec UI proxy but it broke the Spark UI (in
3.1.2) and regardless of what URL I requested everything came back as
text/html of the jobs page. Is anyone actively using this feature in prod?
On Sun, Aug 22, 2021 at 5:58 PM Holden Karau wrote:
> Oh cool. I’ll h
Hi Folks,
I'm going through the Spark 3.2 tickets just to make sure we're not missing
anything important, and I was wondering what folks' thoughts are on adding
Spark 4 so we can target API-breaking changes to the next major version and
avoid losing track of the issue.
Cheers,
Holden :)
I think even if we do cancel this RC we should leave it open for a bit to
see if we can catch any other errors.
On Mon, Sep 27, 2021 at 12:29 PM Dongjoon Hyun
wrote:
> Unfortunately, it's the same for me recently. Not only that, but I also
> hit MetaspaceSize OOM, too.
> I ended up with MAVEN_OP
PySpark smoke tests pass, I'm going to do a last pass through the JIRAs
before my vote though.
On Wed, Sep 29, 2021 at 8:54 AM Sean Owen wrote:
> +1 looks good to me as before, now that a few recent issues are resolved.
>
>
> On Tue, Sep 28, 2021 at 10:45 AM Gengliang Wang wrote:
>
>> Please vo
+1
On Sun, Oct 10, 2021 at 10:46 PM Wenchen Fan wrote:
> +1
>
> On Sat, Oct 9, 2021 at 2:36 PM angers zhu wrote:
>
>> +1 (non-binding)
>>
>> On Sat, Oct 9, 2021 at 2:06 PM, Cheng Pan wrote:
>>
>>> +1 (non-binding)
>>>
>>> Integration test passed[1] with my project[2].
>>>
>>> [1]
>>> https://github.com/house
+1
On Fri, Oct 29, 2021 at 3:07 PM DB Tsai wrote:
> +1
>
> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1
>
>
> On Fri, Oct 29, 2021 at 11:42 AM Ryan Blue wrote:
>
>> +1
>>
>> On Fri, Oct 29, 2021 at 11:06 AM huaxin gao
>> wrote:
>>
>>> +1
>>>
>>> On Fri, Oct 29, 2021 at 10:59 AM
Sorry I've been busy, I'll try and take a look tomorrow, excited to see
this progress though :)
On Wed, Nov 10, 2021 at 9:01 PM Hyukjin Kwon wrote:
> Last reminder: I plan to merge this in a few more days. Any feedback and
> review would be very appreciated.
>
> On Tue, 9 Nov 2021 at 21:51, Hyuk
Thanks for putting this together, I’m really excited for us to add better
batch scheduling integrations.
On Tue, Nov 30, 2021 at 12:46 AM Yikun Jiang wrote:
> Hey everyone,
>
> I'd like to start a discussion on "Support Volcano/Alternative Schedulers
> Proposal".
>
> This SPIP is proposed to mak
Shane you kick ass thank you for everything you’ve done for us :) Keep on
rocking :)
On Mon, Dec 6, 2021 at 4:24 PM Hyukjin Kwon wrote:
> Thanks, Shane.
>
> On Tue, 7 Dec 2021 at 09:19, Dongjoon Hyun
> wrote:
>
>> I really want to thank you for all your help.
>> You've done so many things for t
My understanding is it only applies to log4j 2+ so we don’t need to do
anything.
On Sun, Dec 12, 2021 at 8:46 PM Pralabh Kumar
wrote:
> Hi developers, users
>
> Spark is built using log4j 1.2.17 . Is there a plan to upgrade based on
> recent CVE detected ?
>
>
> Regards
> Pralabh kumar
>
blematic.
>>>>>>>>
>>>>>>>> Definitely yes, we are on the same page.
>>>>>>>>
>>>>>>>> I think we have the same goal: propose a general and reasonable
>>>>>>>> mechanism to ma
+1 (binding)
On Wed, Jan 5, 2022 at 5:31 PM William Wang wrote:
> +1 (non-binding)
>
> On Thu, Jan 6, 2022 at 09:07, Yikun Jiang wrote:
>
>> Hi all,
>>
>> I’d like to start a vote for SPIP: "Support Customized Kubernetes
>> Schedulers Proposal"
>>
>> The SPIP is to support customized Kubernetes schedulers in
Personally I’d love to see us compiling and testing on Linux arm64 as well.
On Sat, Jan 8, 2022 at 7:49 PM Yikun Jiang wrote:
> BTW, this is not intended to be in potential opposition to Apache Spark
> Infra 2022 which dongjoon mentioned in "Apache Spark Jenkins Infra 2022".
> It is just to shar
On Fri, Jan 21, 2022 at 6:48 PM Sean Owen wrote:
> Continue on the ticket - I am not sure this is established. We would block
> a release for critical problems that are not regressions. This is not a
> data loss / 'deleting data' issue even if valid.
> You're welcome to provide feedback but votes
Please vote on releasing the following candidate as Apache Spark version
3.1.3.
The vote is open until Feb. 4th at 5 PM PST (1 AM UTC + 1 day) and passes
if a majority
+1 PMC votes are cast, with a minimum of 3 + 1 votes.
[ ] +1 Release this package as Apache Spark 3.1.3
[ ] -1 Do not release thi
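As a rough illustration of the tally rule in the vote template above (a majority of the PMC votes cast must be +1, with a minimum of three binding +1s), here is a small sketch; it is one reading of the template for illustration, not an official implementation of ASF voting rules.

```python
def vote_passes(pmc_plus_ones, pmc_minus_ones):
    """Illustrative check of a release vote: at least three binding +1s,
    and +1s form a majority of the binding votes cast."""
    total = pmc_plus_ones + pmc_minus_ones
    return pmc_plus_ones >= 3 and pmc_plus_ones > total / 2

print(vote_passes(4, 1))  # True: 4 of 5 binding votes are +1
print(vote_passes(2, 0))  # False: fewer than three binding +1s
```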
+1 (binding)
On Thu, Feb 3, 2022 at 2:26 PM Erik Krogen wrote:
> +1 (non-binding)
>
> Really looking forward to having this natively supported by Spark, so that
> we can get rid of our own hacks to tie in a custom view catalog
> implementation. I appreciate the care John has put into various par
>> December (Dec 6) when we were talking about release 3.2.1.
>>>>
>>>> Tom
>>>>
>>>> On Wed, Feb 2, 2022 at 2:07 AM Mridul Muralidharan
>>>> wrote:
>>>> >
>>>> > Hi Holden,
>>>> >
Yup, I’ve run into some weirdness with docs again I want to verify before I
send the vote email though.
On Mon, Feb 7, 2022 at 10:06 PM Wenchen Fan wrote:
> Shall we use the release scripts of branch 3.1 to release 3.1?
>
> On Fri, Feb 4, 2022 at 4:57 AM Holden Karau wrote:
>
Please vote on releasing the following candidate as Apache Spark version
3.1.3.
The vote is open until Feb. 18th at 1 PM pacific (9 PM GMT) and passes if a
majority
+1 PMC votes are cast, with a minimum of 3 + 1 votes.
[ ] +1 Release this package as Apache Spark 3.1.3
[ ] -1 Do not release this p
ith -Pyarn -Pmesos -Pkubernetes
>>
>> Regards,
>> Mridul
>>
>>
>> On Wed, Feb 16, 2022 at 8:32 AM Thomas graves wrote:
>>
>>> +1
>>>
>>> Tom
>>>
>>> On Mon, Feb 14, 2022 at 2:55 PM Holden Karau
>>> wrote:
The vote passes with no 0s or -1s and the following +1 votes:
Holden Karau
John Zhuge
Mridul Muralidharan
Thomas graves
Gengliang Wang
Wenchen Fan
Yuming Wang
Ruifeng Zheng
Sean Owen
I will begin finalizing the release now.
On Fri, Feb 18, 2022 at 2:49 PM Holden Karau wrote:
> +1 my self :)
>
We are happy to announce the availability of Spark 3.1.3!
Spark 3.1.3 is a maintenance release containing stability fixes. This
release is based on the branch-3.1 maintenance branch of Spark. We strongly
recommend that all 3.1 users upgrade to this stable release.
To download Spark 3.1.3, head over
12-8-jre-slim-buster latest
>>>>> 31ed15daa2bf 12 hours ago
>>>>> 531MB
>>>>>
>>>>> Then push it with (example)
>>>>>
>>>>> docker push apache/spark/tags/spark-3.1
>
> ps. Any plans to make this images official docker images at some point
> (for the extra security/validation) [1]
> [1] https://docs.docker.com/docker-hub/official_images/
>
> On Mon, Feb 21, 2022 at 10:09 PM Holden Karau
> wrote:
> >
> > We are happy to ann
CVEs are generally not mentioned in the release notes or JIRA; instead, we
track them at https://spark.apache.org/security.html once they are resolved
(prior to resolution, the reports go to secur...@spark.apache.org) to
allow the project time to fix the issue before public disclosure so there
i
On Mon, Mar 14, 2022 at 11:53 PM Xiao Li wrote:
> Could you please list which features we want to finish before the branch
> cut? How long will they take?
>
> Xiao
>
> On Mon, Mar 14, 2022 at 13:30, Chao Sun wrote:
>
>> Hi Max,
>>
>> As there are still some ongoing work for the above listed SPIPs, can we
>> st
May I suggest we push out one week (22nd) just to give everyone a bit of
breathing space? Rushed software development more often results in bugs.
On Tue, Mar 15, 2022 at 6:23 AM Yikun Jiang wrote:
> > To make our release time more predictable, let us collect the PRs and
> wait three more days be
68][SQL] Row-level Runtime Filtering
> > >> #34659 [SPARK-34863][SQL] Support complex types for Parquet
> vectorized reader
> > >> #35848 [SPARK-38548][SQL] New SQL function: try_sum
> > >>
> > >> Do you mean we should include them, or exclude them
Technically, releases don't follow vetoes (see
https://www.apache.org/foundation/voting.html ); it's up to the RM if they
get the minimum number of binding +1s (although they are encouraged to
cancel the release if any serious issues are raised).
That being said, I'll add my -1 based on the issues repo
> On Wed, May 11, 2022 at 4:23 AM Hyukjin Kwon wrote:
>
>> I expect to see RC2 too. I guess he just sticks to the standard, leaving
>> the vote open till the end.
>> It hasn't got enough +1s anyway :-).
>>
>> On Wed, 11 May 2022 at 10:17, Holden Karau wrote:
>
Oh that’s rad 😊
On Tue, May 17, 2022 at 7:47 AM bo yang wrote:
> Hi Spark Folks,
>
> I built a web reverse proxy to access Spark UI on Kubernetes (working
> together with https://github.com/GoogleCloudPlatform/spark-on-k8s-operator).
> Want to share here in case other people have similar need.
>
Could we make it do the same sort of history server fallback approach?
On Tue, May 17, 2022 at 10:41 PM bo yang wrote:
> It is like Web Application Proxy in YARN (
> https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html),
> to provide easy access for Spark U
+1
On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang wrote:
> +1 (non-binding)
>
> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun
> wrote:
>
>> +1
>>
>> Thanks,
>> Dongjoon.
>>
>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>> I repeated all checks I described
+1
On Thu, Jun 16, 2022 at 7:17 AM Thomas Graves wrote:
> +1 for the concept.
> Correct me if I'm wrong, but at a high level this is proposing adding
> a new user API (which is language agnostic) and the proposal is to
> start with something like the Logical Plan, with the addition of being
> ab
How about a hallway meet-up at the Data + AI Summit to talk about build/CI,
if folks are interested?
On Sun, Jun 19, 2022 at 7:50 PM Hyukjin Kwon wrote:
> Increased the priority to a blocker - I don't think we can release with
> these build failures and poor CI
>
> On Mon, 20 Jun 2022 at 10:39, Hyukjin
I’ve run Jupyter w/Spark on K8s, haven’t tried it with Dataproc personally.
The Spark K8s pod scheduler is now more pluggable for Yunikorn and Volcano
can be used with less effort.
On Mon, Sep 5, 2022 at 7:44 AM Mich Talebzadeh
wrote:
>
> Hi,
>
>
> Has anyone got experience of running Jupyter o
Do we want to start syndicating Apache Spark Twitter to a Mastodon
instance? It seems like a lot of software dev folks are moving over there,
and it would be good to reach our users where they are.
Any objections / concerns? Any thoughts on which server we should pick if
we do this?
?
>
> I believe most devs are still using Twitter.
>
>
> On Thu, Dec 1, 2022 at 01:35, Holden Karau wrote:
>
>> Do we want to start syndicating Apache Spark Twitter to a Mastodon
>> instance. It seems like a lot of software dev folks are moving over there
>> and it would
s are in Twitter)
> For federated features, I think Slack would be a better platform; a lot
> of Apache big data projects have Slack for federated features
>
> On Thu, Dec 1, 2022 at 02:33, Holden Karau wrote:
>
>> I agree that there is probably a majority still on twitter, but it would
Hi Folks,
It seems like we could maybe use some additional shared context around
Spark on Kube so I’d like to try and schedule a virtual coffee session.
Who all would be interested in virtual adventures around Spark on Kube
development?
No pressure if the idea of hanging out in a virtual chat wi
ys to use
> spark.
>
> Thanks!
> Andrew
>
> On Tue, Feb 7, 2023 at 5:24 PM Holden Karau wrote:
> >
> > Hi Folks,
> >
> > It seems like we could maybe use some additional shared context around
> Spark on Kube so I’d like to try and schedule a virtual
ll
make another doodle for the following week with more European-friendly
times.
Let me know what folks think :)
On Tue, Feb 7, 2023 at 3:23 PM Holden Karau wrote:
> Hi Folks,
>
> It seems like we could maybe use some additional shared context around
> Spark on Kube so I’d like to try an
ngh :
>>>>>>>>
>>>>>>>>> Greetings everyone!
>>>>>>>>> I am super new to this group and currently leading some work to
>>>>>>>>> deploy spark on k8 for my company o9 Solutions.
>>>>>
> On Wed, 8 Feb 2023 at 20:12, Holden Karau wrote:
>
>> My thought here was that it's more focused on getting to understand each
>> other's goals / priorities an
> On Fri, 10 Feb 2023 at 18:58, Holden Karau wrote:
>
>> Ok so the first iteration of this
I’d be in favor of a back porting with the idea its a bug fix for a
language (admittedly not a version we’ve supported before)
On Mon, Feb 13, 2023 at 9:19 AM L. C. Hsieh wrote:
> If it is not supported in Spark 3.3.x, it looks like an improvement at
> Spark 3.4.
> For such cases we usually do n
That’s legit, if the patch author isn’t comfortable with a backport then
let’s leave it be 👍
On Mon, Feb 13, 2023 at 9:59 AM Dongjoon Hyun
wrote:
> Hi, All.
>
> As the author of that `Improvement` patch, I strongly disagree with giving
> the wrong idea that Python 3.11 is officially supported i
Is there someone focused on streaming work these days who would want to
shepherd this?
On Sat, Feb 18, 2023 at 5:02 PM Dongjoon Hyun
wrote:
> Thank you for considering me, but may I ask what makes you think to put me
> there, Mich? I'm curious about your reason.
>
> > I have put dongjoon.hyun as
I am +1 to the general concept of including Ammonite magic 🪄.
On Wed, Mar 22, 2023 at 4:58 PM Herman van Hovell
wrote:
> Ammonite is maintained externally by Li Haoyi et al. We are including it
> as a 'provided' dependency. The integration bits and pieces (1 file) are
> included in Apache Spark.
+1
On Tue, Apr 4, 2023 at 11:04 AM L. C. Hsieh wrote:
> +1
>
> Sounds good and thanks Dongjoon for driving this.
>
> On 2023/04/04 17:24:54 Dongjoon Hyun wrote:
> > Hi, All.
> >
> > Since Apache Spark 3.2.0 passed RC7 vote on October 12, 2021, branch-3.2
> > has been maintained and served well u
I think there was some concern around how to make any sync channel show up
in logs / index / search results?
On Fri, Apr 7, 2023 at 9:41 AM Dongjoon Hyun
wrote:
> Thank you, All.
>
> I'm very satisfied with the focused and right questions for the real
> issues by removing irrelevant claims. :)
>
So I think if the Spark PMC wants to ask Databricks something that could be
reasonable (although I'm a little fuzzy as to the ask), but that
conversation might belong on private@ (I could be wrong of course).
On Tue, Jun 6, 2023 at 3:29 AM Mich Talebzadeh
wrote:
> I concur with you Sean.
>
> If
So JDK 11 is still supported in OpenJDK until 2026. I'm not sure we're
going to see enough folks moving to JRE 17 by the Spark 4 release; unless we
have a strong benefit from dropping 11 support, I'd be inclined to keep it.
On Tue, Jun 6, 2023 at 9:08 PM Dongjoon Hyun wrote:
> I'm also +1 on dr
-0
I'd like to see more of a doc around what we're planning on for a 4.0
before we pick a target release date etc. (feels like cart before the
horse).
But it's a weak preference.
On Mon, Jun 12, 2023 at 11:24 AM Xiao Li wrote:
> Thanks for starting the vote.
>
> I do have a concern about the t
A few folks and I have been working on a spark-upgrade project
(focused on getting folks onto current versions of Spark). Since it looks
like we're starting the discussion around Spark 4, I was thinking now could
be a good time for us to consider if we want to try and integrate
auto-upgrade rul
Yup, I think building consensus on what goes in 4.X is something we’ll need
to do.
On Mon, Jun 12, 2023 at 11:56 AM Dongjoon Hyun
wrote:
> Thank you for sharing those. I'm also interested in taking advantage of
> it. Also, I hope `spark-upgrade` can help us in line with Spark 4.0.
>
> However, we
ut if it only entails changing to
>>> Scala 2.13 and dropping support for JDK 8, then we could also just release
>>> a month after 3.5.
>>>
>>> How about we do this? We get 3.5 released, and afterwards we do a couple
>>> of meetings where we build this road
I’d like to start with a +1, better Python testing tools integrated into
the project make sense.
On Wed, Jun 21, 2023 at 8:11 AM Amanda Liu
wrote:
> Hi all,
>
> I'd like to start the vote for SPIP: PySpark Test Framework.
>
> The high-level summary for the SPIP is that it proposes an official te
Wed, Jun 21, 2023 at 8:30 AM Reynold Xin wrote:
> +1
>
> This is a great idea.
>
>
> On Wed, Jun 21, 2023 at 8:29 AM, Holden Karau
> wrote:
>
>> I’d like to start with a +1, better Python testing tools integrated into
>> the project make sense.
>>
>
+1
On Fri, Jul 7, 2023 at 9:55 AM huaxin gao wrote:
> +1
>
> On Fri, Jul 7, 2023 at 8:59 AM Mich Talebzadeh
> wrote:
>
>> +1 for me
>>
>> Mich Talebzadeh,
>> Solutions Architect/Engineering Lead
>> Palantir Technologies Limited
>> London
>> United Kingdom
>>
>>
>>view my Linkedin profile
>>
So I'm wondering if there is interest in revisiting some of how Spark is
doing its dynamic allocation for Spark 4+?
Some things that I've been thinking about:
- Advisory user input (e.g. a way to say after X is done I know I need Y,
where Y might be a bunch of GPU machines)
- Configurable toler
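The "advisory user input" idea above could look something like the following sketch. Every name here (`StageResourceHint`, `AllocationPlanner`) is hypothetical; no such API exists in Spark, and this is only a toy model of how advisory hints might feed an executor target.

```python
from dataclasses import dataclass, field

@dataclass
class StageResourceHint:
    """Hypothetical advisory hint: 'after stage X completes, I will
    need this many executors with this resource profile'."""
    after_stage: int
    executors: int
    profile: str = "default"

@dataclass
class AllocationPlanner:
    """Toy planner that scales the executor target using any hints
    unlocked by the stages completed so far."""
    hints: list = field(default_factory=list)

    def advise(self, hint):
        self.hints.append(hint)

    def target_for(self, completed_stage, baseline):
        unlocked = [h.executors for h in self.hints
                    if h.after_stage <= completed_stage]
        return max([baseline] + unlocked)

planner = AllocationPlanner()
planner.advise(StageResourceHint(after_stage=2, executors=8, profile="gpu"))
print(planner.target_for(completed_stage=1, baseline=4))  # 4: hint not unlocked yet
print(planner.target_for(completed_stage=2, baseline=4))  # 8: GPU stage coming up
```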
Oh great point
On Mon, Aug 7, 2023 at 2:23 PM bo yang wrote:
> Thanks Holden for bringing this up!
>
> Maybe another thing to think about is how to make dynamic allocation more
> friendly with Kubernetes and disaggregated shuffle storage?
>
>
>
> On Mon, Aug 7, 2023
Oooh, fascinating. I'm going on call this week so it will take me a while,
but I do want to review this :)
On Mon, Aug 7, 2023 at 5:30 PM Pavan Kotikalapudi
wrote:
> Hi Spark Dev,
>
> I have extended traditional DRA to work for structured streaming
> use-case.
>
> Here is an initial Implementation
Maybe add a link to the 4.0 JIRA where we are tracking the current plans
for 4.0?
On Tue, Aug 8, 2023 at 9:33 AM Dongjoon Hyun
wrote:
> Thank you, Matei.
>
> It looks good to me.
>
> Dongjoon
>
> On Mon, Aug 7, 2023 at 22:54 Matei Zaharia
> wrote:
>
>> It’s time to send our quarterly report to
2023 at 23:42, Mich Talebzadeh
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> From what I have seen, Spark on a serverless cluster has a hard time
>>>> getting the dr