I didn't have any external dependencies added, and just updated the Spark
version as mentioned below.
Can someone help me with this?
On Fri, 1 Sep 2023 at 5:58 PM, Koert Kuipers wrote:
> could the provided scope be the issue?
>
> On Sun, Aug 27, 2023 at 2:58 PM Dipayan Dev <dev.dipaya...@gmail.com> wrote:
>
>> Using the following dependency for Spark 3 in the POM file (my Scala version
>> is 2.12.14):
>>
>>   <dependency>
>>     <groupId>org.elasticsearch</groupId>
>>     <artifactId>elasticsearch-spark-30_2.12</artifactId>
>>     <version>7.12.0</version>
>>     <scope>provided</scope>
>>   </dependency>
>>
>> The code throws an error at this line:
>> df.write.format("es").mode("overwrite").options(elas
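For reference, a minimal sketch of the kind of write being discussed, assuming a
local Elasticsearch node; the option values and index name are illustrative
placeholders, not taken from this thread:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().appName("es-write-sketch").getOrCreate()
  val df = spark.range(10).toDF("id")   // any DataFrame will do for the sketch

  // Standard elasticsearch-hadoop connection settings (values are assumptions).
  val elasticOptions = Map(
    "es.nodes" -> "localhost",
    "es.port"  -> "9200"
  )

  df.write
    .format("es")
    .mode("overwrite")
    .options(elasticOptions)
    .save("my-index")   // hypothetical target index

Note that with the dependency in provided scope, the connector jar still has to
be on the runtime classpath (for example via --jars or --packages), which is
what Koert's question about the scope points at.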
Hi team,
I would like to know the behaviour of Append & Overwrite modes when the table
is not present, and whether automatic table creation is supported or
unsupported when df.write is used in Spark 3 and the underlying custom
datasource uses SupportsCatalogOptions.
As per my knowledge, in the current implementation in master, df.write in
Append and Overwrite mode tries to load the table and look for the schema
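To make the two write paths concrete, a small sketch; the source name, option
key and table identifiers are hypothetical, and whether the first path creates
a missing table is exactly the open question above:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().appName("dsv2-write-modes").getOrCreate()
  val df = spark.range(5).toDF("id")

  // V1-style writer: for a DSv2 source implementing SupportsCatalogOptions,
  // Append/Overwrite first resolve the target table through the catalog.
  df.write
    .format("com.example.CustomSource")   // hypothetical source
    .option("table", "db.events")         // hypothetical identifier option
    .mode("append")
    .save()

  // V2 writer: table creation is explicit, so the intent is unambiguous.
  df.writeTo("my_catalog.db.events").createOrReplace()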
Hi Senthil,
I have just run a couple of quick tests for TPCDS Q4, using the TPCDS schema
created at scale 1500 that I have on a Hadoop/YARN cluster, and was not able to
reproduce the difference in execution time between Spark 2 and Spark 3 that you
report in your mail.
This is the Spark config I used:
bin/spark-shell --master yarn --driver-memory 8g --executor-cor
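For anyone wanting to repeat the comparison, a rough sketch of such a timing run
from spark-shell (the database name and query-file path are assumptions, and
`spark` is the session predefined by the shell):

  // Point at a pre-built TPCDS database and time Q4 end to end.
  spark.sql("USE tpcds_1500")
  val q4 = scala.io.Source.fromFile("tpcds/queries/q4.sql").mkString  // hypothetical path

  // spark.time prints the wall-clock time of the enclosed action.
  spark.time {
    spark.sql(q4).collect()
  }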
@abhishek. We use spark 3.1*
On Mon, 20 Dec 2021, 09:50 Rao, Abhishek (Nokia - IN/Bangalore), <
abhishek@nokia.com> wrote:
> Hi Senthil,
>
> Which version of Spark 3 are we using? We had this kind of observation with
> Spark 3.0.2 and 3.1.x, but then we figured out that we had configured a big
> value for spark.network.timeout and this value was not taking effect in all
> releases prior to 3.0.2.
> This was fixed as part of https
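For reference, a sketch of where such a timeout is set; the value is only an
illustration of the kind of "big value" being described, not the one from this
thread:

  import org.apache.spark.sql.SparkSession

  // spark.network.timeout is the default timeout for all network interactions;
  // an unusually large value can hide executor/shuffle failures for a long time.
  val spark = SparkSession.builder()
    .appName("network-timeout-example")
    .config("spark.network.timeout", "800s")   // illustrative "big value"; default is 120s
    .getOrCreate()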
Hi All,
We are comparing Spark 2.4.5 and Spark 3 (without enabling Spark 3's
additional features) with TPCDS queries and found that Spark 3's
performance is reduced by at least 30-40% compared to Spark 2.4.5.
E.g.
Data size used: 1TB
Spark 2.4.5 finishes Q4 in 1.5 min, but Spark 3.* tak
At the moment this is really about discovering GPUs, so that the scheduler
can schedule tasks that need to allocate whole GPUs.
On Sat, Jul 17, 2021 at 5:14 PM ayan guha wrote:
> Hi
>
> As I was going through the Spark 3 config params, I noticed the following
> group of params. I could not understand what they are for. Can anyone
> please point me in the right direction?
>
> spark.driver.resource.{resourceName}.amount 0 Amount of a particular
> resource type to use on the driver. If this is
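For context, a sketch of how this family of settings is typically used for GPU
scheduling (the amounts and discovery-script path are assumptions, and the
script must be provided by the cluster or image):

  import org.apache.spark.TaskContext
  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("gpu-scheduling-example")
    // One GPU per executor and per task, discovered via a script that prints
    // the GPU addresses available on the node.
    .config("spark.executor.resource.gpu.amount", "1")
    .config("spark.task.resource.gpu.amount", "1")
    .config("spark.executor.resource.gpu.discoveryScript",
            "/opt/spark/scripts/getGpusResources.sh")
    .getOrCreate()

  // Inside a task, the addresses the scheduler assigned can be read back:
  spark.sparkContext.parallelize(1 to 2, 2).foreach { _ =>
    val gpus = TaskContext.get().resources()("gpu").addresses
    println(s"GPU addresses for this task: ${gpus.mkString(",")}")
  }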
Hi,
I'm the tech lead on RasterFrames, which adds geospatial raster data
capability to Apache Spark SQL. We are trying to migrate to Spark 3.x, and
are struggling with getting our various DataSources to work, and wondered
if someone might share some tips on what might be going on. Most of our i
Hello,
I'm using spark-3.0.0-bin-hadoop3.2 with custom hive metastore DB
(postgres). I'm setting the "autoCreateAll" flag to true, so hive is
creating its relational schema on first use. The problem is there is a
deadlock and the query hangs forever:
*Tx1* (*holds lock on TBLS relation*, wait_even
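For context, a sketch of this kind of setup (connection details are
placeholders, and the exact DataNucleus property name can vary with the bundled
Hive version):

  import org.apache.spark.sql.SparkSession

  // Custom Hive metastore backed by Postgres; DataNucleus is allowed to create
  // the metastore schema on first use ("autoCreateAll").
  val spark = SparkSession.builder()
    .appName("hive-metastore-example")
    .enableHiveSupport()
    .config("spark.hadoop.javax.jdo.option.ConnectionURL",
            "jdbc:postgresql://db-host:5432/metastore")   // placeholder URL
    .config("spark.hadoop.javax.jdo.option.ConnectionDriverName", "org.postgresql.Driver")
    .config("spark.hadoop.javax.jdo.option.ConnectionUserName", "hive")
    .config("spark.hadoop.javax.jdo.option.ConnectionPassword", "secret")
    .config("spark.hadoop.datanucleus.schema.autoCreateAll", "true")
    .getOrCreate()

Initializing the schema once with Hive's schematool, instead of relying on
autoCreateAll from concurrently starting sessions, is one way to avoid this
kind of contention.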
If I had to guess, it's likely because the Spark code would have to read
the YAML to make sure the required parameters are set, and the way it's
done was just easier to build on without a lot of refactoring.
On Mon, Jul 6, 2020 at 5:06 PM Michel Sumbul wrote:
> Thanks Edward for the reply!
Okay, I see what's going on here.
Looks like the way that Spark is coded, the driver container image
(specified by --conf spark.kubernetes.driver.container.image) and the
executor container image (specified by --conf
spark.kubernetes.executor.container.image) are required.
If they're not specified it'
Hi Edeesis,
The goal is to not have these settings in the spark-submit command. If I
specify the same things in a pod template for the executor, I still get the
message:
"Exception in thread "main" org.apache.spark.SparkException: Must specify
the driver container image"
it doesn't even try to star
If I could muster a guess, you still need to specify the executor image. As
is, this will only specify the driver image.
You can specify it as --conf spark.kubernetes.container.image or --conf
spark.kubernetes.executor.container.image
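For reference, the three image properties involved, shown here in a SparkConf
sketch for readability (on Kubernetes they are normally passed to spark-submit
as --conf flags; the image names are placeholders):

  import org.apache.spark.SparkConf

  // spark.kubernetes.container.image is the fallback used for both driver and
  // executor; the driver/executor-specific keys override it for that role.
  val conf = new SparkConf()
    .set("spark.kubernetes.container.image", "myrepo/spark:3.0.0")
    .set("spark.kubernetes.driver.container.image", "myrepo/spark-driver:3.0.0")
    .set("spark.kubernetes.executor.container.image", "myrepo/spark-executor:3.0.0")

These are resolved from the Spark conf rather than from the pod template, which
matches the behaviour described above.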
Hello,
Adding the dev mailing list: maybe there is someone here who can help by
showing a valid/accepted pod template for Spark 3?
Thanks in advance,
Michel
On Fri, 26 Jun 2020 at 14:03, Michel Sumbul wrote:
> Hi Jorge,
> If I set that in the spark-submit command it works but I w
For follow-up: while I've backported this in some internal releases, I'm not
considering it a candidate for backporting to Spark 3 anymore. I should have
updated the thread with that. The design doc is linked in the PR.
On Thu, Jun 18, 2020 at 6:05 PM Hyukjin Kwon wrote:
> Looks it h
currently we are using InternalRow).
On Sat, Feb 29, 2020 at 8:39 AM Mihir Sahu wrote:
Hi Team,
Wanted to know, ahead of developing a new datasource for Spark 3.x: shall
it be done using Datasource V2 or Datasource V1 (via Relation), or is there
any other plan?
When I tried to build a datasource using V2 for Spark 3.0, I could not
find the associated classes and they seem to be
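For orientation, in Spark 3.0 the DSv2 interfaces live under
org.apache.spark.sql.connector; a minimal skeleton of the entry points looks
roughly like this (names are placeholders and the read path is left
unimplemented):

  import java.util

  import org.apache.spark.sql.connector.catalog.{SupportsRead, Table, TableCapability, TableProvider}
  import org.apache.spark.sql.connector.expressions.Transform
  import org.apache.spark.sql.connector.read.ScanBuilder
  import org.apache.spark.sql.types.StructType
  import org.apache.spark.sql.util.CaseInsensitiveStringMap

  // Entry point Spark looks up for spark.read.format(...) / df.write.format(...).
  class MySourceProvider extends TableProvider {
    override def inferSchema(options: CaseInsensitiveStringMap): StructType =
      new StructType().add("id", "long")            // placeholder schema

    override def getTable(schema: StructType,
                          partitioning: Array[Transform],
                          properties: util.Map[String, String]): Table =
      new MyTable(schema)
  }

  class MyTable(tableSchema: StructType) extends Table with SupportsRead {
    override def name(): String = "my_table"
    override def schema(): StructType = tableSchema
    override def capabilities(): util.Set[TableCapability] =
      util.EnumSet.of(TableCapability.BATCH_READ)

    // The ScanBuilder/Scan/Batch/PartitionReader chain (which produces
    // InternalRow, as mentioned above) is omitted from this sketch.
    override def newScanBuilder(options: CaseInsensitiveStringMap): ScanBuilder = ???
  }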
I'd be willing to pull this in, unless others have concerns post branch-cut.
On Tue, Feb 4, 2020 at 2:51 PM Holden Karau wrote:
Hi Y’all,
I’ve got a K8s graceful decom PR (
https://github.com/apache/spark/pull/26440
) I’d love to try and get in for Spark 3, but I don’t want to push on it
if folks don’t think it’s worth it. I’ve been working on it since 2017 and
it was really close in November but then I had the crash and
Spark Graph for 3.0!
Best regards
Mats, Martin
Neo4j
On Tue, Sep 17, 2019 at 8:35 PM Matt Cheah wrote:
I don’t know if it will be feasible to merge all of SPARK-25299 into Spark 3.
There are a number of APIs that will be submitted for review, and I wouldn’t
want to block the release on negotiating these changes, as the decisions we
make for each API can be pretty involved.
Our original plan
SPARK-25299 - Use remote storage for persisting shuffle data
https://issues.apache.org/jira/browse/SPARK-25299
If that is far enough along to get onto the roadmap.
I don't expect to see a large DS V2 API change from now on. But we may
update the API a little bit if we find problems during the preview.
On Sat, Sep 14, 2019 at 10:16 PM Sean Owen wrote:
I don't think this suggests anything is finalized, including APIs. I
would not guess there will be major changes from here though.
On Fri, Sep 13, 2019 at 4:27 PM Andrew Melo wrote:
>
> Hi Spark Aficionados-
>
> On Fri, Sep 13, 2019 at 15:08 Ryan Blue wrote:
>>
>> +1 for a preview release.
2.0.0-preview.
https://archive.apache.org/dist/spark/spark-2.0.0-preview/
And, thank you, Xingbo!
Could you take a look at website generation? It seems to be broken on
`master`.
Bests,
Dongjoon.
On Fri, Sep 13, 2019 at 11:30 AM Xingbo Jiang wrote:
Hi all,
I would like to volunteer to be the release manager of Spark 3 preview,
thanks!
On Fri, Sep 13, 2019 at 11:21 AM, Sean Owen wrote:
Well, great to hear the unanimous support for a Spark 3 preview
release. Now, I don't know how to make releases myself :) I would
first open it up to our revered release managers: would anyone be
interested in trying to make one? sounds like it's not too soon to get
what's in
from the PoV of giving folks something to start testing against and
exploring, so they can raise issues with us earlier in the process and we
have more time to make calls around this.
For JDK11 clean-up, it will meet the timeline, and `3.0.0-preview` helps it a lot.
After this discussion, can we have some timeline for the `Spark 3.0 Release
Window` in our versioning-policy page?
+1 as both a contributor and a user.
From: John Zhuge
Date: Thursday, September 12, 2019 at 4:15 PM
To: Jungtaek Lim
Cc: Jean Georges Perrin , Hyukjin Kwon ,
Dongjoon Hyun , dev
Subject: Re: Thoughts on Spark 3 release, or a preview release
+1 Like the idea as a user and a DSv2 contributor.
On Thu, Sep 12, 2019 at 4:10 PM Jungtaek Lim wrote:
+1 (as a contributor) from me to have a preview release on Spark 3, as it
would help to test the feature. When to cut the preview release is
questionable, as major work is ideally to be done before that - if we
intend to introduce new features before the official release, that should
work regardless
Out of curiosity, are the new Spark Graph APIs targeting 3.0?
https://github.com/apache/spark/pull/24851
https://github.com/apache/spark/pull/24297
   michael
On Sep 11, 2019, at 1:37 PM, Sean Owen wrote:
I'm curious what current feelings are about ramping down towards a
Spark 3 release. It feels close to ready. There is no fixed date,
though in the past we had informally tossed around "back end of 2019".
For reference, Spark 1 was May 2014, Spark 2 was July 2016. I'd expect
Sp
I haven't touched Tungsten, but have proposed removing the deprecated old
memory manager and settings -- yes I think that's the primary argument for
it.
https://github.com/apache/spark/pull/23457
On Wed, Jan 9, 2019 at 6:06 PM Erik Erlandson wrote:
> Removing the user facing config seems like a
very few users need it. How much code does it remove though?
On Thu, Jan 03, 2019 at 2:55 PM, Sean Owen <sro...@apache.org> wrote:
Just wondering if there is a good reason to keep around the
pre-tungsten on-heap memory mode for Spark 3, and make
spark.memory.offHeap.enabled always true? It would simplify the code
somewhat, but I don't feel I'm so aware of the tradeoffs.
I know we didn't deprecate it, but it's been off by default for a long
time. It could be deprecated, too.
Same question for spark.memory.useLegacyMode and all its various
associated settings? Seems li
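For reference, the settings being discussed (values below are illustrative only):

  import org.apache.spark.sql.SparkSession

  // spark.memory.useLegacyMode switches back to the pre-tungsten (pre-1.6)
  // static memory manager; spark.memory.offHeap.* controls Tungsten's
  // off-heap allocation.
  val spark = SparkSession.builder()
    .appName("memory-mode-example")
    .config("spark.memory.useLegacyMode", "false")   // legacy manager off (the default)
    .config("spark.memory.offHeap.enabled", "true")
    .config("spark.memory.offHeap.size", "2g")       // required when off-heap is enabled
    .getOrCreate()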
I took a pass at removing most of the older deprecated items in Spark.
For discussion:
https://github.com/apache/spark/pull/22921
I'll +1 on removing the legacy mllib code. Many users are confused about the
APIs, and some of them have weird behaviors (for example, in gradient descent
the intercept is regularized, which it should not be).
DB Tsai | Siri Open Source Technologies [not a contribution] | Apple, Inc
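To make the two APIs concrete, roughly the same model in each (data arguments
are placeholders):

  import org.apache.spark.ml.classification.LogisticRegression             // DataFrame-based "ML" API
  import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS // RDD-based legacy "mllib" API
  import org.apache.spark.mllib.regression.LabeledPoint
  import org.apache.spark.rdd.RDD
  import org.apache.spark.sql.DataFrame

  // Legacy RDD-based API (the code being discussed here).
  def legacyFit(data: RDD[LabeledPoint]) =
    new LogisticRegressionWithLBFGS().setNumClasses(2).run(data)

  // DataFrame-based API, where new development has gone.
  def mlFit(data: DataFrame) =
    new LogisticRegression().setMaxIter(100).fit(data)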
My understanding was that the legacy mllib api was frozen, with all new dev
going to ML, but it was not going to be removed. Although removing it would
get rid of a lot of `OldXxx` shims.
On Wed, Oct 17, 2018 at 12:55 AM Marco Gaido wrote:
Hi all,
I think a very big topic on this would be: what do we want to do with the
old mllib API? For a long time I have been told that it was going to be
removed in 3.0. Is this still the plan?
Thanks,
Marco
On Wed, 17 Oct 2018 at 03:11, Marcelo Vanzin wrote:
Might be good to take a look at things marked "@DeveloperApi" and
whether they should stay that way.
e.g. I was looking at SparkHadoopUtil and I've always wanted to just
make it private to Spark. I don't see why apps would need any of those
methods.
On Tue, Oct 16, 2018 at 10:18 AM Sean Owen wrot
There was already agreement to delete deprecated things like Flume and
Kafka 0.8 support in master. I've got several more on my radar, and
wanted to highlight them and solicit general opinions on where we
should accept breaking changes.
For example, how about removing accumulator v1?
https://github
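For concreteness, the v1 API next to its AccumulatorV2 replacement (the names
and the toy job are illustrative):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().appName("accumulator-example").getOrCreate()
  val sc = spark.sparkContext

  // Accumulator v1 (deprecated since 2.0; the removal being proposed here):
  val oldAcc = sc.accumulator(0, "errors-v1")

  // AccumulatorV2 replacement:
  val newAcc = sc.longAccumulator("errors-v2")

  sc.parallelize(1 to 10).foreach { i =>
    if (i % 3 == 0) { oldAcc += 1; newAcc.add(1) }
  }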
i was expecting to be able to move to scala 2.12 sometime this year
if this cannot be done in spark 2.x then that could be a compelling reason
to move spark 3 up to 2018 i think
hadoop 3 sounds great but personally i have no use case for it yet
On Fri, Jan 19, 2018 at 12:31 PM, Sean Owen wrote
and see if we can do something with them. What's top of everyone else's mind?
On Jan 20, 2018 6:32 AM, "Sean Owen" wrote:
Forking this thread to muse about Spark 3. Like Spark 2, I assume it would
be more about making all those accumulated breaking changes and updating
lots of dependencies. Hadoop 3 looms large in that list as well as Scala
2.12.
Spark 1 was released in May 2014, and Spark 2 in July 2016. If Spark