The default value of spark.dynamicAllocation.shuffleTracking.enabled was
changed from false to true in Spark 3.4.0 [1]; disabling it might help.
[1] https://spark.apache.org/docs/latest/core-migration-guide.html#upgrading-from-core-33-to-34
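If you want to restore the pre-3.4.0 behavior, something like this should work
(untested sketch; the app name is arbitrary, and dynamic allocation is assumed
to be enabled already):

  import org.apache.spark.sql.SparkSession

  // Explicitly disable shuffle tracking to get the pre-3.4.0 default back.
  val spark = SparkSession.builder()
    .appName("shuffle-tracking-off")  // arbitrary
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.shuffleTracking.enabled", "false")
    .getOrCreate()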
Thanks,
Cheng Pan
> On Sep 6, 2024, at 00...
...Apache Celeborn [3], a Remote Shuffle Service for Spark.
Thanks,
Cheng Pan
[1]
https://spark.apache.org/docs/latest/running-on-kubernetes.html#local-storage
[2]
https://github.com/apache/spark/blob/v3.5.2/core/src/main/java/org/apache/spark/shuffle/api/ShuffleDriverComponents.java#L65-L72
[3] ...
org.apache.spark.shuffle.KubernetesLocalDiskShuffleDataIO does NOT support
reliable storage, so condition 4) is false even with this configuration.
I'm not sure why you think it does.
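For reference, the configuration under discussion looks like this (sketch only;
the plugin recovers shuffle files from reused local disks, but that still does
NOT count as reliable storage for condition 4):

  import org.apache.spark.sql.SparkSession

  // Plug the K8s local-disk shuffle recovery implementation into the
  // shuffle IO plugin point. This is NOT "reliable storage" in the sense
  // of ShuffleDriverComponents [2].
  val spark = SparkSession.builder()
    .config("spark.shuffle.sort.io.plugin.class",
      "org.apache.spark.shuffle.KubernetesLocalDiskShuffleDataIO")
    .getOrCreate()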
Thanks,
Cheng Pan
> On Aug 20, 2024, at 18:27, Aaron Grubb wrote:
>
Hi all,
The Apache Kyuubi community is pleased to announce that
Apache Kyuubi 1.9.1 has been released!
This release brings support for Apache Spark 4.0.0-preview1.
Apache Kyuubi is a distributed and multi-tenant gateway to provide
serverless SQL on data warehouses and lakehouses.
Kyuubi provides a pure SQL gateway through Thrift JDBC/ODBC interface...
-samples/emr-remote-shuffle-service
[4] https://github.com/apache/celeborn/issues/2140
Thanks,
Cheng Pan
> On Apr 6, 2024, at 21:41, Mich Talebzadeh wrote:
>
> I have seen some older references for shuffle service for k8s,
> although it is not clear they are talking about a generic shuff
-innovation-and-long-term-support-lts-versions/
[3] https://github.com/apache/spark/pull/45581
[4] https://aws.amazon.com/rds/mysql/
[5] https://learn.microsoft.com/en-us/azure/mysql/concepts-version-policy
Thanks,
Cheng Pan
Okay, let me double-check it carefully.
Thank you very much for your help!
From: Jungtaek Lim
Sent: March 5, 2024, 21:56:41
To: Pan,Bingkun
Cc: Dongjoon Hyun; dev; user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released
Yeah the approach seems OK to me - please double...
Sent: ...:07
To: Pan,Bingkun
Cc: Dongjoon Hyun; dev; user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released
Let me be more specific.
We have two active release version lines, 3.4.x and 3.5.x. We just released
Spark 3.5.1, so the dropdown shows 3.5.1 and 3.4.2, given that the latest
release of the 3.4.x line is 3.4.2...
...at the time of each new documentation release.
Of course, if we need to keep the latest in every document, I think that is
also possible, but only by sharing the same version.json file across all
versions.
From: Jungtaek Lim
Sent: March 5, 2024, 16:47:30
To: Pan,Bingkun
Cc: Dongjoon
As I understand it, the original intention of this feature is that when a
user has entered the PySpark documentation and finds that the version they are
currently viewing is not the one they want, they can easily jump to the desired
version by clicking the drop-down box. Additionally, in t...
We would like to thank all contributors of the Kyuubi community
who made this release possible!
Thanks,
Cheng Pan, on behalf of Apache Kyuubi community
Spark has supported the window-based executor failure-tracking mechanism on
YARN for a long time; SPARK-41210 [1][2] (included in 3.5.0) extended this
feature to K8s.
[1] https://issues.apache.org/jira/browse/SPARK-41210
[2] https://github.com/apache/spark/pull/38732
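A sketch of how it is configured, assuming the config names introduced by the
PR above (the values here are illustrative only):

  import org.apache.spark.sql.SparkSession

  // Fail the application once 10 executor failures accumulate inside a
  // sliding 5-minute window; older failures age out of the window.
  val spark = SparkSession.builder()
    .config("spark.executor.maxNumFailures", "10")
    .config("spark.executor.failuresValidityInterval", "5m")
    .getOrCreate()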
Thanks,
Cheng Pan
Hi all,
The Apache Kyuubi community is pleased to announce that
Apache Kyuubi 1.8.0 has been released!
Apache Kyuubi is a distributed and multi-tenant gateway to provide
serverless SQL on data warehouses and lakehouses.
Kyuubi provides a pure SQL gateway through Thrift JDBC/ODBC interface
for en...
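As an illustration, connecting to a Kyuubi gateway over the Thrift JDBC
interface can be as simple as the sketch below (host and credentials are
placeholders; 10009 is Kyuubi's default frontend port, and the Hive JDBC
driver is assumed to be on the classpath):

  import java.sql.DriverManager

  // Open a JDBC session against the Kyuubi gateway and run a trivial query.
  val conn = DriverManager.getConnection(
    "jdbc:hive2://kyuubi-host:10009/default", "anonymous", "")
  val rs = conn.createStatement().executeQuery("SELECT 42")
  while (rs.next()) println(rs.getInt(1))
  conn.close()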
: https://celeborn.apache.org/
Celeborn Resources:
- Issue Management: https://issues.apache.org/jira/projects/CELEBORN
- Mailing List: d...@celeborn.apache.org
Thanks,
Cheng Pan
On behalf of the Apache Celeborn (incubating) community
For the Guava case, you may be interested in
https://github.com/apache/spark/pull/42493
Thanks,
Cheng Pan
> On Aug 14, 2023, at 16:50, Sankavi Nagalingam wrote:
>
> Hi Team,
> We can see that there are many dependency vulnerabilities present in the
> latest spark-core:3.4...
[...] https://github.com/apache/kyuubi/tree/master/extensions/spark/kyuubi-spark-connector-hive
Thanks,
Cheng Pan
On Apr 18, 2023 at 00:38:23, Elliot West wrote:
> Hi Ankit,
>
> While not a part of Spark, there is a project called 'WaggleDance' that
> can federate multiple Hive metastores...
https://github.com/apache/spark/pull/38357
Thanks,
Cheng Pan
On Mar 14, 2023 at 16:36:45, 404 wrote:
> hi, all
>
> Spark runs on k8s and uses a filebeat daemonset to collect logs, writing
> them to Elasticsearch. The docker logs are in JSON format, and each line is
> a JSON string. How to m...
We would like to thank all contributors of the Kyuubi community who
made this release possible!
Thanks,
Cheng Pan, on behalf of Apache Kyuubi community
https://issues.apache.org/jira/browse/SPARK-38138
Thanks,
Cheng Pan
On Nov 2, 2022 at 00:14:34, Enrico Minack wrote:
> Hi Tanin,
>
> running your test with option "spark.sql.planChangeLog.level" set to
> "info" or "warn" (depending on your Spark log level) will sh...
There are some projects based on Spark DataSource V2 that I hope will help you.
https://github.com/datastax/spark-cassandra-connector
https://github.com/housepower/spark-clickhouse-connector
https://github.com/oracle/spark-oracle
https://github.com/pingcap/tispark
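For example, reading through the Cassandra connector follows the usual
DataSource V2 pattern (sketch; assumes an active SparkSession named spark,
the connector jar on the classpath, and placeholder keyspace/table names):

  // DataSource V2 read via the Cassandra connector's format name.
  val df = spark.read
    .format("org.apache.spark.sql.cassandra")
    .options(Map("keyspace" -> "ks", "table" -> "users"))  // placeholders
    .load()
  df.show()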
Thanks,
Cheng Pan
On Wed, Apr
cs/latest/deployment/engine_share_level.html
[2] https://github.com/apache/incubator-kyuubi/discussions/925
Thanks,
Cheng Pan
Thanks, I'll check it out.
I have a use case where we want to use dbt as a data modelling tool.
Will it take dbt queries and create the resulting model?
I see it...
Hello Spark Community,
The Apache Kyuubi (Incubating) community is pleased to announce that
Apache Kyuubi (Incubating) 1.3.0-incubating has been released!
Apache Kyuubi (Incubating) is a distributed multi-tenant JDBC server for
large-scale data processing and analytics, built on top of Apache Spark...
hi there,
I'm a newbie to Spark, recently beginning my learning journey with
Spark 2.0.1. I hit an issue that is maybe totally simple. When trying to run
the SparkPi example in Scala with the following command, an exception was
thrown. Is this the right behavior, or is something wrong with my command?
# bin/spark-submit
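For reference, the same example can also be launched programmatically with the
SparkLauncher API (sketch; SPARK_HOME and the examples jar path are assumptions
for a stock 2.0.1 layout):

  import org.apache.spark.launcher.SparkLauncher

  // Programmatic equivalent of bin/spark-submit for the SparkPi example.
  val proc = new SparkLauncher()
    .setSparkHome("/opt/spark")  // assumed install location
    .setAppResource("/opt/spark/examples/jars/spark-examples_2.11-2.0.1.jar")  // assumed path
    .setMainClass("org.apache.spark.examples.SparkPi")
    .setMaster("local[2]")
    .addAppArgs("100")  // number of slices for the Pi estimate
    .launch()
  proc.waitFor()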
Hello,
I am trying to understand the cost of converting an RDD to a DataFrame and
back. Would converting back and forth very frequently hurt performance?
I do observe that some operations like join are implemented very differently
for RDDs (pair) and DataFrames, so I am trying to figure out the cost of...
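For concreteness, the round trip in question is sketched below (toy data;
assumes a SparkSession in scope as spark). Each hop re-encodes the data:
toDF converts JVM objects into Spark's internal row format, and .rdd decodes
them back, so doing this very frequently does carry a serialization cost.

  import spark.implicits._

  // RDD -> DataFrame: elements are encoded into Spark's internal rows.
  val rdd = spark.sparkContext.parallelize(Seq((1, "a"), (2, "b")))
  val df  = rdd.toDF("id", "value")
  // DataFrame -> RDD[Row]: internal rows are decoded back to JVM objects.
  val back = df.rdd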