Hi, All.
This is a kind of heads-up as part of the Apache Spark 4.0.0 preparation.
https://issues.apache.org/jira/browse/SPARK-44111
(Prepare Apache Spark 4.0.0)
It would be great if we could fix the long-standing `Spark Connect` test
flakiness together during the QA period (2025-02-01 ~) in orde
I would say the short answer is "mostly not," and the longer answer is that
the connect APIs explicitly do not cover many of what we would call "paved
paths," because we're more likely to have JAR conflicts with advanced users,
who are more likely to use some of the non-supported APIs.
For example
What about introducing isolated class loaders, similar to the approach used
by web servers? Perhaps OSGi bundles or something similar?
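For illustration only (not something from the original messages), here is a rough sketch of the isolated class loader idea in plain JVM terms, without OSGi. The class name, jar path, and plugin class below are hypothetical; the point is just that a child-first loader resolves a plugin's own dependency versions before falling back to Spark's, the way servlet containers isolate web apps.

```scala
import java.net.{URL, URLClassLoader}
import java.nio.file.Paths

// Child-first class loader: try the plugin's own jars before delegating to the
// parent (Spark's) class loader, so dependency versions do not collide.
class ChildFirstClassLoader(urls: Array[URL], parent: ClassLoader)
    extends URLClassLoader(urls, parent) {

  override def loadClass(name: String, resolve: Boolean): Class[_] = synchronized {
    val alreadyLoaded = findLoadedClass(name)
    if (alreadyLoaded != null) {
      alreadyLoaded
    } else {
      try {
        val c = findClass(name) // look in the plugin's jars first
        if (resolve) resolveClass(c)
        c
      } catch {
        case _: ClassNotFoundException =>
          super.loadClass(name, resolve) // fall back to the parent loader
      }
    }
  }
}

object ChildFirstClassLoader {
  // Hypothetical usage: load a plugin entry point from its own jar in isolation.
  def main(args: Array[String]): Unit = {
    val loader = new ChildFirstClassLoader(
      Array(Paths.get("/opt/plugins/my-plugin.jar").toUri.toURL),
      getClass.getClassLoader)
    val pluginClass = loader.loadClass("com.example.MyPlugin")
    println(s"Loaded ${pluginClass.getName} in isolation")
  }
}
```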
On Sat, Jan 18, 2025, 22:43, Holden Karau wrote:
> I would say the short answer is "mostly not" and the longer answer is that
> the connect APIs explicitly do not cover many of what we would call "paved
> paths."
We definitely need to move the "advanced" users to stable APIs, such as the
Spark Connect plugin APIs, if we want Spark to have a good future. The RDD API
was the wrong abstraction in my opinion - hopefully I can say that since I
worked on it. It was too tightly bound to Java types and to internals
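As a purely illustrative sketch (the file path and object name are made up, not from this thread), the same word count written against the RDD API and the DataFrame API shows the difference being argued here: the RDD version ships JVM closures and concrete types that have to line up with the server's classpath and internals, while the DataFrame version is a declarative plan, the shape that Spark Connect and other stable APIs can carry forward across versions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, split}

object StableApiSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("stable-api-sketch").getOrCreate()

    // RDD style: serialized closures and JVM tuple types, coupled to the
    // driver/executor classpath ("data.txt" is a placeholder path).
    val rddCounts = spark.sparkContext
      .textFile("data.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)
    println(rddCounts.take(3).mkString(", "))

    // DataFrame style: a logical plan built from column expressions, the kind
    // of program that can be sent over Spark Connect unchanged.
    spark.read.text("data.txt")
      .select(explode(split(col("value"), "\\s+")).as("word"))
      .groupBy("word")
      .count()
      .show(3)

    spark.stop()
  }
}
```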
Yup, it will definitely take a while, but I'd love to start tracking down the
things that prevent people from moving (the RDD API is one, but I'm worried
there are also other internal hooks), and also start encouraging library and
plugin developers to use more forward-compatible APIs. Hopefully we ca
That's what I'm hoping for - that going forward we can have more non-JVM
clients (Python, GoLang, Rust, etc.) and make it simpler for JVM-based
clients. I appreciate your call-out on 90%/10%, Holden - completely fair.
I guess I would just love to see more traction on this so that we can
minim
BTW, one of the many reasons Spark Connect was developed was to potentially
simplify this process around shading (i.e., not needing to do it). I'm
wondering if utilizing Spark Connect could be a solution here?
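For concreteness, a minimal sketch of what that could look like from a JVM application, assuming the spark-connect-client-jvm artifact is on the classpath and a Connect server is reachable at sc://localhost:15002 (both assumptions, not details from this thread). The application's dependencies never share a JVM with the Spark server, which is why shading could become unnecessary.

```scala
import org.apache.spark.sql.SparkSession

object ConnectClientSketch {
  def main(args: Array[String]): Unit = {
    // Connect to a remote Spark Connect endpoint instead of embedding Spark.
    val spark = SparkSession.builder()
      .remote("sc://localhost:15002")
      .getOrCreate()

    // Only the DataFrame/SQL surface is used; it travels to the server as a
    // logical plan rather than as JVM closures on a shared classpath.
    spark.range(0, 10)
      .selectExpr("id", "id * 2 AS doubled")
      .show()

    spark.close()
  }
}
```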
On Fri, Jan 17, 2025 at 12:27 Holden Karau wrote:
> +1 I think this is great. If
I think your view highlights the need for a shift towards more stable and
version-independent APIs. Spark Connect IMO is a key enabler of this shift,
allowing users and developers to build applications and libraries that are
more resilient to changes in Spark's internals as opposed to RDDs.
As I s
On 2025/01/18 22:35:59 Mich Talebzadeh wrote:
> I think your view highlights the need for a shift towards more stable and
> version-independent APIs. Spark Connect IMO is a key enabler of this shift,
> allowing users and developers to build applications and libraries that are
> more resilient to changes in Spark's internals as opposed to RDDs.