Do we have a Java Client for Spark Connect which is something like PySpark?
From: Mich Talebzadeh
Sent: 22 January 2025 15:05
To: Hyukjin Kwon
Cc: Martin Grund; Holden Karau; Dongjoon Hyun; dev
Subject: [EXTERNAL] Re: FYI: A Hallucination about Spark Connect Stability in
Spark 4
A broken CI is really an operational matter, and in this case it was quite
temporary. We should put that aside and move on, as 1) the product is sound
and 2) Spark Connect is strategic for the future of Spark.
HTH
Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
While it might be a bit too much to talk about its stability, it is true
that the CI dedicated for Spark Connect compat was broken there for a
couple of weeks, and the errors from the tests look confusing.
I agree that tests and builds could be one of the easiest measurements to
tell the state of a
I'm very confused about how we use stability in CI as a measure to discuss
the strategy of a particular feature, particularly because we call these
"hallucinations."
From real-world experience, I can say that we have thousands of clients
using Spark Connect across many different versions in our i
Thanks for the update and for looking into it.
Excuse the thumb typos
On Tue, 21 Jan 2025 at 4:09 PM, Hyukjin Kwon wrote:
> Just a quick note on that: the major reason is 1. OOM we should figure out
> and fix the CI environment. 2. structured streaming test failure that is
> still in development.
> I
I'm passionate about and have lots of experience fixing OOMs. Contact me if
you need some help.
On Wed, 22 Jan 2025, 1:10, Hyukjin Kwon wrote:
> Just a quick note on that: the major reason is 1. OOM we should figure out
> and fix the CI environment. 2. structured streaming test failure that i
Thank you, Hyukjin!
Dongjoon
On Tue, Jan 21, 2025 at 16:10 Hyukjin Kwon wrote:
> Just a quick note on that: the major reason is 1. OOM we should figure out
> and fix the CI environment. 2. structured streaming test failure that is
> still in development.
> I made an umbrella JIRA (https://issue
Just a quick note on that: the major reason is 1. OOM we should figure out
and fix the CI environment. 2. structured streaming test failure that is
still in development.
I made an umbrella JIRA (https://issues.apache.org/jira/browse/SPARK-50907),
and I will work there. Should be easier to look at w
Let me take a look. It shouldn't be a major issue.
On Wed, 22 Jan 2025 at 08:31, Mich Talebzadeh wrote:
> As discussed on a thread over the weekend, we agreed among us including
> Matei on a shift towards more stable and version-independent APIs.
> Spark Connect IMO is a key enabler of this shi
As discussed on a thread over the weekend, we agreed among us including
Matei on a shift towards more stable and version-independent APIs.
Spark Connect IMO is a key enabler of this shift, allowing users and
developers to build applications and libraries that are more resilient to
changes in Sp
To be clear, (1) is `PySpark 4.0 Client` + `Spark 4.0 Server`, which is more
severe.
And, your point matches with (2) exactly. Thank you for your reply, Holden.
Dongjoon.
On 2025/01/21 22:38:20 Holden Karau wrote:
> Interesting. So given one of the features of Spark connect should be
> simpler
Interesting. So given one of the features of Spark connect should be
simpler migrations we should (in my mind) only declare it stable once we’ve
gone through two releases where the previous client + its code can talk to
the new server.
Twitter: https://twitter.com/holdenkarau
Fight Health Insuranc
It seems that there is misinformation about the stability of Spark Connect
in Spark 4. I would like to reduce the gap in our dev mailing list.
Frequently, some people claim `Spark Connect` is stable because it uses
Protobuf. Yes, we standardize the interface layer. However, may I ask if it
implies