Not sure this counts as a -1, but from a cursory check of the code, I found that
the way the TLS connection is set up does not always work:
https://github.com/apache/spark-connect-swift/blob/v0.1.0-rc1/Sources/SparkConnect/DataFrame.swift#L276-L288
Shows that DataFrame operations explicitly set
+1 (non-binding)
G
On Mon, May 5, 2025 at 8:35 AM huaxin gao wrote:
> +1 Thanks Dongjoon.
>
> On Sun, May 4, 2025 at 5:21 PM Dongjoon Hyun wrote:
>
>> +1
>>
>> I checked the checksum and signatures, and tested with Apache Spark 4.0.0
>> RC4 on Swift 6.
un wrote:
>> >
>> >
>> >
>> > +1
>> >
>> > I checked the checksum and signatures, and tested with K8s v1.32.
>> >
>> > Dongjoon.
>> >
>> > On 2025/05/04 23:58:54 Zhou Jiang wrote:
>> >> +1 , thanks for driving this release!
>> >>
>> >> *Zhou JIANG*
>> >>
>> >>
>> >>
>> >> On Sun, May 4, 2025 at 16:58 Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
+1 (non-binding)
Kazu
> On May 4, 2025, at 11:31 PM, huaxin gao wrote:
>
> +1 Thanks Dongjoon.
>
> On Sun, May 4, 2025 at 5:21 PM Dongjoon Hyun <dongj...@apache.org> wrote:
>> +1
>>
>> I checked the checksum and signatures, and tested with Apache Spark 4.0.0
>> RC4 on Swift 6.1.
> Dongjoon.
> >
> > On 2025/05/04 23:58:54 Zhou Jiang wrote:
> >> +1 , thanks for driving this release!
> >>
> >> *Zhou JIANG*
> >>
> >>
> >>
> >> On Sun, May 4, 2025 at 16:58 Dongjoon Hyun
> wrote:
> >>
> >>
+1 Thanks Dongjoon.
On Sun, May 4, 2025 at 5:21 PM Dongjoon Hyun wrote:
> +1
>
> I checked the checksum and signatures, and tested with Apache Spark 4.0.0
> RC4 on Swift 6.1.
>
> This is the initial release (v0.1) with 105 patches to provide a tangible
> release to the users.
+1
On Sun, May 4, 2025 at 3:15 PM Dongjoon Hyun wrote:
>
> Please vote on releasing the following candidate as Apache Spark Connect
> Swift Client 0.1.0. This vote is open for the next 72 hours and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
+1
On Sun, May 4, 2025 at 4:58 PM Dongjoon Hyun wrote:
>
> Please vote on releasing the following candidate as Apache Spark K8s Operator
> 0.1.0. This vote is open for the next 72 hours and passes if a majority +1
> PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark K8s Operator 0.1.0
+1
I checked the checksum and signatures, and tested with Apache Spark 4.0.0 RC4
on Swift 6.1.
This is the initial release (v0.1) with 105 patches to provide a tangible
release to the users.
v0.2 is under planning in SPARK-51999.
Dongjoon.
On 2025/05/04 22:14:54 Dongjoon Hyun wrote
+1 , thanks for driving this release!
*Zhou JIANG*
On Sun, May 4, 2025 at 16:58 Dongjoon Hyun wrote:
> Please vote on releasing the following candidate as Apache Spark K8s
> Operator 0.1.0. This vote is open for the next 72 hours and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
Please vote on releasing the following candidate as Apache Spark K8s
Operator 0.1.0. This vote is open for the next 72 hours and passes if a
majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
[ ] +1 Release this package as Apache Spark K8s Operator 0.1.0
[ ] -1 Do not release this package because ...
Please vote on releasing the following candidate as Apache Spark Connect
Swift Client 0.1.0. This vote is open for the next 72 hours and passes if a
majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
[ ] +1 Release this package as Apache Spark Connect Swift Client 0.1.0
[ ] -1 Do not release this package because ...
Do the following options work for you?
./bin/spark-shell --conf spark.jars.ivy=${HOME}/.ivy2
./bin/spark-shell --conf spark.jars.ivy=/Users/yourname/.ivy2
I think the issue is that ~ is not interpreted by the shell and is just passed
through to the Ivy lib.
Thanks,
Cheng Pan
> On Apr 29, 2025,
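For reference, a minimal sketch of the workaround described above, resolving the home directory explicitly instead of relying on `~` (the builder-based config here is illustrative; the same value can be passed with --conf as shown above):

```scala
// Sketch: "~" is passed through to Ivy literally, so build the absolute path yourself.
import org.apache.spark.sql.SparkSession

val ivyDir = sys.props("user.home") + "/.ivy2"   // e.g. /Users/yourname/.ivy2
val spark = SparkSession.builder()
  .config("spark.jars.ivy", ivyDir)              // equivalent to --conf spark.jars.ivy=${HOME}/.ivy2
  .getOrCreate()
```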
Hi Jacek,
Thanks for the confirmation! Let's change the wording first, and open a
JIRA ticket for the relative path support.
Wenchen
On Tue, Apr 29, 2025 at 2:41 AM Jacek Laskowski wrote:
> Hi Wenchen,
>
> Looks like it didn't work in 3.5 either.
>
> ❯ ./bin/spark-s
Hi Wenchen,
Looks like it didn't work in 3.5 either.
❯ ./bin/spark-shell --version
25/04/28 20:37:48 WARN Utils: Your hostname, Jaceks-Mac-mini.local resolves
to a loopback address: 127.0.0.1; using 192.168.68.100 instead (on
interface en1)
25/04/28 20:37:48 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Hi Jacek,
Thanks for reporting the issue! Did you hit the same problem when you set
the `spark.jars.ivy` config with Spark 3.5? If this config never worked
with a relative path, we should change the wording in the migration guide.
Thanks,
Wenchen
On Sun, Apr 27, 2025 at 10:27 PM Jacek Laskowski
Hi,
I found in docs/core-migration-guide.md:
- Since Spark 4.0, Spark uses `~/.ivy2.5.2` as Ivy user directory by
default to isolate the existing systems from Apache Ivy's incompatibility.
To restore the legacy behavior, you can set `spark.jars.ivy` to `~/.ivy2`.
With that, I
One more small fix (on another topic) for the next RC:
https://github.com/apache/spark/pull/50685
Thanks!
Szehon
On Tue, Apr 22, 2025 at 10:07 AM Rozov, Vlad
wrote:
> Correct, to me it looks like a Spark bug
> https://issues.apache.org/jira/browse/SPARK-51821 that may be hard to
> tr
Correct, to me it looks like a Spark bug
https://issues.apache.org/jira/browse/SPARK-51821 that may be hard to trigger
and is reproducible using the test case provided in
https://github.com/apache/spark/pull/50594:
1. Spark UninterruptibleThread “task” is interrupted by “test” thread while
“task
Correct me if I'm wrong: this is a long-standing Spark bug that is very
hard to trigger, but the new Parquet version happens to hit the trigger
condition and exposes the bug. If this is the case, I'm +1 to fix the Spark
bug instead of downgrading the Parquet version.
Let's mov
I don't think PARQUET-2432 has any issue itself. It looks to have triggered
a deadlock case like https://github.com/apache/spark/pull/50594.
I'd suggest that we fix forward if possible.
Thanks,
Manu
On Mon, Apr 21, 2025 at 11:19 PM Rozov, Vlad
wrote:
> The deadlock is reprodu
The deadlock is reproducible without Parquet. Please see
https://github.com/apache/spark/pull/50594.
Thank you,
Vlad
On Apr 21, 2025, at 1:59 AM, Cheng Pan wrote:
The deadlock is introduced by PARQUET-2432 (1.14.0); if we decide to downgrade, the
latest workable version is Parquet 1.13.1.
The deadlock is introduced by PARQUET-2432 (1.14.0); if we decide to downgrade, the
latest workable version is Parquet 1.13.1.
Thanks,
Cheng Pan
> On Apr 21, 2025, at 16:53, Wenchen Fan wrote:
>
> +1 to downgrade to Parquet 1.15.0 for Spark 4.0. According to
> https://github.com/
+1 to downgrade to Parquet 1.15.0 for Spark 4.0. According to
https://github.com/apache/spark/pull/50583#issuecomment-2815243571 , the
Parquet CVE does not affect Spark.
On Mon, Apr 21, 2025 at 2:45 PM Hyukjin Kwon wrote:
> That's nice but we need to wait for them to release, and upgra
It seems this patch (https://github.com/apache/parquet-java/pull/3196) can
avoid the deadlock issue if using Parquet 1.15.1.
On Wed, Apr 16, 2025 at 5:39 PM Niranjan Jayakar
wrote:
> I found another bug introduced in 4.0 that breaks Spark connect client x
> server compatibility: https://gith
I found another bug introduced in 4.0 that breaks Spark Connect client-server
compatibility: https://github.com/apache/spark/pull/50604.
Once merged, this should be included in the next RC.
On Thu, Apr 10, 2025 at 5:21 PM Wenchen Fan wrote:
> Please vote on releasing the following candid
It may not be the Parquet introduced issue. It looks like a race condition
between Spark UninterruptibleThread and Hadoop/HDFS DFSOutputStream. I tried to
resolve the deadlock in https://github.com/apache/spark/pull/50594. Can you
give it a try? I will see if I can reproduce the deadlock in a
[... jstack thread dump truncated ...]
Found 1 deadlock.
On Mon, Apr 14, 2025 at 11:13 AM Hyukjin Kwon wrote:
> Made a fix at https://github.com/apache/spark/pull/50575 👍
>
> On Mon, 14 Apr 2025 at 11:42, Wenchen Fan wrote:
>
>> I'm testing the new spark-connect distribution
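As an aside on the jstack excerpt above that ends with "Found 1 deadlock.": the same monitor-deadlock check can be done programmatically with the standard JDK ThreadMXBean, which can help when trying to reproduce races like this in a test (a minimal sketch, not code from the PR):

```scala
// Sketch: detect monitor deadlocks the same way jstack does.
import java.lang.management.ManagementFactory

val threadMx = ManagementFactory.getThreadMXBean
Option(threadMx.findDeadlockedThreads()).foreach { ids =>   // returns null when no deadlock is found
  threadMx.getThreadInfo(ids, true, true).foreach { info =>
    println(s"Deadlocked thread: ${info.getThreadName} waiting on ${info.getLockName}")
  }
}
```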
Hi Yuming,
1.15.1 is the latest release of Apache Parquet for the 1.x line. Is it a
known issue the Parquet community is working on, or are you still
investigating it? If the issue is confirmed by the Parquet community, we
can probably roll back to the previous Parquet version for Spark 4.0
Made a fix at https://github.com/apache/spark/pull/50575 👍
On Mon, 14 Apr 2025 at 11:42, Wenchen Fan wrote:
> I'm testing the new spark-connect distribution and here is the result:
>
> 4 packages are tested: pip install pyspark, pip install pyspark_connect (I
> installed
I'm testing the new spark-connect distribution and here is the result:
4 packages are tested: pip install pyspark, pip install pyspark_connect (I
installed them with the RC4 pyspark tarballs), the classic tarball
(spark-4.0.0-bin-hadoop3.tgz), the connect tarball
(spark-4.0.0-bin-hadoop3-
performance needs to be checked. With YARN and External Spark Shuffle, the Spark shuffle is a lot more optimized, so we can experience slowness with Spark on k8s, especially if there is a pod restart. Have you checked Apache Uniffle / Celeborn for enabling Spark shuffle?
fyi .. i'm
Pls check if there are resource constraints on the pods/nodes especially if
they are shared.
MinIO connectivity performance needs to be checked.
With YARN and External Spark Shuffle, the Spark shuffle is a lot more
optimized, so we can experience slowness with Spark on k8s, especially if
there is
Hello Karan, I am using Spark open source in Kubernetes and the Spark MapR bundle in YARN. Launching the job takes the same 10 secs in both approaches. For shuffle I am using local in both YARN and Kubernetes.
Sent from my iPhone
On Apr 11, 2025, at 11:24 AM, karan alang wrote: Hi Prem, Which distribution of
Hi Prem,
Which distribution of Spark are you using?
How long does it take to launch the job?
wrt Spark Shuffle, what is the approach you are using - storing shuffle
data in MinIO or using host path?
regds,
Karan
On Fri, Apr 11, 2025 at 4:58 AM Prem Sahoo wrote:
> Hello Team,
> I
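On the host-path question in general terms: one common setup, sketched below with an illustrative volume name and path, is to keep shuffle and spill data on node-local disks rather than on object storage (the config keys are the standard Spark-on-Kubernetes volume settings; whether this applies to Prem's cluster is an assumption):

```scala
// Sketch: mount a node-local hostPath volume into executors and point spark.local.dir at it,
// so shuffle/spill files stay off MinIO. Volume name and paths are illustrative.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.local.dir", "/mnt/spark-local")
  .config("spark.kubernetes.executor.volumes.hostPath.spark-local-dir-1.options.path", "/mnt/spark-local")
  .config("spark.kubernetes.executor.volumes.hostPath.spark-local-dir-1.mount.path", "/mnt/spark-local")
  .getOrCreate()
```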
Hello Team,
I have a peculiar case of Spark slowness.
I am using MinIO as object storage, from which Spark reads & writes data. I
am using YARN as master and executing a Spark job which takes ~5 mins; the
same job, when run with Kubernetes as master, takes ~8 mins.
I checked the Spark DAG in
Please vote on releasing the following candidate as Apache Spark version
4.0.0.
The vote is open until April 15 (PST) and passes if a majority +1 PMC votes
are cast, with a minimum of 3 +1 votes.
[ ] +1 Release this package as Apache Spark 4.0.0
[ ] -1 Do not release this package because ...
To
this proposal now
... 😂
*"If you haven’t encountered this kind of ‘dependency hell’ while working
on geospatial projects with Spark, you may have been fortunate to deal with
relatively simple cases."*
Yes, that was the case for us. We loaded OpenStreetMap data from Spain,
calculated some Have
I've noticed that the check is enabled (`true`) in *scalastyle-config.xml*.
Given this configuration, how is it possible that some people have been
able to commit changes violating this rule? Moreover, how were these
changes even merged despite failing this validation? It seems like
Thank you,
Vlad
On Mar 26, 2025, at 3:18 PM, Hyukjin Kwon wrote:
That only fixes Maven. Both the SBT build and the Maven build should work in the same
or similar way. Let's make sure both work.
On Thu, Mar 27, 2025 at 3:18 AM Rozov, Vlad wrote:
Please see https://github.com/vrozov/spark/tree/spark-she
#Options_to_Tune
But you need that underlying hosting infra to be the same before making
comparisons about the layers above. Why not start by either replicating
your previous setup in k8s or running spark 3.5 standalone outside k8s and
comparing it to spark 3.2 in the same environment?
On Tue, 25 Mar 2025 at
Hi Wenchen,
Could you please wait for https://github.com/apache/spark/pull/50246 to be
merged before you cut the next RC?
Thanks,
Huaxin
On Mon, Mar 31, 2025 at 8:53 PM Wenchen Fan wrote:
> Hi all,
>
> Thanks for your feedback! Regarding
> https://github.com/apache/spark/pull/501
Hi all,
Thanks for your feedback! Regarding
https://github.com/apache/spark/pull/50187 , I don't think it's a 4.0
blocker as it's a CI issue for the examples module. Other than that, all
other issues have been resolved and I'll cut the next RC after
https://github.com/apache
believe it’s important to standardize common data types in Spark and
clearly define the boundaries between different layers in the Lakehouse
ecosystem.
While it makes sense for Apache Sedona to have its own Parquet data source
for geospatial types in the absence of a standard, the long-term goal
I'd be curious
about what those numbers are, though they only measure task/job commit, not
all the work (that's not quite true, but...)
You can get a log of all S3 IO performed for an entire Spark job across all
worker threads, via the S3 auditing,
https://hadoop.apache.org/docs/stable/
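Since task/job commit cost on S3-compatible stores comes up here, a rough sketch of switching Parquet writes to the S3A committers instead of the classic rename-based committer (this assumes the spark-hadoop-cloud and hadoop-aws jars are on the classpath and follows the generic cloud-integration wiring, not anything specific to this thread):

```scala
// Sketch: use an S3A committer ("magic") for Parquet output instead of rename-based commits.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.hadoop.fs.s3a.committer.name", "magic")
  .config("spark.sql.sources.commitProtocolClass",
    "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol")
  .config("spark.sql.parquet.output.committer.class",
    "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter")
  .getOrCreate()
```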
again for the expertise from Sedona side in these
efforts.
Thanks!
Szehon
Sent from my iPhone
> On Mar 29, 2025, at 11:42 PM, Jia Yu wrote:
>
> Hi Reynold and team,
>
> I’m glad to see that the Spark community is recognizing the importance
> of geospatial support. The Se
Hey Angel,
I am glad that you asked these questions. Please see my answers below.
*1. Domain types evolve quickly. - It has taken years for Parquet to
include these new types in its format... We could evolve alongside Parquet.
Unfortunately, Spark is not known for upgrading its dependencies
Hi Reynold and team,
I’m glad to see that the Spark community is recognizing the importance
of geospatial support. The Sedona community has long been a strong
advocate for Spark, and we’ve proudly supported large-scale geospatial
workloads on Spark for nearly a decade. We’re absolutely open to
* 1. Domain types evolve quickly.*
It has taken years for Parquet to include these new types in its format...
We could evolve alongside Parquet. Unfortunately, Spark is not known for
upgrading its dependencies quickly.
* 2. Geospatial in Java and Python is a dependency hell.*
How has
While I don’t think Spark should become a super specialized geospatial
processing engine, I don’t think it makes sense to focus *only* on reading
and writing from storage. Geospatial is a pretty common and fundamental
capability of analytics systems and virtually every mature and popular
analytics
Sedona community.
Since the primary motivation here is Parquet-level support, I suggest shifting
the focus of this discussion toward enabling geo support in Spark Parquet
DataSource rather than introducing core types.
** Why Spark Should Avoid Hardcoding Domain-Specific Types like geo types
minimal support in Spark, as a common platform, for these types.
To be more specific and explicit: The proposal scope is to add support for
reading/writing to Parquet, based on the new standard, as well as adding the
types as built-in types in Spark to complement the storage support. The few ST
Now that the types are in most common data sources in the ecosystem, I think Apache Spark as a common platform needs to have this type definition for inter-op; otherwise users of vanilla Spark cannot work with those data sources that store geospatial data. (IMO a similar rationale to adding timestamp
Hello Jia,
Wenchen summarized the intent very clearly. The scope of the proposal is
primarily the type system and storage, not processing. Let’s work together on
the technical details and make sure the work we propose to do in Spark works
best with Apache Sedona.
Best,
Menelaos
> On Mar
Hi Jia,
This is a good question. As the shepherd of this SPIP, I'd like to clarify
the motivation here: the focus of this project is more about the storage
part, not the processing. Apache Sedona is a great library for geo
processing, but without native geo type support in Spark, users can
>> /WKB
>> <https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry#Well-known_binary>
>> ?
>>
>> On Fri, Mar 28, 2025 at 20:50, Ángel Álvarez Pascua (<
>> angel.alvarez.pas...@gmail.com>) wrote:
>>
>>> +1 (non-binding)
+1 (non-binding)
On Fri, Mar 28, 2025 at 18:48, Menelaos Karavelas wrote:
> Dear Spark community,
>
> I would like to propose the addition of new geospatial data types
> (GEOMETRY and GEOGRAPHY) which represent geospatial values as recently
> added as new logical types
framework for processing large-scale geospatial data on Spark, Flink, and other
engines.
From what I understand, this proposal aims to add native geospatial types and
functionality directly into Spark. However, this seems to replicate much of
the work already done by the Sedona pro
Dear Spark community,
I would like to propose the addition of new geospatial data types (GEOMETRY and
GEOGRAPHY) which represent geospatial values as recently added as new logical
types in the Parquet specification.
The new types should improve Spark’s ability to read the new Parquet logical
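To make the interoperability argument in this thread concrete, here is a rough sketch of what users typically do today without a built-in GEOMETRY type: store geometries as WKB in a plain binary column and decode them with JTS in a UDF (assumes org.locationtech.jts is on the classpath; the file path and column names are made up):

```scala
// Sketch: geometry carried as WKB bytes in a BinaryType column, decoded per row with JTS.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}
import org.locationtech.jts.io.WKBReader

val spark = SparkSession.builder().getOrCreate()

val areaOfWkb = udf { (wkb: Array[Byte]) => new WKBReader().read(wkb).getArea }

spark.read.parquet("/data/places.parquet")          // hypothetical file
  .withColumn("area", areaOfWkb(col("geometry")))   // hypothetical WKB column
  .show()
```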
https://github.com/apache/spark/pull/50437
IMO, it would be better to keep 2 separate commits, one to undo the revert and one for the fix,
so the fix for Guava is properly documented.
Also, while testing, I see that if I exit the shell and start it again, it
fails.
Thank you,
Vlad
On Mar 27, 2025, at 2:33 PM
Hi,
I'm trying to build the project, but I'm encountering multiple errors due
to long lines. Is this expected? I built the project a few weeks ago and
don’t recall seeing these errors.
Is anyone else experiencing the same issue?
Thanks in advance.
Vlad, let's open a PR and discuss it there. We have many other committees
to review / help with as well.
On Fri, Mar 28, 2025 at 6:28 AM Rozov, Vlad
wrote:
> Hi Hyukjin,
>
> I open https://issues.apache.org/jira/browse/SPARK-51643 and
> https://issues.apache.org/jira/browse/SP
Hi Hyukjin,
I opened https://issues.apache.org/jira/browse/SPARK-51643 and
https://issues.apache.org/jira/browse/SPARK-51644. Please add more details to
the first JIRA. As far as I can see
https://github.com/vrozov/spark/tree/spark-shell should fix both JIRAs and if
not I’d like to understand
Back in the very early days of Spark (before it was even an Apache
Incubator project), Maven was clearly a more mature, capable and
stable tool suite for building, testing and publishing JVM code, even
Scala code, so some of the earliest commercial adopters of Spark
relied upon Maven. It made
Here's a bit of history and context:
The project was initially built using SBT (
https://github.com/apache/spark/commit/df29d0ea4c8b7137fdd1844219c7d489e3b0d9c9
).
Later, Maven support was added (
https://github.com/apache/spark/commit/811a32257b1b59b042a2871eede6ee39d9e8a137
)
to provi
A slightly off-topic but related question: It feels fragile to test with
SBT while publishing the release with Maven. How did we end up in this
situation? Moreover, since most Spark developers use SBT for their daily
work, it becomes even harder to catch issues with the Maven build.
On Thu, Mar
Nah, I wasn't clear.
Maven and SBT builds are synced for this special code path, e.g.,
https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0
.
If you build with Maven and SBT, the results are almost the same.
Now, the fix you landed in Maven (and indeed it was a
://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0):
```diff
-      <relocation>
-        <pattern>com.google.common</pattern>
-        <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
-        <includes>
-          <include>com.google.common.**</include>
-        </includes>
-      </relocation>
```
The companion part of this
It is not broken. The fix you applied would not be applied in SBT. For
example, the lines you changed (added in
https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0
):
```diff
-      <relocation>
-        <pattern>com.google.common</pattern>
-        <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
-        <includes>
-          <include>com.google.common.**</include>
-        </includes>
-      </relocation>
```
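For readers less familiar with the SBT side being discussed: the Maven `<relocation>` above needs a counterpart in the SBT build for the two to stay in sync. A rough sbt-assembly-style equivalent is sketched below (illustrative only; it is an assumption that the shaded prefix is org.sparkproject, and Spark's actual SparkBuild.scala may wire this up differently):

```scala
// build.sbt sketch: relocate Guava the way the Maven shade plugin relocation does.
// Requires the sbt-assembly plugin; the shaded prefix mirrors
// ${spark.shade.packageName}.connect.guava from the Maven side.
import sbtassembly.ShadeRule

assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("com.google.common.**" -> "org.sparkproject.connect.guava.@1").inAll
)
```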
+1 on the explanation that it is not happening only to Vlad but is always
happening as part of the normal process.
Vlad, if we are very strict about ASF voting policy, we have to have
three +1s without -1 to merge the code change. I don't think the major
projects in ASF follow it - instead, they (including
That only fixes Maven. Both the SBT build and the Maven build should work in the
same or similar way. Let's make sure both work.
On Thu, Mar 27, 2025 at 3:18 AM Rozov, Vlad
wrote:
> Please see https://github.com/vrozov/spark/tree/spark-shell. I tested
> only spark-shell —remote local aft
filed JIRA, please
provide the link, if not, please open one.
It took me 2 hours to fix the Spark shells, so if you open a JIRA instead of
spending time identifying the commit and reverting it, you will save time as
well. I'll post the fix once the JIRA is open and I validate that my understanding of
Please see https://github.com/vrozov/spark/tree/spark-shell. I tested only
spark-shell --remote local after building with Maven and SBT. It may not be a
complete fix and there is no PR. I'll look into the SBT build issue (assuming that
there is still one after the fix) once you file JIRA.
Thank you
Hello Team,
I was working with Spark 3.2 and Hadoop 2.7.6 and writing to MinIO object
storage . It was slower when compared to writing to MapR FS with the above
tech stack. Then moved on to a later upgraded version of Spark 3.5.2 and
Hadoop 4.3.1 which started writing to MinIO with V2
Reynold, I am not sure I follow your question. I'll open a PR with the fix once
the JIRA is open.
While I am new to the Spark community, I am not new to Apache projects and
open source. Committers are guardians of commits, and they keep not only the master
branch but the entire source code in shape
://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0
.
Should also test Spark shells, and describe how you tested it as well.
This is what I expect:
- Please show me if there is a simple fix. If that's the case, yes, I will
revert this out from the master branch. That works for me.
>> With the c
quickly? If yes, we can merge the fix. If there isn't, for major functionality
breaking change, we should just revert. That's fairly basic software
engineering practices.
On Tue, Mar 25, 2025 at 9:53 PM Hyukjin Kwon <gurwls...@apache.org> wrote:
With the change, the main ent
With the change, the main entry points, the Spark shells, don't work and
developers cannot debug and test. The snapshots become useless.
The tests passed because you did not fix SBT. It needs a larger change.
Such a change cannot be in the source. I can start a vote if you think this
is an issue.
This does not make any sense.
1. There are no broken tests introduced by
https://github.com/apache/spark/pull/49971
2. There is no JIRA filed for "the main entry point"
3. "The main entry point" not having any unit test suggests that it is
not the main entry point.
4. It is not
I am confused. The consensus was made pretty clear in
https://github.com/apache/spark/pull/50378, and CI passed. Now it has 9 +1s
from all different groups.
Why do we need to change the way? I don't think we should override the
community consensus because you think the approach is hacky.
On We
to be removed.
2. The question is when and how to remove them. My initial assumption was that
jars would be removed as part of 4.1.0 and backported to 3.5.x.
3. With the above assumption I voted -0 on 3.5.5 and opened
https://github.com/apache/spark/pull/50231, a WIP PR, with the plan to still vote
-0
Vlad,
We are conflicted because you immediately want the project to fix the
issue, while Dongjoon stated in the post that he does not want to block the
release just because of this. We delayed the release of Apache Spark 4.0.0
a lot already (it is going to be months now), and I do not want
ch, I disagree and commit. Note
that I still have an outstanding comment on your PR
https://github.com/apache/spark/pull/50378#discussion_r2012935532.
My PR does not cause the issue. I will keep it AS IS and fix the issue raised
in the thread. Let's not mix in other issues orthogonal to my PR.
For s