+1
On Tue, Mar 25, 2025 at 10:22 PM Ángel Álvarez Pascua <
angel.alvarez.pas...@gmail.com> wrote:
> I meant ... a data validation API would be great, but why in the DSv2?
> isn't data validation something more general? do we have to use DSv2 to
> have our data validated?
>
> On Wed, Mar 26, 2025,
Sorry Vlad - I disagree. Where is the simple fix? As a new contributor, you
should not come in guns blazing, blaming committers who are trying to
keep the master branch sane and clean.
On Tue, Mar 25, 2025 at 10:53 PM Rozov, Vlad
wrote:
> There is a simple fix. This is exactly what I outline
I started working on it. See https://github.com/apache/spark/pull/50213. Review
and comments on the PR will help a lot.
+1 for 4.1. It won’t be ready for 4.0 and will require extensive testing.
I have a few more local changes that fix some tests in sql/hive and should
publish another revision s
There is a simple fix. This is exactly what I outlined in the e-mail. Prior to
reverting the commit (on master), it was necessary to check whether an easy fix exists. The
PR that introduced the error was merged into master 3 weeks ago, so I still
don’t get why it was reverted overnight. It was also necessar
I agree, 4.0 is already in the RC stage and I think it's too late to do
such a big version bump for the Hive dependency.
We definitely need to do this upgrade and thanks for working on it!
On Mon, Mar 24, 2025 at 1:31 PM Ángel Álvarez Pascua <
angel.alvarez.pas...@gmail.com> wrote:
> That's grea
I meant ... a data validation API would be great, but why in the DSv2?
isn't data validation something more general? do we have to use DSv2 to
have our data validated?
On Wed, Mar 26, 2025, 6:15, Ángel Álvarez Pascua <
angel.alvarez.pas...@gmail.com> wrote:
> For me, data validation is one thi
For me, data validation is one thing, and exporting that data to an
external system is something entirely different. Should data validation be
coupled with the external system? I don't think so. But since I'm the only
one arguing against this proposal, does that mean I'm wrong?
On Wed, Mar 26, 2025
With the change, the main entry points, the Spark shells, don't work, and
developers cannot debug and test. The snapshots become useless.
The tests passed because you did not fix SBT. It needs a larger change.
Such a change cannot be in the source. I can start a vote if you think this
is an issue.
On
Is there a fix already available or a very simple fix a committer can
create quickly? If yes, we can merge the fix. If there isn't, for major
functionality breaking change, we should just revert. That's fairly basic
software engineering practices.
On Tue, Mar 25, 2025 at 9:53 PM Hyukjin Kwon wro
This does not make any sense.
1. There are no broken tests introduced by
https://github.com/apache/spark/pull/49971
2. There is no JIRA filed for “the main entry point”
3. The fact that “the main entry point” does not have any unit tests suggests
that it is not the main entry point.
4. It is not practica
I’m glad we’ve found a short-term solution to unblock 4.0, but I’m still
concerned about the long-term solution. It’s definitely better to fix these
tests to generate jar files on the fly rather than relying on pre-compiled
jars in the repo. However, these tests were added a long time ago, and the
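As an illustration of the "generate jars on the fly" idea (file names and contents hypothetical, not taken from Spark's test suite), a test fixture could build the jar it needs at runtime with the JDK's java.util.jar API instead of committing a binary to the repo:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class BuildTestJar {
    // Writes a small jar containing one resource entry; a test could call
    // this in its setup phase rather than relying on a pre-compiled jar
    // checked into the source tree.
    public static void main(String[] args) throws IOException {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes()
                .put(Attributes.Name.MANIFEST_VERSION, "1.0");
        File jarFile = new File("fixture.jar");
        try (JarOutputStream jar =
                 new JarOutputStream(new FileOutputStream(jarFile), manifest)) {
            jar.putNextEntry(new JarEntry("data/hello.txt"));
            jar.write("hello".getBytes(StandardCharsets.UTF_8));
            jar.closeEntry();
        }
        System.out.println(jarFile.exists());
    }
}
```

Generating fixtures this way keeps the source release free of compiled artifacts while leaving the tests runnable, which is the long-term direction the message above argues for.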
I am confused. The consensus is made pretty clearly in
https://github.com/apache/spark/pull/50378, CI passed. Now it has 9 +1s
from all different groups.
Why do we need to change the approach? I don't think we should override the
community consensus just because you think the approach is hacky.
On Wed, 26 M
I think that there is some miscommunication/misunderstanding, so I’d like to
clarify my view on the issue.
1. I don’t think there is a conflict. I think that overall almost all agree
that having jar files in the Apache source release does not comply with the
Apache release policy and they need
Vlad,
We are conflicted because you immediately want the project to fix the
issue, while Dongjoon stated in the post that he does not want to block the
release just because of this. We delayed the release of Apache Spark 4.0.0
a lot already (going on months now), and I do not want to see us
e
> Yes, it removes jars from the source release and satisfies the ASF
release policy (see item 3 in my e-mail). At the same time it makes source
release different from the Github including release tag and I don’t think
that in the long term this is the right approach.
For the long term, we should re
Rozov, this broke the main entry points of the release, the Spark shells. Even
on the master branch, you build Spark and cannot use the Spark shells.
Why don't you submit a PR that contains the proper fix? It is easier to
have one PR that has no issues, e.g., reverting, backporting, etc.
On Wed, 26 Mar 2025 at
Please see inline.
Thank you,
Vlad
On Mar 25, 2025, at 1:42 PM, Hyukjin Kwon wrote:
> - the approach encourages keeping jars files in the Apache Spark repo
Yes, and removes it from source releases. I believe this is a minimized change
with AS-IS?
Yes, it removes jars from the source release a
> - the approach encourages keeping jars files in the Apache Spark repo
Yes, and removes it from source releases. I believe this is a minimized
change with AS-IS?
> - it is hard to identify what tests are impacted by jars so they can be
properly fixed
We have a list of test jars, and I will add th
The policy [1] is quite clear and the fact that other projects do not include
compiled jars (including test jars) into the source release confirms the rule:
"Every ASF release MUST contain one or more source packages, which MUST be
sufficient for a user to build and test the release provided the
I personally think you are reading this too narrowly; the principle is, as
given:
"...MUST contain one or more source packages, which MUST be sufficient for
a user to build and test the release..."
"All releases are in the form of the source materials needed to make
changes to the software being re
I already cast my vote. To clarify, having compiled unlicensed jars in the
source release is strictly against ASF policy [1]. Between a tiny chance that
some tests and functionality will break and a small chance that the ASF will
request pulling a long-awaited release due to the policy violati
Hi Ángel,
Thanks for the feedback. Besides the existing NOT NULL constraint, the
proposal suggests enforcing only *check constraints* by default in Spark,
as they're straightforward and practical to validate at the engine level.
Additionally, the SPIP proposes allowing connectors (like JDBC) to ha
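To make the idea concrete, here is a minimal sketch (in plain Java, with hypothetical names; it is not the SPIP's actual API) of what engine-level enforcement of a check constraint could look like, validating each row against a predicate before it is handed to a connector:

```java
import java.util.List;
import java.util.function.Predicate;

public class CheckConstraintSketch {
    // Hypothetical helper: enforce a CHECK-style constraint (e.g. CHECK (amount > 0))
    // on every row before it reaches an external writer. Name and shape are
    // illustrative only, not part of any Spark API.
    static <T> void enforce(List<T> rows, Predicate<T> check, String name) {
        for (T row : rows) {
            if (!check.test(row)) {
                throw new IllegalArgumentException(
                    "Row violates CHECK constraint " + name + ": " + row);
            }
        }
    }

    public static void main(String[] args) {
        // All rows satisfy the predicate, so enforcement passes silently.
        enforce(List.of(1, 5, 9), x -> x > 0, "amount_positive");
        System.out.println("all rows passed");
    }
}
```

The point of the sketch is only that the validation logic itself has no dependency on the target system; where such checks should live (engine vs. connector) is exactly what the thread is debating.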
So I think if I understand folks concerns it’s that we’ve let it slide in
the past and at some point we’ve got to stop letting it slide because there
is some concern we might not be meeting the ASF guidance here.
Personally I think given they’re test artifacts and how delayed Spark 4 is
we should
While I'd love to resolve this issue, I still don't understand why we would
block the release for this.
On Tue, Mar 25, 2025 at 7:49 AM Rozov, Vlad
wrote:
> The difference is in the way how tests are disabled.
>
> - the approach encourages keeping jars files in the Apache Spark repo
> - it is
Hi All,
I kind of understand why https://github.com/apache/spark/pull/49971 was
reverted on branch-4.0 to allow testing of the 4.0 release. Why was it also
reverted on the master branch? I don't see any JIRA opened for the failure.
AFAIK, the proper way to handle the issue in Apache project
The difference is in how the tests are disabled.
- the approach encourages keeping jars files in the Apache Spark repo
- it is hard to identify what tests are impacted by jars so they can be
properly fixed
- the solution relies on jar being present or not present on the classpath.
Tests may
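For context on the objection above, a classpath-presence check of the kind being discussed might look like the following minimal sketch (plain Java, hypothetical names; not Spark's actual test code). A test harness could skip a test when the fixture class from a jar is not loadable:

```java
public class ClasspathSkipSketch {
    // Returns true when the named class can be loaded from the current
    // classpath; a test suite could skip jar-dependent tests when this
    // returns false. This is the fragility being criticized: absence of
    // the jar silently disables the test instead of failing it.
    static boolean onClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // java.util.List is always present in the JDK.
        System.out.println(onClasspath("java.util.List"));
        // A hypothetical fixture class that is not on the classpath.
        System.out.println(onClasspath("org.example.MissingFixture"));
    }
}
```

As the sketch shows, a missing jar and a genuinely broken classpath are indistinguishable to such a check, which is the "too fragile" concern raised elsewhere in the thread.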
Just one more variable: Spark 3.5.2 runs on Kubernetes and Spark 3.2.0 runs
on YARN. It seems Kubernetes can be a cause of slowness too.
Sent from my iPhone
On Mar 24, 2025, at 7:10 PM, Prem Gmail wrote:
Hello Spark Dev/users, Any one has any clue why and how a better version have performance iss
What's the difference between disabling tests for dev and release vs only
for release?
On Tue, 25 Mar 2025 at 15:36, Rozov, Vlad wrote:
> Overall I don’t buy the solution where tests are skipped based on the
> presence of a jar file. It looks too fragile to me. What if there is a bug
> that does
Just fixed. Thanks guys for the quick fixes proposed. I woke up in my
timezone, and went like wow :-).
On Tue, 25 Mar 2025 at 05:33, Bjørn Jørgensen
wrote:
> https://setuptools.pypa.io/en/latest/history.html#v78-0-2
>
> v78.0.2
> 24 Mar 2025
>
> Bugfixes
> Postponed removals of deprecated dash-s