Sorry, but I still don’t follow. My PR broke Maven and the fix I provided fixes 
Maven. SBT was never broken except there is inconsistency between SBT and Maven 
builds. Can the inconsistency be fixed in a follow up PR?

Thank you,

Vlad

On Mar 26, 2025, at 5:57 PM, Hyukjin Kwon <gurwls...@apache.org> wrote:

It is not broken ... because we run SBT in PR builders for ASF resource 
restrictions and faster build. We use Maven for release so it was found out now.

CI did not test your change. The part you are fixing is a special path ..

On Thu, Mar 27, 2025 at 9:53 AM Rozov, Vlad <vro...@amazon.com.invalid> wrote:
If it is not broken, can the sync between maven and SBT dependencies/shadow be 
done in a follow up PR?

Thank you,

Vlad


On Mar 26, 2025, at 5:44 PM, Hyukjin Kwon 
<gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote:

It is not broken. The fix you applied would not be applied in SBT. For example, 
the lines you changed (added in 
https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0):

diff```
-            <relocation>
-              <pattern>com.google.common</pattern>
-              
<shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
-              <includes>
-                <include>com.google.common.**</include>
-              </includes>
-            </relocation>
```

The companion part of this has to be removed in SBT as well, for example here 
https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0#diff-6f545c33f2fcc975200bf208c900a600a593ce6b170180f81e2f93b3efb6cb3eR674

Adding simple dependency changes usually are shared between SBT and Maven so it 
is not a tricky part. However, some changes like this have to be made in both 
places.
Both Maven and SBT builds are usually sync, and changes are mode together. Let 
me know if I am underding your change wrongly.


On Thu, 27 Mar 2025 at 09:25, Rozov, Vlad <vro...@amazon.com.invalid> wrote:
Hi Hyukjin,

Can you clarify what is broken in SBT? As I already mentioned in my previous 
e-mail, I tested both maven and SBT builds and both works. Is there a test that 
I can run that shows what is broken? Please file JIRA, so we can communicate on 
the JIRA and document what is broken.

Thank you,

Vlad

On Mar 26, 2025, at 3:18 PM, Hyukjin Kwon 
<gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote:

That only fixes Maven. Both SBT build and Maven build should work in the same 
or similar wat. Let's make sure both work.

On Thu, Mar 27, 2025 at 3:18 AM Rozov, Vlad <vro...@amazon.com.invalid> wrote:
Please see https://github.com/vrozov/spark/tree/spark-shell. I tested only 
spark-shell —remote local after building with maven and sbt. It may not be a 
complete fix and there is no PR. I’ll look into SBT build issue (assuming that 
there is still one after the fix) once you file JIRA.

Thank you,

Vlad


On Mar 26, 2025, at 8:27 AM, Hyukjin Kwon 
<gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote:

Vlad,

- Please show me if there is a simple fix. If that's the case, yes, I will 
revert this out from the master branch. That works for me.
- If not, let's make a new PR.
- If you feel this is an issue, let's start a vote. Let me know.


On Thu, 27 Mar 2025 at 00:13, Rozov, Vlad <vro...@amazon.com.invalid> wrote:
Every graduated from incubating Apache project has guards against what you name 
“chaotic” and what other name breaking best development practices. Such guards 
include JIRA, unit tests and PR review. Instead of reverting commit, I would 
expect you to open JIRA and outline what is broken. If you filed JIRA, please 
provide the link, if not, please open one.

It took me 2 hours to fix Spark shells, so should you open JIRA instead of 
spending time to identify the commit and reverting it, you will save time as 
well. I’ll post fix once JIRA is open and I validate that my understanding of 
what is broken is the same as yours.

Thank you,

Vlad

On Mar 26, 2025, at 12:01 AM, Hyukjin Kwon 
<gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote:

Rozov, please test the patch, see if there is a relevant test or not, and add a 
test if not there. If it is difficult to add a test, describe it in the PR 
description, and how you manually tested.
This is what I think you need to do instead of reverting the revert.
Imagine that there are many of such PRs living in the master branch. Thankfully 
this time you are alone but imagine many people do the same things - it will be 
chaotic. To me, it took 4 hours to identify this single commit.

About the change itself, you need to fix SBT together, and it will need the 
changes such as 
https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0.
Should also test Spark shells, and describe how you tested it as well.

This is what I expect:
- Please show me if there is a simple fix. If that's the case, yes, I will 
revert this out from the master branch. That works for me.
- If not, let's make a new PR.
- If you feel this is an issue, let's start a vote. Let me know.



On Wed, 26 Mar 2025 at 15:07, Reynold Xin <r...@databricks.com.invalid> wrote:
Sorry Vlad - I disagree. Where is the simple fix? As a new contributor, you 
should not be coming in guns blazing blaming committers who are trying to keep 
the master branch sane and clean.

On Tue, Mar 25, 2025 at 10:53 PM Rozov, Vlad <vro...@amazon.com.invalid> wrote:
There is a simple fix. This is exactly what I outlined in the e-mail. Prior to 
reverting commit (on master) it was necessary to see if an easy fix exists. The 
PR that introduced the error was merged into master 3 weeks ago, so I still 
don’t get why it was reverted overnight. It was also necessary to open JIRA and 
provide reproducible steps. I assume that steps on the PR is what needs to be 
fixed, but it will be better to avoid guessing.

Thank you,

Vlad

On Mar 25, 2025, at 10:05 PM, Reynold Xin <r...@databricks.com.INVALID> wrote:

Is there a fix already available or a very simple fix a committer can create 
quickly? If yes, we can merge the fix. If there isn't, for major functionality 
breaking change, we should just revert. That's fairly basic software 
engineering practices.


On Tue, Mar 25, 2025 at 9:53 PM Hyukjin Kwon 
<gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote:
With the change, the main entry points, Spark shalls, don't work and developers 
cannot debug and test. The snapshots become uesless.

The tests passed because you did not fix SBT. It needs a larger change.

Such change cannot be in the source. I can start a vote if you think this is an 
issue.


On Wed, Mar 26, 2025 at 12:32 PM Rozov, Vlad <vro...@amazon.com.invalid> wrote:
This does not make any sense.

1. There are no broken tests introduced by 
https://github.com/apache/spark/pull/49971
2. There are no JIRA filed for “the main entry point”
3. “The main entry point” that does not have any unit test suggests that it is 
not the main entry point.
4. It is not practical to revert every commit that introduced some functional 
issue or regression. If regression can’t be fixed, it is reverted, if it can be 
reasonably fixed, the fix is applied to the regression. Just as an example, 
https://github.com/apache/spark/pull/42823 introduced a regression. It was 
fixed by https://github.com/apache/spark/pull/50348. Nobody asked for the 
revert.
5. The way how revert was handled is not "Apache way”.

It may be easier for you to handle it in a single PR, but it is easier for me 
to provide PR on top of existing changes and it will also provide JIRA and a 
commit that documents why there is an issue. Otherwise it will be easier to 
introduce the error back again.

Thank you,

Vlad


On Mar 25, 2025, at 4:58 PM, Hyukjin Kwon 
<gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote:

Rozov, this broke the main entry points of release, Spark shells. Even in the 
mast branch, you build a Spark, and cannot use Spark shells.
Why don't you submit a PR that contains the proper fix? It is easier to have 
one PR that has no issue, e.g., reverting backporting etc.

On Wed, 26 Mar 2025 at 00:17, Rozov, Vlad <vro...@amazon.com.invalid> wrote:
Hi All,

I kind of understand why https://github.com/apache/spark/pull/49971 was 
reverted on the branch-4.0 to allow testing of 4.0 release. Why was it also 
reverted on the master branch? I don’t see any JIRA being open for the failure. 
AFAIK, the proper way to handle the issue in Apache project:

- revert  https://github.com/apache/spark/pull/49971 on the branch-4.0 (note 
that PR targeted master branch and SPARK-51229 targets 4.1 as affected version)
- open JIRA and provide details of how to reproduce the error on the master 
branch
- wait for the author of https://github.com/apache/spark/pull/49971 (in this 
case it was me) to reply in a timely manner and only then revert the commit

If you agree with the approach, please undo the revert and file JIRA. There is 
an easy fix to the issue.

Thank you,

Vlad









Reply via email to