Hi Hyukjin, Can you clarify what is broken in SBT? As I already mentioned in my previous e-mail, I tested both maven and SBT builds and both works. Is there a test that I can run that shows what is broken? Please file JIRA, so we can communicate on the JIRA and document what is broken.
Thank you, Vlad On Mar 26, 2025, at 3:18 PM, Hyukjin Kwon <gurwls...@apache.org> wrote: That only fixes Maven. Both SBT build and Maven build should work in the same or similar wat. Let's make sure both work. On Thu, Mar 27, 2025 at 3:18 AM Rozov, Vlad <vro...@amazon.com.invalid> wrote: Please see https://github.com/vrozov/spark/tree/spark-shell. I tested only spark-shell —remote local after building with maven and sbt. It may not be a complete fix and there is no PR. I’ll look into SBT build issue (assuming that there is still one after the fix) once you file JIRA. Thank you, Vlad On Mar 26, 2025, at 8:27 AM, Hyukjin Kwon <gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote: Vlad, - Please show me if there is a simple fix. If that's the case, yes, I will revert this out from the master branch. That works for me. - If not, let's make a new PR. - If you feel this is an issue, let's start a vote. Let me know. On Thu, 27 Mar 2025 at 00:13, Rozov, Vlad <vro...@amazon.com.invalid> wrote: Every graduated from incubating Apache project has guards against what you name “chaotic” and what other name breaking best development practices. Such guards include JIRA, unit tests and PR review. Instead of reverting commit, I would expect you to open JIRA and outline what is broken. If you filed JIRA, please provide the link, if not, please open one. It took me 2 hours to fix Spark shells, so should you open JIRA instead of spending time to identify the commit and reverting it, you will save time as well. I’ll post fix once JIRA is open and I validate that my understanding of what is broken is the same as yours. Thank you, Vlad On Mar 26, 2025, at 12:01 AM, Hyukjin Kwon <gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote: Rozov, please test the patch, see if there is a relevant test or not, and add a test if not there. If it is difficult to add a test, describe it in the PR description, and how you manually tested. This is what I think you need to do instead of reverting the revert. Imagine that there are many of such PRs living in the master branch. Thankfully this time you are alone but imagine many people do the same things - it will be chaotic. To me, it took 4 hours to identify this single commit. About the change itself, you need to fix SBT together, and it will need the changes such as https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0. Should also test Spark shells, and describe how you tested it as well. This is what I expect: - Please show me if there is a simple fix. If that's the case, yes, I will revert this out from the master branch. That works for me. - If not, let's make a new PR. - If you feel this is an issue, let's start a vote. Let me know. On Wed, 26 Mar 2025 at 15:07, Reynold Xin <r...@databricks.com.invalid> wrote: Sorry Vlad - I disagree. Where is the simple fix? As a new contributor, you should not be coming in guns blazing blaming committers who are trying to keep the master branch sane and clean. On Tue, Mar 25, 2025 at 10:53 PM Rozov, Vlad <vro...@amazon.com.invalid> wrote: There is a simple fix. This is exactly what I outlined in the e-mail. Prior to reverting commit (on master) it was necessary to see if an easy fix exists. The PR that introduced the error was merged into master 3 weeks ago, so I still don’t get why it was reverted overnight. It was also necessary to open JIRA and provide reproducible steps. I assume that steps on the PR is what needs to be fixed, but it will be better to avoid guessing. Thank you, Vlad On Mar 25, 2025, at 10:05 PM, Reynold Xin <r...@databricks.com.INVALID> wrote: Is there a fix already available or a very simple fix a committer can create quickly? If yes, we can merge the fix. If there isn't, for major functionality breaking change, we should just revert. That's fairly basic software engineering practices. On Tue, Mar 25, 2025 at 9:53 PM Hyukjin Kwon <gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote: With the change, the main entry points, Spark shalls, don't work and developers cannot debug and test. The snapshots become uesless. The tests passed because you did not fix SBT. It needs a larger change. Such change cannot be in the source. I can start a vote if you think this is an issue. On Wed, Mar 26, 2025 at 12:32 PM Rozov, Vlad <vro...@amazon.com.invalid> wrote: This does not make any sense. 1. There are no broken tests introduced by https://github.com/apache/spark/pull/49971 2. There are no JIRA filed for “the main entry point” 3. “The main entry point” that does not have any unit test suggests that it is not the main entry point. 4. It is not practical to revert every commit that introduced some functional issue or regression. If regression can’t be fixed, it is reverted, if it can be reasonably fixed, the fix is applied to the regression. Just as an example, https://github.com/apache/spark/pull/42823 introduced a regression. It was fixed by https://github.com/apache/spark/pull/50348. Nobody asked for the revert. 5. The way how revert was handled is not "Apache way”. It may be easier for you to handle it in a single PR, but it is easier for me to provide PR on top of existing changes and it will also provide JIRA and a commit that documents why there is an issue. Otherwise it will be easier to introduce the error back again. Thank you, Vlad On Mar 25, 2025, at 4:58 PM, Hyukjin Kwon <gurwls...@apache.org<mailto:gurwls...@apache.org>> wrote: Rozov, this broke the main entry points of release, Spark shells. Even in the mast branch, you build a Spark, and cannot use Spark shells. Why don't you submit a PR that contains the proper fix? It is easier to have one PR that has no issue, e.g., reverting backporting etc. On Wed, 26 Mar 2025 at 00:17, Rozov, Vlad <vro...@amazon.com.invalid> wrote: Hi All, I kind of understand why https://github.com/apache/spark/pull/49971 was reverted on the branch-4.0 to allow testing of 4.0 release. Why was it also reverted on the master branch? I don’t see any JIRA being open for the failure. AFAIK, the proper way to handle the issue in Apache project: - revert https://github.com/apache/spark/pull/49971 on the branch-4.0 (note that PR targeted master branch and SPARK-51229 targets 4.1 as affected version) - open JIRA and provide details of how to reproduce the error on the master branch - wait for the author of https://github.com/apache/spark/pull/49971 (in this case it was me) to reply in a timely manner and only then revert the commit If you agree with the approach, please undo the revert and file JIRA. There is an easy fix to the issue. Thank you, Vlad