Here's a bit of history and context:
The project was initially built using SBT (
https://github.com/apache/spark/commit/df29d0ea4c8b7137fdd1844219c7d489e3b0d9c9
).
Later, Maven support was added (
https://github.com/apache/spark/commit/811a32257b1b59b042a2871eede6ee39d9e8a137
)
to provide an alter
A slightly off-topic but related question: It feels fragile to test with
SBT while publishing the release with Maven. How did we end up in this
situation? Moreover, since most Spark developers use SBT for their daily
work, it becomes even harder to catch issues with the Maven build.
On Thu, Mar 27
Nah, I wasn't clear.
Maven and SBT builds are synced for this special code path, e.g.,
https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0
.
If you build with Maven and with SBT, the results are almost the same.
Now, the fix you landed in Maven (and indeed it was a Maven specifi
If it is not broken, can the sync between the Maven and SBT dependencies/shading be
done in a follow-up PR?
Thank you,
Vlad
On Mar 26, 2025, at 5:44 PM, Hyukjin Kwon wrote:
It is not broken. The fix you applied would not be applied in SBT. For example,
the lines you changed (added in
https://git
Sorry, but I still don’t follow. My PR broke Maven, and the fix I provided fixes
Maven. SBT was never broken, except that there is an inconsistency between the SBT
and Maven builds. Can the inconsistency be fixed in a follow-up PR?
Thank you,
Vlad
On Mar 26, 2025, at 5:57 PM, Hyukjin Kwon wrote:
It is not
It is not broken ... because we run SBT in the PR builders due to ASF resource
restrictions and for faster builds. We use Maven for the release, so it was only
found out now.
CI did not test your change. The part you are fixing is a special path ..
On Thu, Mar 27, 2025 at 9:53 AM Rozov, Vlad
wrote:
> If it is not br
It is not broken. The fix you applied would not be applied in SBT. For
example, the lines you changed (added in
https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0
):
```diff
-            <relocation>
-              <pattern>com.google.common</pattern>
-              <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
-            </relocation>
```
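For illustration only (my own sketch, not part of the thread): in the SBT build the same relocation is expressed through shade rules rather than the Maven shade plugin's pom configuration, which is why a pom-only change does not carry over. A minimal build.sbt-style fragment, assuming sbt-assembly's ShadeRule API and assuming org.sparkproject as the shaded package prefix:

```scala
// Hypothetical sbt-assembly counterpart of the Maven <relocation> above;
// the concrete rule and target package in Spark's SparkBuild.scala may differ.
assembly / assemblyShadeRules := Seq(
  // Rewrite Guava classes into the shaded namespace so the connect module
  // does not leak an unrelocated Guava onto the classpath.
  ShadeRule.rename("com.google.common.**" -> "org.sparkproject.connect.guava.@1").inAll
)
```

Keeping the Maven pom and the SBT rule pointed at the same shaded package is what "synced for this special code path" refers to earlier in the thread.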
+1 on the explanation that this is not happening only to Vlad but is the
normal process.
Vlad, if we are very strict about the ASF voting policy, we would have to have
three +1s without a -1 to merge a code change. I don't think the major ASF
projects follow it - instead, they (including Spark)
That only fixes Maven. Both the SBT build and the Maven build should work in the
same or a similar way. Let's make sure both work.
On Thu, Mar 27, 2025 at 3:18 AM Rozov, Vlad
wrote:
> Please see https://github.com/vrozov/spark/tree/spark-shell. I tested
> only spark-shell --remote local after building with
Every project that has graduated from the Apache Incubator has guards against what you
call “chaotic” and what others call breaking best development practices. Such guards
include JIRA, unit tests, and PR review. Instead of reverting the commit, I would
expect you to open a JIRA and outline what is broken. If you f
Please see https://github.com/vrozov/spark/tree/spark-shell. I tested only
spark-shell --remote local after building with Maven and SBT. It may not be a
complete fix, and there is no PR. I’ll look into the SBT build issue (assuming
there is still one after the fix) once you file a JIRA.
Thank you,
On Thu, 27 Mar 2025 at 00:13, Rozov, Vlad wrote:
> Every project that has graduated from the Apache Incubator has guards against what
> you call “chaotic” and what others call breaking best development practices. Such
> guards include JIRA, unit tests, and PR review. Instead of reverting the commit, I
> would ex
Hello Team,
I was working with Spark 3.2 and Hadoop 2.7.6 and writing to MinIO object
storage. It was slower compared to writing to MapR FS with the above
tech stack. I then moved to an upgraded version, Spark 3.5.2 with
Hadoop 4.3.1, which started writing to MinIO with the V2 fileoutputcom
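For readers following along, here is a hedged sketch (mine, not from the email) of how the committer algorithm in question is typically selected when writing to an S3-compatible store such as MinIO through the s3a connector; the endpoint, credentials, and paths are placeholders, and hadoop-aws must be on the classpath:

```scala
import org.apache.spark.sql.SparkSession

// Illustrative only: select the v2 FileOutputCommitter algorithm, which
// commits task output directly and avoids the rename-heavy v1 job commit
// that is slow on object stores.
val spark = SparkSession.builder()
  .appName("minio-write-sketch")
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  // Placeholder MinIO endpoint settings for the s3a:// connector.
  .config("spark.hadoop.fs.s3a.endpoint", "http://minio.example.com:9000")
  .config("spark.hadoop.fs.s3a.path.style.access", "true")
  .getOrCreate()

// Placeholder bucket/path; any DataFrame write exercises the committer.
spark.range(1000L).write.mode("overwrite").parquet("s3a://my-bucket/demo")
```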
Vlad,
- Please show me if there is a simple fix. If that's the case, yes, I will
revert this from the master branch. That works for me.
- If not, let's make a new PR.
- If you feel this is an issue, let's start a vote. Let me know.
On Thu, 27 Mar 2025 at 00:13, Rozov, Vlad wrote:
> Every g
My advice to you, Vlad, is that it would be more fruitful to focus on fixing
the issue than to be extremely dogmatic and waste everybody’s energy
arguing about this.
Of course, you are welcome to form your own opinion.
On Wed, Mar 26, 2025 at 7:38 AM Rozov, Vlad
wrote:
> Reynold, I am not sure
This is what my WIP PR targets. It will help to identify any compatibility or
breaking issues with the new dependency.
Thank you,
Vlad
On Mar 26, 2025, at 3:14 AM, Mich Talebzadeh wrote:
Because of dependencies, we need to ensure that the underlying artifact (Hive
4.0.1) is also stable enough
Reynold, I am not sure I follow your question. I’ll open a PR with the fix once
the JIRA is open.
While I am new to the Spark community, I am not new to Apache projects and
open source. Committers are the guardians of commits, and they keep not only the
master branch but the entire source code in shape
Because of dependencies, we need to ensure that the underlying artifact
(Hive 4.0.1) is also stable enough. We should aim to establish that first
and then look at release timelines and where it fits.
cheers
Dr Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
Rozov, please test the patch, check whether there is a relevant test, and
add one if there is not. If it is difficult to add a test, describe that in
the PR description, along with how you manually tested.
This is what I think you need to do instead of reverting the revert.
Imagine that there are many of su