Pardon my late entry into the fray here, but we've just struggled
through some library conflicts that could have been avoided and whose
story sheds some light on this question.
We have been integrating Spark with a number of other components. We
discovered several conflicts, most easily eliminated. But the ASM
conflicts were not quite so easy to handle because of ASM's API changes
between 3.x and 4.x (usually seen first in ClassVisitor, which was an
interface in 3.x and is an abstract class in 4.x).
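
A quick way to see which side of that API change a given classpath
resolves to is to load ClassVisitor reflectively and check whether it is
an interface. The following is only a minimal sketch: AsmProbe and
describe are hypothetical names, not part of Spark or ASM, and the probe
only reports on the unshaded org.objectweb.asm package.

```scala
// Hypothetical diagnostic helper (not part of Spark or ASM).
// Reports whether the named class resolves to an interface (ASM 3.x style)
// or a class (ASM 4.x style), and where it was loaded from.
object AsmProbe {
  def describe(className: String = "org.objectweb.asm.ClassVisitor"): String =
    try {
      // initialize = false: load the class without running static initializers
      val cls = Class.forName(className, false, getClass.getClassLoader)
      // The code source is null for classes loaded by the bootstrap loader
      val source = Option(cls.getProtectionDomain.getCodeSource)
        .flatMap(cs => Option(cs.getLocation))
        .map(_.toString)
        .getOrElse("unknown location")
      if (cls.isInterface) s"ASM 3.x-style API (interface) from $source"
      else s"ASM 4.x-style API (abstract class) from $source"
    } catch {
      case _: ClassNotFoundException =>
        s"$className not on classpath (absent, or only present in shaded form)"
    }

  def main(args: Array[String]): Unit = println(describe())
}
```

Run with the application's full runtime classpath, this prints the jar
that wins the conflict, which is usually the first thing one needs to
know when an IncompatibleClassChangeError shows up.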
The spark-core_2.10 artifact has a transitive dependency on ASM 4.0. Hive, Hadoop,
various Java EE servlets, and other libraries have transitive
dependencies on 3.2 or earlier. In one of the applications we are
developing, there are 10 libraries with ASM dependencies. Five are
well-behaved, having shaded ASM. Another five are poorly behaved, not
shading it. The ASM FAQ specifically recommends shading ASM in any tool
or framework which contains it: http://asm.ow2.org/doc/faq.html#Q15.
ASM has been shaded in the SBT build since June 2013. However, it was
not properly shaded in the Maven build until last week. As a result,
libraries such as spark-core_2.10 pushed to Maven Central haven't
reflected the SBT build. This is documented in Jira SPARK-782:
https://spark-project.atlassian.net/browse/SPARK-782
We cannot use SBT for our overall project. Maven is our standard.
Hence, we are dependent on Maven Central and libraries mirrored by our
corporate repository.
In this context, if both builds are maintained, then they need to have
the same functionality.
If only one build is to be retained, it should be Maven because Maven and
other tools that use Maven Central are more likely to be used for large
project integrations. Also for this reason, the Maven build should be
given more priority than at present. It seems a bit odd, if a Maven
project can be automatically generated from SBT, that it would take 1
year for ASM shading in Maven to catch up with SBT.
Thanks
Kevin Markey
SBT appears to have syntax for both, just like Maven. Surely these
have the same meanings in SBT, and excluding artifacts is accomplished
with exclude and excludeAll, as seen in the Spark build?
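
For reference, the two exclusion forms look roughly like this in a
build.sbt. This is a sketch only: the dependency coordinates and
versions are illustrative assumptions, not taken from the Spark build,
and "asm" is the group ID under which ASM 3.x was published.

```scala
// build.sbt fragment; coordinates and versions are illustrative only
libraryDependencies ++= Seq(
  // exclude: drop one specific transitive artifact (organization, module name)
  "org.apache.hadoop" % "hadoop-client" % "1.0.4" exclude("asm", "asm"),
  // excludeAll: drop everything matching one or more exclusion rules
  "org.apache.hive" % "hive-exec" % "0.12.0" excludeAll(
    ExclusionRule(organization = "asm")
  )
)
```

The first form corresponds to a single <exclusion> element in a Maven
POM; the second removes every artifact from the matching organization.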
The assembly and shader stuff in Maven is more about controlling
exactly how it's put together into an artifact, at the level of files
even, to stick a license file in or exclude some data file cruft or
rename dependencies.
Exclusions and shading are necessary evils to be used as sparingly as
possible. Dependency graphs get nuts fast here, and Spark is already
quite big. (Hence my recent PR to start touching it up -- more coming
for sure.)