> sbt with the existing Hadoop ecosystem.
>
> Particularly the difficulties in using Sbt + Maven together (something
> which tends to block more than just Spark from adopting sbt).
>
> I'm more than happy to listen and see what we can do on the sbt side
> to make this as seamless as possible for all parties.
>
> Thanks!

--
Evan Chan
Staff Engineer
e...@ooyala.com |
I think Kevin's point is somewhat different: there's no question that Sbt can
be integrated into the Maven ecosystem - mostly the repositories and artifact
management, of course.
However, Sbt is a niche build tool and is unlikely to be widely supported by
engineering teams or IT organizations. Sbt isn
Asm is such a mess. And their suggested solution, that everyone should
shade it, sounds pretty awful to me (not uncommon to have asm shaded 15
times in a single project). But I guess you are right that shading is the
only way to deal with it at this point...
On Mar 11, 2014 5:35 PM, "Kevin Markey" wrote:
we have a corporate maven repository in-house and of course we also use
maven central. sbt can handle retrieving from and publishing to maven
repositories just fine. we have maven, ant/ivy and sbt projects depending
on each other's artifacts. not sure i see the issue there.
On Tue, Mar 11, 2014 at
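For concreteness, here is a minimal build.sbt sketch of the setup described above: resolving from and publishing to a Maven repository from sbt. The repository URL and credentials path are made-up placeholders.

    // Pull artifacts from an in-house Maven repository in addition to Maven Central.
    resolvers += "Corporate Releases" at "https://repo.example.com/maven-releases"

    // Publish this project's jar and its generated POM back to the same Maven repository.
    publishMavenStyle := true

    publishTo := Some("Corporate Releases" at "https://repo.example.com/maven-releases")

    // Keep repository credentials outside the build file.
    credentials += Credentials(Path.userHome / ".sbt" / "credentials")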
Pardon my late entry into the fray here, but we've just struggled
through some library conflicts that could have been avoided and whose
story shed some light on this question.
We have been integrating Spark with a number of other components. We
discovered several conflicts, most easily elimina
On Wed, Feb 26, 2014 at 09:22AM, Sean Owen wrote:
> Side point -- "provided" scope is not the same as an exclude.
> "provided" means that this artifact is used directly by this code (compile
> time), but it is not necessary to package it, since it will be
> available from a runtime container. Exclusion
With all due respect, Patrick - this approach is asking for trouble.
Proactively ;)
Cos
On Tue, Feb 25, 2014 at 04:09PM, Patrick Wendell wrote:
> What I mean is this. AFAIK the shader plug-in is primarily designed
> for creating uber jars which contain Spark and all dependencies. But
> since Spar
On Tue, Feb 25, 2014 at 03:20PM, Evan Chan wrote:
> The correct way to exclude dependencies in SBT is actually to declare
> a dependency as "provided". I'm not familiar with Maven or its
Yes, I believe this would be equivalent to the maven exclusion of an
artifact's transitive deps.
Cos
> depe
We would like to cross-build Spark for Scala 2.11 and 2.10 eventually (they’re
a lot closer than 2.10 and 2.9). In Maven this might mean creating two POMs or
a special variable for the version or something.
Matei
On Mar 1, 2014, at 12:15 PM, Koert Kuipers wrote:
> does maven support cross bui
does maven support cross building for different scala versions?
we do this in-house all the time with sbt. i know spark does not cross-build
at this point, but is it guaranteed to stay that way?
On Sat, Mar 1, 2014 at 12:02 PM, Koert Kuipers wrote:
> i am still unsure what is wrong with sbt ass
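For reference, cross-building on the sbt side is essentially one setting; the version numbers below are only illustrative.

    // build.sbt: compile, test and publish against several Scala versions.
    scalaVersion := "2.10.4"

    crossScalaVersions := Seq("2.10.4", "2.11.0")

Prefixing a command with "+" (for example "sbt +publish") then runs it once per listed Scala version, and the published artifacts carry the usual _2.10/_2.11 suffix.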
i am still unsure what is wrong with sbt assembly. i would like a
real-world example of where it does not work, that i can run.
this is what i know:
1) sbt assembly works fine for version conflicts for an artifact. no
exclusion rules are needed.
2) if artifacts have the same classes inside yet a
On Sat, Mar 1, 2014 at 2:05 AM, Patrick Wendell wrote:
> Hey,
>
> Thanks everyone for chiming in on this. I wanted to summarize these
> issues a bit particularly wrt the constituents involved - does this
> seem accurate?
>
> = Spark Users =
> In general those linking against Spark should be totall
A couple of comments: 1) Whether the Spark POM is produced by SBT or Maven
shouldn't matter for those who just need to link against published
artifacts, but right now SBT and Maven do not produce equivalent POMs for
Spark, I think. 2) Incremental builds using Maven are trivially more
difficult
Hey,
Thanks everyone for chiming in on this. I wanted to summarize these
issues a bit particularly wrt the constituents involved - does this
seem accurate?
= Spark Users =
In general those linking against Spark should be totally unaffected by
the build choice. Spark will continue to publish well-
On Feb 26, 2014 11:12 PM, "Patrick Wendell" wrote:
>
> @mridul - As far as I know both Maven and Sbt use fairly similar
> processes for building the assembly/uber jar. We actually used to
> package spark with sbt and there were no specific issues we
> encountered and AFAIK sbt respects versioning
On Wed, Feb 26, 2014 at 2:11 PM, Sean Owen wrote:
> I also favor Maven. I don't think the logic is "because it's common". As
> Sandy says, it's because of the things that it brings: more plugins,
> easier to consume by more developers, etc. These are, however, just
> some reasons 'for', and have to be
yes. the Build.scala file behaves like a configuration file mostly, but
because it is scala you can use the full power of a real language when
needed. also i found writing sbt plugins doable (but not easy).
On Feb 26, 2014 2:12 PM, "Sean Owen" wrote:
> I also favor Maven. I don't think the logic
Can't Maven POMs include other ones? So what if we remove the
artifact specs from the main pom, have them generated by sbt make-pom,
and include the generated file in the main pom.xml? I guess, just
trying to figure out how much this would help (it seems at least it
would remove the issue of m
Yes, but the POM generated in that fashion is only sufficient for linking
with Spark, not for building Spark or serving as a basis from which to
build a customized Spark with Maven. So, starting from SparkBuild.scala
and generating a POM with make-pom, those who wish to build a customized
Spark wi
Mark,
No, I haven't tried this myself yet :-p Also I would expect that
sbt-pom-reader does not do assemblies at all because that is an
SBT plugin, so we would still need code to include sbt-assembly.
There is also the tricky question of how to include the assembly stuff
into sbt-pom-reader
I also favor Maven. I don't think the logic is "because it's common". As
Sandy says, it's because of the things that it brings: more plugins,
easier to consume by more developers, etc. These are, however, just
some reasons 'for', and have to be considered against the other pros
and cons.
The choice of
i don't buy the argument that we should use it because it's the most common.
if all we would do is use what is most common then we should switch to
java, svn and maven
On Wed, Feb 26, 2014 at 1:38 PM, Mark Grover wrote:
> Hi Patrick,
> And, to pile on what Sandy said. In my opinion, it's definit
Evan,
Have you actually tried to build Spark using its POM file and sbt-pom-reader?
I just made a first, naive attempt, and I'm still sorting through just
what this did and didn't produce. It looks like the basic jar files are at
least very close to correct, and may be just fine, but that buildi
Hi Patrick,
And, to pile on what Sandy said. In my opinion, it's definitely more than
just a matter of convenience. My comment below applies both to distribution
builders and to people who have their own internal "distributions" (a few
examples of which we have already seen on this thread
@patrick - It seems like my point about being able to inherit the root pom
was addressed and there's a way to handle this.
The larger point I meant to make is that Maven is by far the most common
build tool in projects that are likely to share contributors with Spark. I
personally know 10 people
@mridul - As far as I know both Maven and Sbt use fairly similar
processes for building the assembly/uber jar. We actually used to
package spark with sbt and there were no specific issues we
encountered and AFAIK sbt respects versioning of transitive
dependencies correctly. Do you have a specific b
I'd like to propose the following way to move forward, based on the
comments I've seen:
1. Aggressively clean up the giant dependency graph. One ticket I
might work on if I have time is SPARK-681 which might remove the giant
fastutil dependency (~15MB by itself).
2. Take an intermediate step
We maintain an in-house Spark build using sbt. We have no problem using sbt
assembly. We did add a few exclude statements for transitive dependencies.
The main enemies of assemblies are jars that include stuff they shouldn't
(kryo comes to mind, I think they include logback?), new versions of jars
that
Side point -- "provided" scope is not the same as an exclude.
"provided" means that this artifact is used directly by this code (compile
time), but it is not necessary to package it, since it will be
available from a runtime container. Exclusions make an artifact, that
would otherwise be available, una
The problem is that the complete spark dependency graph is fairly large,
and there are a lot of conflicting versions in there.
In particular, when we bump versions of dependencies, managing
this gets messy at best.
Now, I have not looked in detail at how maven manages this - it might
just be accident
@Sandy
Yes, in sbt with a multi-project setup, you can easily set a variable in the
Build.scala and reference the version number from all dependent projects.
Regarding the mix of Java and Scala projects: in my workplace, we have both
Java and Scala code. Sbt can be used to build both with
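A rough sketch of the "variable in the Build.scala" approach described above, in the sbt 0.13-era full build definition style; the module names and the Hadoop version are placeholders, not any project's actual build.

    // project/Build.scala
    import sbt._
    import Keys._

    object MyBuild extends Build {
      // One place to bump a dependency version; every module picks it up.
      val hadoopVersion = "2.2.0"
      val hadoopClient  = "org.apache.hadoop" % "hadoop-client" % hadoopVersion

      lazy val core = Project("core", file("core"))
        .settings(libraryDependencies += hadoopClient)

      // A Java-only module in the same build just opts out of the Scala library.
      lazy val javaApi = Project("java-api", file("java-api"))
        .dependsOn(core)
        .settings(autoScalaLibrary := false, crossPaths := false)
    }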
We use the jarjar Ant plugin task to assemble everything into one fat jar.
Qiuzhuang
On Wed, Feb 26, 2014 at 11:26 AM, Evan chan wrote:
> Actually you can control exactly how sbt assembly merges or resolves
> conflicts. I believe the default settings however lead to order which
> cannot be controlled.
>
> I
Actually you can control exactly how sbt assembly merges or resolves conflicts.
I believe the default settings, however, lead to an ordering which cannot be
controlled.
I do wish for a smarter fat jar plugin.
-Evan
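For readers who have not used it, this is roughly what overriding the conflict resolution looks like with the sbt-assembly keys of that era (later plugin versions renamed them); the patterns matched are illustrative.

    // build.sbt, with the sbt-assembly plugin on the classpath:
    import AssemblyKeys._

    assemblySettings

    mergeStrategy in assembly <<= (mergeStrategy in assembly) { old =>
      {
        case "reference.conf"                      => MergeStrategy.concat // merge config defaults instead of picking one
        case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first  // keep the first copy of duplicated classes
        case x                                     => old(x)               // otherwise fall back to the plugin's defaults
      }
    }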
On Wed, Feb 26, 2014 at 5:31 AM, Patrick Wendell wrote:
> Evan - this is a good thing to bring up. Wrt the shader plug-in -
> right now we don't actually use it for bytecode shading - we simply
> use it for creating the uber jar with excludes (which sbt supports
> just fine via assembly).
Not re
Sandy, I believe the sbt-pom-reader plugin might work very well for
this exact use case. Otherwise, the SBT build file is just Scala
code, so it can easily read the pom XML directly if needed and parse
stuff out.
On Tue, Feb 25, 2014 at 4:36 PM, Sandy Ryza wrote:
> To perhaps restate what some
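To illustrate the "build file is just Scala code" point, here is a toy project/Build.scala fragment that reads a value straight out of an existing pom.xml; sbt-pom-reader does a far more complete job than this, and the layout is assumed only for illustration.

    // project/Build.scala
    import sbt._
    import Keys._
    import scala.xml.XML

    object ExampleBuild extends Build {
      // Read the <version> element of the top-level POM so the sbt build
      // publishes the same version number as the Maven build.
      val pomVersion = (XML.loadFile("pom.xml") \ "version").text

      lazy val root = Project("root", file("."))
        .settings(version := pomVersion)
    }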
To perhaps restate what some have said, Maven is by far the most common
build tool for the Hadoop / JVM data ecosystem. While Maven is less pretty
than SBT, expertise in it is abundant. SBT requires contributors to
projects in the ecosystem to learn yet another tool. If we think of Spark
as a pr
Hi Patrick,
If you include shaded dependencies inside of the main Spark jar, such
that it would have combined classes from all dependencies, wouldn't
you end up with a sub-assembly jar? It would be dangerous in that
since it is a single unit, it would break normal packaging assumptions
that the j
What I mean is this. AFAIK the shader plug-in is primarily designed
for creating uber jars which contain Spark and all dependencies. But
since Spark is something people depend on in Maven, what I actually
want is to create the normal old Spark jar [1], but then include
shaded versions of some of ou
Patrick -- not sure I understand your request, do you mean
- somehow creating a shaded jar (eg with maven shader plugin)
- then including it in the spark jar (which would then be an assembly)?
On Tue, Feb 25, 2014 at 4:01 PM, Patrick Wendell wrote:
> Evan - this is a good thing to bring up. Wrt t
Evan - this is a good thing to bring up. Wrt the shader plug-in -
right now we don't actually use it for bytecode shading - we simply
use it for creating the uber jar with excludes (which sbt supports
just fine via assembly).
I was wondering actually, do you know if it's possible to add shaded
a
Hi Patrick,
> (b) You have downloaded Spark and forked its maven build to change around
> the dependencies.
We go with this approach. We've cloned the Spark repo and currently maintain
our own branch. The idea is to fix Spark issues found in our production
system first and contribute back to commu
Hey Yao,
Would you mind explaining exactly how your company extends the Spark
maven build? For instance:
(a) You are depending on Spark in your build and your build is using Maven.
(b) You have downloaded Spark and forked its maven build to change
around the dependencies.
(c) You are writing pom
The problem is that plugins are not equivalent. There is AFAIK no
equivalent to the maven shader plugin for SBT.
There is an SBT plugin which can apparently read POM XML files
(sbt-pom-reader). However, it can't possibly handle plugins, which
is still problematic.
On Tue, Feb 25, 2014 at 3:31 P
I would prefer to keep both of them; it would be better even if that means
pom.xml will be generated using sbt. Some companies, like my current one,
have their own build infrastructure built on top of maven. It is not easy
to support sbt for these potential spark clients. But I do agree to only
keep on
I am no sbt guru, but I could exclude transitive dependencies this way:
libraryDependencies +=
"log4j" % "log4j" % "1.2.15" exclude("javax.jms", "jms")
Thanks!
On Tue, Feb 25, 2014 at 3:20 PM, Evan Chan wrote:
> The correct way to exclude dependencies in SBT is actually to declare
> a depe
The correct way to exclude dependencies in SBT is actually to declare
a dependency as "provided". I'm not familiar with Maven or its
dependencySet, but provided will mark the entire dependency tree as
excluded. It is also possible to exclude jar by jar, but this is
pretty error prone and messy.
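Roughly, the two options look like this in an sbt build; the coordinates are chosen only for illustration.

    // Option 1: mark the whole dependency tree as "provided" -- it will be on the
    // cluster's classpath at runtime, so none of it is bundled into the assembly.
    libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-incubating" % "provided"

    // Option 2: ship the dependency, but surgically drop one troublesome transitive artifact.
    libraryDependencies +=
      ("org.apache.hbase" % "hbase" % "0.94.6").exclude("org.jruby", "jruby-complete")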
yes in sbt assembly you can exclude jars (although i never had a need for
this) and files in jars.
for example i frequently remove log4j.properties, because for whatever
reason hadoop decided to include it, making it very difficult to use our own
logging config.
On Tue, Feb 25, 2014 at 4:24 PM,
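For the record, excluding a whole jar is also only a few lines with the sbt-assembly keys of that era (the jar name below is just an example); dropping a single file such as log4j.properties uses the same mergeStrategy mechanism shown earlier with MergeStrategy.discard.

    // build.sbt, with the sbt-assembly plugin on the classpath:
    import AssemblyKeys._

    // Keep an entire jar out of the assembled fat jar.
    excludedJars in assembly <<= (fullClasspath in assembly) map { cp =>
      cp.filter(_.data.getName == "servlet-api-2.5.jar")
    }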
On Fri, Feb 21, 2014 at 11:11AM, Patrick Wendell wrote:
> Kos - thanks for chiming in. Could you be more specific about what is
> available in maven and not in sbt for these issues? I took a look at
> the bigtop code relating to Spark. As far as I could tell [1] was the
> main point of integration