Evan - this is a good thing to bring up. Wrt the shade plug-in - right now we don't actually use it for bytecode shading - we simply use it for creating the uber jar with excludes (which sbt supports just fine via the assembly plug-in).
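
To make the "uber jar with excludes" point concrete, here is a minimal sketch of what that can look like on the sbt side. It assumes the sbt-assembly plug-in is already enabled in project/plugins.sbt and uses its current slash-syntax keys (builds from early 2014 spelled these settings differently). The hadoop-client coordinates, the version number, and the "hadoop-" jar-name prefix are illustrative placeholders, not what the Spark build actually uses. The last setting covers the log4j.properties case that comes up further down in the thread.

```scala
// build.sbt (fragment): a sketch, assuming the sbt-assembly plug-in is enabled.

// Option 1: mark a dependency as "provided" so it is available at compile time
// but left out of the assembly jar. Coordinates and version are placeholders.
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0" % "provided"

// Option 2: exclude individual jars by file name.
// The "hadoop-" prefix is a hypothetical filter chosen for illustration.
assembly / assemblyExcludedJars := {
  val cp = (assembly / fullClasspath).value
  cp.filter(_.data.getName.startsWith("hadoop-"))
}

// Drop a single file from the merged jar (here log4j.properties) and fall back
// to the previously configured strategy for every other path.
assembly / assemblyMergeStrategy := {
  case "log4j.properties" => MergeStrategy.discard
  case other =>
    val previousStrategy = (assembly / assemblyMergeStrategy).value
    previousStrategy(other)
}
```

With settings along these lines, `sbt assembly` should build the uber jar without the excluded Hadoop jars. Option 1 is usually the less error-prone of the two, since it also keeps the dependency's transitive jars out of the assembly.
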
I was wondering actually, do you know if it's possible to add shaded artifacts to the *spark jar* using this plug-in (i.e. not an uber jar)? That's something I could see being really handy in the future.

- Patrick

On Tue, Feb 25, 2014 at 3:39 PM, Evan Chan <e...@ooyala.com> wrote:
> The problem is that plugins are not equivalent. There is AFAIK no
> equivalent to the maven shader plugin for SBT.
> There is an SBT plugin which can apparently read POM XML files
> (sbt-pom-reader). However, it can't possibly handle plugins, which
> is still problematic.
>
> On Tue, Feb 25, 2014 at 3:31 PM, yao <yaosheng...@gmail.com> wrote:
>> I would prefer to keep both of them, it would be better even if that means
>> pom.xml will be generated using sbt. Some companies, like my current one,
>> have their own build infrastructures built on top of maven. It is not easy
>> to support sbt for these potential spark clients. But I do agree to only
>> keep one if there is a promising way to generate correct configuration from
>> the other.
>>
>> -Shengzhe
>>
>>
>> On Tue, Feb 25, 2014 at 3:20 PM, Evan Chan <e...@ooyala.com> wrote:
>>
>>> The correct way to exclude dependencies in SBT is actually to declare
>>> a dependency as "provided". I'm not familiar with Maven or its
>>> dependencySet, but provided will mark the entire dependency tree as
>>> excluded. It is also possible to exclude jar by jar, but this is
>>> pretty error prone and messy.
>>>
>>> On Tue, Feb 25, 2014 at 2:45 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>> > yes in sbt assembly you can exclude jars (although i never had a need for
>>> > this) and files in jars.
>>> >
>>> > for example i frequently remove log4j.properties, because for whatever
>>> > reason hadoop decided to include it making it very difficult to use our
>>> > own logging config.
>>> >
>>> >
>>> >
>>> > On Tue, Feb 25, 2014 at 4:24 PM, Konstantin Boudnik <c...@apache.org> wrote:
>>> >
>>> >> On Fri, Feb 21, 2014 at 11:11AM, Patrick Wendell wrote:
>>> >> > Kos - thanks for chiming in. Could you be more specific about what is
>>> >> > available in maven and not in sbt for these issues? I took a look at
>>> >> > the bigtop code relating to Spark. As far as I could tell [1] was the
>>> >> > main point of integration with the build system (maybe there are other
>>> >> > integration points)?
>>> >> >
>>> >> > > - in order to integrate Spark well into existing Hadoop stack it was
>>> >> > >   necessary to have a way to avoid transitive dependencies
>>> >> > >   duplications and possible conflicts.
>>> >> > >
>>> >> > >   E.g. Maven assembly allows us to avoid adding _all_ Hadoop libs
>>> >> > >   and later merely declare Spark package dependency on standard
>>> >> > >   Bigtop Hadoop packages. And yes - Bigtop packaging means the
>>> >> > >   naming and layout would be standard across all commercial Hadoop
>>> >> > >   distributions that are worth mentioning: ASF Bigtop convenience
>>> >> > >   binary packages, and Cloudera or Hortonworks packages. Hence, the
>>> >> > >   downstream user doesn't need to spend any effort to make sure
>>> >> > >   that Spark "clicks-in" properly.
>>> >> >
>>> >> > The sbt build also allows you to plug in a Hadoop version similar to
>>> >> > the maven build.
>>> >>
>>> >> I am actually talking about an ability to exclude a set of dependencies
>>> >> from an assembly, similarly to what's happening in dependencySet
>>> >> sections of
>>> >>     assembly/src/main/assembly/assembly.xml
>>> >> If there is a comparable functionality in Sbt, that would help quite a
>>> >> bit, apparently.
>>> >>
>>> >> Cos
>>> >>
>>> >> > > - Maven provides a relatively easy way to deal with the jar-hell
>>> >> > >   problem, although the original maven build was just Shader'ing
>>> >> > >   everything into a huge lump of class files. Oftentimes ending up
>>> >> > >   with classes slamming on top of each other from different
>>> >> > >   transitive dependencies.
>>> >> >
>>> >> > AFAIK we are only using the shade plug-in to deal with conflict
>>> >> > resolution in the assembly jar. These are dealt with in sbt via the
>>> >> > sbt assembly plug-in in an identical way. Is there a difference?
>>> >>
>>> >> I am bringing up the Shader, because it is an awful hack, which can't
>>> >> be used in real controlled deployment.
>>> >>
>>> >> Cos
>>> >>
>>> >> > [1]
>>> >> > https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=blob;f=bigtop-packages/src/common/spark/do-component-build;h=428540e0f6aa56cd7e78eb1c831aa7fe9496a08f;hb=master
>>> >>
>>>
>>>
>>>
>>> --
>>> --
>>> Evan Chan
>>> Staff Engineer
>>> e...@ooyala.com |
>>>
>
>
>
> --
> --
> Evan Chan
> Staff Engineer
> e...@ooyala.com |