On Wed, Feb 26, 2014 at 5:31 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> Evan - this is a good thing to bring up. Wrt the shader plug-in -
> right now we don't actually use it for bytecode shading - we simply
> use it for creating the uber jar with excludes (which sbt supports
> just fine via assembly).
Not really - as I mentioned initially in this thread, sbt's assembly does
not take dependencies into account properly and can overwrite newer
classes with older versions. From an assembly point of view, sbt is not
very good: we have yet to try it after the 2.10 shift, though (and
probably won't, given the mess it created last time).

Regards,
Mridul
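(A minimal sketch, not from the original thread: assuming the sbt-assembly
plugin is on the build classpath, with key names following its 0.14.x API,
this is how the merge behaviour can be pinned down explicitly so that
conflicting classes fail the build instead of one copy silently
overwriting another.)

    // project/plugins.sbt (assumed):
    //   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

    // build.sbt
    assemblyMergeStrategy in assembly := {
      // concatenate duplicated service-loader registrations instead of
      // keeping whichever copy happens to be seen first
      case PathList("META-INF", "services", _*) => MergeStrategy.concat
      // any other entry that appears more than once with differing
      // contents fails the build, so conflicts are resolved deliberately
      case _ => MergeStrategy.deduplicate
    }

(Whether this addresses the exact ordering problem described above would
need to be checked against the sbt-assembly version actually in use.)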
> I was wondering actually, do you know if it's possible to add shaded
> artifacts to the *spark jar* using this plug-in (e.g. not an uber
> jar)? That's something I could see being really handy in the future.
>
> - Patrick
>
> On Tue, Feb 25, 2014 at 3:39 PM, Evan Chan <e...@ooyala.com> wrote:
>> The problem is that the plugins are not equivalent. There is AFAIK no
>> equivalent to the maven shader plugin for SBT.
>> There is an SBT plugin which can apparently read POM XML files
>> (sbt-pom-reader). However, it can't possibly handle plugins, which
>> is still problematic.
>>
>> On Tue, Feb 25, 2014 at 3:31 PM, yao <yaosheng...@gmail.com> wrote:
>>> I would prefer to keep both of them; it would be better even if that
>>> means pom.xml will be generated using sbt. Some companies, like my
>>> current one, have their own build infrastructure built on top of maven.
>>> It is not easy to support sbt for these potential spark clients. But I
>>> do agree to keep only one if there is a promising way to generate a
>>> correct configuration from the other.
>>>
>>> -Shengzhe
>>>
>>> On Tue, Feb 25, 2014 at 3:20 PM, Evan Chan <e...@ooyala.com> wrote:
>>>> The correct way to exclude dependencies in SBT is actually to declare
>>>> a dependency as "provided". I'm not familiar with Maven or its
>>>> dependencySet, but provided will mark the entire dependency tree as
>>>> excluded. It is also possible to exclude jar by jar, but this is
>>>> pretty error-prone and messy.
>>>>
>>>> On Tue, Feb 25, 2014 at 2:45 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>> Yes, in sbt assembly you can exclude jars (although I never had a
>>>>> need for this) and files in jars.
>>>>>
>>>>> For example, I frequently remove log4j.properties, because for
>>>>> whatever reason hadoop decided to include it, making it very
>>>>> difficult to use our own logging config.
>>>>>
>>>>> On Tue, Feb 25, 2014 at 4:24 PM, Konstantin Boudnik <c...@apache.org> wrote:
>>>>>> On Fri, Feb 21, 2014 at 11:11AM, Patrick Wendell wrote:
>>>>>>> Kos - thanks for chiming in. Could you be more specific about what
>>>>>>> is available in maven and not in sbt for these issues? I took a
>>>>>>> look at the bigtop code relating to Spark. As far as I could tell
>>>>>>> [1] was the main point of integration with the build system (maybe
>>>>>>> there are other integration points)?
>>>>>>>
>>>>>>>> - In order to integrate Spark well into the existing Hadoop stack
>>>>>>>>   it was necessary to have a way to avoid duplication of
>>>>>>>>   transitive dependencies and possible conflicts.
>>>>>>>>
>>>>>>>>   E.g. the Maven assembly allows us to avoid adding _all_ Hadoop
>>>>>>>>   libs and later merely declare the Spark package's dependency on
>>>>>>>>   standard Bigtop Hadoop packages. And yes - Bigtop packaging
>>>>>>>>   means the naming and layout would be standard across all
>>>>>>>>   commercial Hadoop distributions that are worth mentioning: ASF
>>>>>>>>   Bigtop convenience binary packages, and Cloudera or Hortonworks
>>>>>>>>   packages. Hence, the downstream user doesn't need to spend any
>>>>>>>>   effort to make sure that Spark "clicks in" properly.
>>>>>>>
>>>>>>> The sbt build also allows you to plug in a Hadoop version similar
>>>>>>> to the maven build.
>>>>>>
>>>>>> I am actually talking about the ability to exclude a set of
>>>>>> dependencies from an assembly, similarly to what's happening in the
>>>>>> dependencySet sections of assembly/src/main/assembly/assembly.xml.
>>>>>> If there is comparable functionality in sbt, that would help quite
>>>>>> a bit, apparently.
>>>>>>
>>>>>> Cos
>>>>>>
>>>>>>>> - Maven provides a relatively easy way to deal with the jar-hell
>>>>>>>>   problem, although the original maven build was just Shader'ing
>>>>>>>>   everything into a huge lump of class files, oftentimes ending up
>>>>>>>>   with classes slamming on top of each other from different
>>>>>>>>   transitive dependencies.
>>>>>>>
>>>>>>> AFAIK we are only using the shade plug-in to deal with conflict
>>>>>>> resolution in the assembly jar. These are dealt with in sbt via the
>>>>>>> sbt assembly plug-in in an identical way. Is there a difference?
>>>>>>
>>>>>> I am bringing up the Shader because it is an awful hack that can't
>>>>>> be used in a real, controlled deployment.
>>>>>>
>>>>>> Cos
>>>>>>
>>>>>>> [1] https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=blob;f=bigtop-packages/src/common/spark/do-component-build;h=428540e0f6aa56cd7e78eb1c831aa7fe9496a08f;hb=master
>>>>
>>>> --
>>>> --
>>>> Evan Chan
>>>> Staff Engineer
>>>> e...@ooyala.com |
>>
>> --
>> --
>> Evan Chan
>> Staff Engineer
>> e...@ooyala.com |
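(On the "provided" and excluded-files points in the quoted thread, a small
sketch, again assuming sbt-assembly and using a placeholder hadoop-client
version: the "provided" scope keeps the whole Hadoop dependency tree out
of the assembly, similar in spirit to the dependencySet excludes Cos
mentions, and a discard rule drops the log4j.properties that hadoop
bundles, as Koert describes.)

    // build.sbt -- the hadoop-client version below is only a placeholder
    libraryDependencies +=
      "org.apache.hadoop" % "hadoop-client" % "2.2.0" % "provided"

    // drop log4j.properties from the fat jar entirely, so a logging
    // config supplied on the runtime classpath is picked up instead
    assemblyMergeStrategy in assembly := {
      case "log4j.properties" => MergeStrategy.discard
      case _ => MergeStrategy.deduplicate
    }

(As for shading classes into the main spark jar rather than the uber jar:
class relocation is a Maven shade plugin feature; later releases of
sbt-assembly added comparable shade rules, though to my knowledge they
apply only while building the assembly artifact.)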