We use the jarjar Ant task to assemble everything into one fat jar.

Qiuzhuang

On Wed, Feb 26, 2014 at 11:26 AM, Evan chan <e...@ooyala.com> wrote:

> Actually you can control exactly how sbt assembly merges or resolves
> conflicts. I believe the default settings however lead to an order which
> cannot be controlled.
>
> I do wish for a smarter fat jar plugin.
>
> -Evan
> To be free is not merely to cast off one's chains, but to live in a way
> that respects & enhances the freedom of others. (#NelsonMandela)
>
> > On Feb 25, 2014, at 6:50 PM, Mridul Muralidharan <mri...@gmail.com> wrote:
> >
> >> On Wed, Feb 26, 2014 at 5:31 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> >> Evan - this is a good thing to bring up. Wrt the shader plug-in -
> >> right now we don't actually use it for bytecode shading - we simply
> >> use it for creating the uber jar with excludes (which sbt supports
> >> just fine via assembly).
> >
> > Not really - as I mentioned initially in this thread, sbt's assembly
> > does not take dependencies into account properly, and can overwrite
> > newer classes with older versions.
> > From an assembly point of view, sbt is not very good: we are yet to
> > try it after 2.10 shift though (and probably won't, given the mess it
> > created last time).
> >
> > Regards,
> > Mridul
> >
> >>
> >> I was wondering actually, do you know if it's possible to add shaded
> >> artifacts to the *spark jar* using this plug-in (e.g. not an uber
> >> jar)? That's something I could see being really handy in the future.
> >>
> >> - Patrick
> >>
> >>> On Tue, Feb 25, 2014 at 3:39 PM, Evan Chan <e...@ooyala.com> wrote:
> >>> The problem is that plugins are not equivalent. There is AFAIK no
> >>> equivalent to the maven shader plugin for SBT.
> >>> There is an SBT plugin which can apparently read POM XML files
> >>> (sbt-pom-reader). However, it can't possibly handle plugins, which
> >>> is still problematic.
> >>>
> >>>> On Tue, Feb 25, 2014 at 3:31 PM, yao <yaosheng...@gmail.com> wrote:
> >>>> I would prefer to keep both of them, it would be better even if that means
> >>>> pom.xml will be generated using sbt. Some companies, like my current one,
> >>>> have their own build infrastructures built on top of maven. It is not easy
> >>>> to support sbt for these potential spark clients. But I do agree to only
> >>>> keep one if there is a promising way to generate correct configuration from
> >>>> the other.
> >>>>
> >>>> -Shengzhe
> >>>>
> >>>>> On Tue, Feb 25, 2014 at 3:20 PM, Evan Chan <e...@ooyala.com> wrote:
> >>>>>
> >>>>> The correct way to exclude dependencies in SBT is actually to declare
> >>>>> a dependency as "provided". I'm not familiar with Maven or its
> >>>>> dependencySet, but provided will mark the entire dependency tree as
> >>>>> excluded. It is also possible to exclude jar by jar, but this is
> >>>>> pretty error-prone and messy.
> >>>>>
> >>>>>> On Tue, Feb 25, 2014 at 2:45 PM, Koert Kuipers <ko...@tresata.com> wrote:
> >>>>>> yes in sbt assembly you can exclude jars (although i never had a need for
> >>>>>> this) and files in jars.
> >>>>>>
> >>>>>> for example i frequently remove log4j.properties, because for whatever
> >>>>>> reason hadoop decided to include it making it very difficult to use our own
> >>>>>> logging config.
> >>>>>>
> >>>>>>> On Tue, Feb 25, 2014 at 4:24 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>>>>
> >>>>>>>> On Fri, Feb 21, 2014 at 11:11AM, Patrick Wendell wrote:
> >>>>>>>> Kos - thanks for chiming in. Could you be more specific about what is
> >>>>>>>> available in maven and not in sbt for these issues? I took a look at
> >>>>>>>> the bigtop code relating to Spark. As far as I could tell [1] was the
> >>>>>>>> main point of integration with the build system (maybe there are other
> >>>>>>>> integration points)?
> >>>>>>>>
> >>>>>>>>> - in order to integrate Spark well into existing Hadoop stack it was
> >>>>>>>>> necessary to have a way to avoid transitive dependencies duplications and
> >>>>>>>>> possible conflicts.
> >>>>>>>>>
> >>>>>>>>> E.g. Maven assembly allows us to avoid adding _all_ Hadoop libs and later
> >>>>>>>>> merely declare Spark package dependency on standard Bigtop Hadoop
> >>>>>>>>> packages. And yes - Bigtop packaging means the naming and layout would be
> >>>>>>>>> standard across all commercial Hadoop distributions that are worth
> >>>>>>>>> mentioning: ASF Bigtop convenience binary packages, and Cloudera or
> >>>>>>>>> Hortonworks packages. Hence, the downstream user doesn't need to spend any
> >>>>>>>>> effort to make sure that Spark "clicks-in" properly.
> >>>>>>>>
> >>>>>>>> The sbt build also allows you to plug in a Hadoop version similar to
> >>>>>>>> the maven build.
> >>>>>>>
> >>>>>>> I am actually talking about an ability to exclude a set of dependencies from an
> >>>>>>> assembly, similarly to what's happening in dependencySet sections of
> >>>>>>> assembly/src/main/assembly/assembly.xml
> >>>>>>> If there is a comparable functionality in Sbt, that would help quite a bit,
> >>>>>>> apparently.
> >>>>>>>
> >>>>>>> Cos
> >>>>>>>
> >>>>>>>>> - Maven provides a relatively easy way to deal with the jar-hell problem,
> >>>>>>>>> although the original maven build was just Shader'ing everything into a
> >>>>>>>>> huge lump of class files. Oftentimes ending up with classes slamming on
> >>>>>>>>> top of each other from different transitive dependencies.
> >>>>>>>>
> >>>>>>>> AFAIK we are only using the shade plug-in to deal with conflict
> >>>>>>>> resolution in the assembly jar. These are dealt with in sbt via the
> >>>>>>>> sbt assembly plug-in in an identical way. Is there a difference?
> >>>>>>>
> >>>>>>> I am bringing up the Shader, because it is an awful hack, which can't be
> >>>>>>> used in real controlled deployment.
> >>>>>>>
> >>>>>>> Cos
> >>>>>>>
> >>>>>>>> [1] https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=blob;f=bigtop-packages/src/common/spark/do-component-build;h=428540e0f6aa56cd7e78eb1c831aa7fe9496a08f;hb=master
> >>>>>
> >>>>> --
> >>>>> --
> >>>>> Evan Chan
> >>>>> Staff Engineer
> >>>>> e...@ooyala.com |
> >>>
> >>> --
> >>> --
> >>> Evan Chan
> >>> Staff Engineer
> >>> e...@ooyala.com |
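
For reference, the sbt options mentioned in the quoted thread (declaring a dependency as "provided", dropping log4j.properties, and controlling how conflicts are merged) might look roughly like the sketch below. It assumes the sbt-assembly plugin is already enabled in project/plugins.sbt and uses the key names of recent sbt-assembly releases, which differ from those available in early 2014. The hadoop-client coordinates and version are illustrative assumptions, not taken from the Spark build.

```scala
// build.sbt: a sketch only, assuming the sbt-assembly plugin is on the build classpath.

// Assumed example dependency. Per the thread, marking it "provided" keeps it
// (and the dependency tree it pulls in) out of the assembled fat jar.
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0" % "provided"

assembly / assemblyMergeStrategy := {
  // Leave every root-level log4j.properties out of the assembled jar.
  case "log4j.properties" => MergeStrategy.discard
  // For all other paths, defer to the plugin's default merge strategy.
  case other =>
    val defaultStrategy = (assembly / assemblyMergeStrategy).value
    defaultStrategy(other)
}
```

Explicit cases such as MergeStrategy.first or MergeStrategy.last can be added for specific paths when the default resolution is not the one wanted.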