Hm, are you suggesting that the Spark distribution be a bag of 100 JARs? That doesn't quite seem reasonable. It doesn't remove version conflicts; it just pushes them to run time, which isn't good. The assembly is also necessary because that's where shading happens. In development, you want to run against exactly what will be used in a real Spark distro.
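For context on the shading point: the assembly build is where conflicting third-party packages get relocated into a private namespace, so the copy of a library bundled with Spark can't clash with the one a user application brings. A minimal sketch of what that looks like with sbt-assembly's shade rules (the Guava rule and package names here are illustrative, not Spark's actual build configuration):

    // build.sbt -- illustrative only, not Spark's real shading config.
    // Renames Guava classes bundled into the assembly to a private package,
    // so a user application can depend on its own Guava version.
    assemblyShadeRules in assembly := Seq(
      ShadeRule.rename("com.google.common.**" -> "org.spark.shaded.guava.@1").inAll
    )

Without this step, whichever version of a library appears first on the classpath wins at run time, and with a plain directory of jars that kind of conflict only surfaces when the job actually runs.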
On Tue, Sep 2, 2014 at 9:39 AM, scwf <[email protected]> wrote:
> hi, all
> I suggest Spark not use the assembly jar as the default run-time
> dependency (spark-submit/spark-class depend on the assembly jar); a
> library directory of all third-party dependency jars, as Hadoop/Hive/HBase
> use, would be more reasonable.
>
> 1. The assembly jar packages all third-party jars into one big jar, so we
> need to rebuild it whenever we want to update the version of some
> component (such as Hadoop).
> 2. In our practice with Spark, we sometimes hit jar compatibility issues,
> and such issues are hard to diagnose with an assembly jar.
