Hi Sparkers, I have a Spark Streaming application that is a Maven project, and I would like to build it into an uber jar and run it on the cluster. I have found two options for building the uber jar, but each has its shortcomings, so I would like to ask how you guys do it. Thanks.
1. Use the maven-shade-plugin and mark the Spark dependencies as provided in the pom.xml, like:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>

With this the uber jar builds, but when I run the application locally it complains that the Spark classes are missing. That is not surprising, because they are marked as provided and therefore are not on the runtime classpath.

2. Instead of marking the Spark dependencies as provided, I configure the maven-shade-plugin to exclude them as follows, but there is still a lot of other stuff left in the jar:

    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <artifactSet>
                    <excludes>
                        <exclude>junit:junit</exclude>
                        <exclude>log4j:log4j</exclude>
                        <exclude>org.scala-lang:scala-library</exclude>
                        <exclude>org.apache.spark:spark-core_2.10</exclude>
                        <exclude>org.apache.spark:spark-sql_2.10</exclude>
                        <exclude>org.apache.spark:spark-streaming_2.10</exclude>
                    </excludes>
                </artifactSet>
            </configuration>
        </execution>
    </executions>

Has anyone built an uber jar for a Spark application? I would like to see how you do it, thanks!

bit1...@163.com
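
P.S. One thing I am thinking of trying for the local-run problem in option 1 is to drive the scope from a property, keeping provided as the default for the cluster build and switching it to compile through a profile for local runs. This is only a sketch and I have not confirmed it is the recommended way; the property name spark.scope and the profile id local-run are just my own naming:

    <properties>
        <!-- default scope for the cluster build: Spark jars are supplied by the cluster -->
        <spark.scope>provided</spark.scope>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.version}</version>
            <scope>${spark.scope}</scope>
        </dependency>
        <!-- same for spark-sql_2.10 and spark-streaming_2.10 -->
    </dependencies>

    <profiles>
        <profile>
            <!-- activate with: mvn package -Plocal-run -->
            <id>local-run</id>
            <properties>
                <!-- put the Spark classes back on the classpath for local runs -->
                <spark.scope>compile</spark.scope>
            </properties>
        </profile>
    </profiles>

The idea is that a plain mvn package still produces the slim uber jar for spark-submit, while mvn package -Plocal-run (or enabling the profile in the IDE) keeps the Spark classes available when running locally. Does that sound reasonable, or is there a better way?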