Hi Spark users,

I have a Spark Streaming application that is a Maven project, and I would like to 
build it into an uber jar and run it on the cluster.
I have found two options for building the uber jar, but each of them has its 
shortcomings, so I would like to ask how you guys do it.
Thanks.

1. Use the maven-shade-plugin, and mark the Spark-related dependencies as 
provided in the pom.xml, like:

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.version}</version>
            <scope>provided</scope>
        </dependency>

With this it can build the uber jar, but when I run the application locally it 
complains that the Spark classes are missing, which is not surprising because 
the Spark dependencies are marked as provided and therefore are not on the 
runtime classpath.
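
One idea I have for this (just a sketch, the property name and profile id are 
my own invention, and I am not sure it is the recommended way) is to keep the 
scope in a property and override it with a profile, so that Spark goes back on 
the runtime classpath when running locally:

        <properties>
            <spark.scope>provided</spark.scope>
        </properties>

        <profiles>
            <!-- Activate with "mvn -Plocal ..." for local runs; the Spark
                 classes are then included on the runtime classpath. -->
            <profile>
                <id>local</id>
                <properties>
                    <spark.scope>compile</spark.scope>
                </properties>
            </profile>
        </profiles>

and then reference the property in the dependency:

        <scope>${spark.scope}</scope>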

2. Instead of marking the Spark dependencies as provided, I configure the 
maven-shade-plugin to exclude the Spark artifacts as follows, but many other 
things still end up in the jar (my guess about why is below the snippet).

        <executions>
            <execution>
                <phase>package</phase>
                <goals>
                    <goal>shade</goal>
                </goals>
                <configuration>
                    <artifactSet>
                        <excludes>
                            <exclude>junit:junit</exclude>
                            <exclude>log4j:log4j:jar:</exclude>
                            <exclude>org.scala-lang:scala-library:jar:</exclude>
                            <exclude>org.apache.spark:spark-core_2.10</exclude>
                            <exclude>org.apache.spark:spark-sql_2.10</exclude>
                            <exclude>org.apache.spark:spark-streaming_2.10</exclude>
                        </excludes>
                    </artifactSet>
                </configuration>
            </execution>
        </executions>
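
My guess (not verified) is that excluding spark-core and friends by name does 
not exclude their transitive dependencies (Hadoop, Akka, etc.), which is why so 
many things still end up in the jar. Perhaps wildcard excludes like the 
following would help, for example:

        <artifactSet>
            <excludes>
                <exclude>junit:junit</exclude>
                <exclude>org.scala-lang:*</exclude>
                <exclude>org.apache.spark:*</exclude>
                <exclude>org.apache.hadoop:*</exclude>
            </excludes>
        </artifactSet>

but even then it seems every transitive group would have to be listed, which 
is why I am asking what the usual approach is.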


Has anyone built an uber jar for a Spark application? I would like to see 
how you do it, thanks!

bit1...@163.com
