We solved this by modifying the spark-class script. At the bottom, before the
exec statement, we intercepted the command that was constructed and injected
our additional classpath:
for ((i = 0; i < ${#CMD[@]}; i++)); do
  if [[ ${CMD[$i]} == *"$SPARK_ASSEMBLY_JAR"* ]]; then
    CMD[$i]="${CMD[$i]}:/usr/lib/hadoop/*.jar:/usr/share/aws/aws-java-sdk/aws-java-sdk-cloudwatch-1.10.4.jar:/usr/share/aws/emr/emrfs/lib/*"
  fi
done
exec "${CMD[@]}"
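For driver and executor JVMs, the extraClassPath approach suggested below can
be set in spark-defaults.conf instead; a rough sketch, reusing the same EMR
paths as above (exact paths and jar versions may differ per EMR release):

```
spark.driver.extraClassPath   /usr/lib/hadoop/*:/usr/share/aws/aws-java-sdk/aws-java-sdk-cloudwatch-1.10.4.jar:/usr/share/aws/emr/emrfs/lib/*
spark.executor.extraClassPath /usr/lib/hadoop/*:/usr/share/aws/aws-java-sdk/aws-java-sdk-cloudwatch-1.10.4.jar:/usr/share/aws/emr/emrfs/lib/*
```

Note these options only affect driver and executor processes, not the
standalone Master daemon itself, which is why we patched spark-class above.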
M
> On Nov 18, 2015, at 1:19 AM, "[email protected]" <[email protected]>
> wrote:
>
> Have you tried using
> spark.driver.extraClassPath
> and
> spark.executor.extraClassPath
>
> ?
>
> AFAICT these config options replace SPARK_CLASSPATH; further info is in the
> docs. I've had good luck with these options, and for ease of use I just set
> them in spark-defaults.conf.
>
> https://spark.apache.org/docs/latest/configuration.html
>
>> On Tue, 17 Nov 2015 at 21:06 Michal Klos <[email protected]> wrote:
>> Hi,
>>
>> We are running a Spark Standalone cluster on EMR (note: not using YARN) and
>> are trying to use S3 w/ EmrFS as our event logging directory.
>>
>> We are having difficulties with a ClassNotFoundException on EmrFileSystem
>> when we navigate to the event log screen. This is to be expected, as the
>> EmrFS jars are not on the classpath.
>>
>> But -- I have not been able to figure out a way to add additional classpath
>> jars to the start-up of the Master daemon. SPARK_CLASSPATH has been
>> deprecated, and looking around at spark-class, etc.. everything seems to be
>> pretty locked down.
>>
>> Do I have to shove everything into the assembly jar?
>>
>> Am I missing a simple way to add classpath to the daemons?
>>
>> thanks,
>> Michal