Just following up on this issue. When I ran the application on a YARN
cluster (AWS EMR), I was able to use the AWS SDK without issue, even
without the 'spark.files.userClassPathFirst' flag set. I also learned
that the 'child-first' classloader setup was reworked in the recently
released Spark 1.3.0. The relevant configuration flag was renamed to
'spark.executor.userClassPathFirst', and it works the way I expected
(unlike in v1.2.0).
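
For anyone who finds this thread later, here is a rough sketch of how the
renamed flag might be passed on the command line with Spark 1.3.0. The
class name, application jar, and HttpClient jar path are placeholders
rather than the ones from my actual job, and the master URL is assumed to
come from your defaults, so adjust all of them to your own setup.

```shell
# Sketch only: com.example.DynamoDbApp, my-app.jar and /path/to/... are
# hypothetical placeholders, not values from the original job.
# spark.executor.userClassPathFirst asks executors to prefer classes from
# user-supplied jars over the versions bundled in the Spark assembly.
spark-submit \
  --class com.example.DynamoDbApp \
  --conf spark.executor.userClassPathFirst=true \
  --jars /path/to/httpclient-4.3.4.jar \
  my-app.jar
```

As far as I can tell the flag is still described as experimental in the
1.3.0 configuration docs, so treat this as a starting point rather than a
guaranteed fix.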



On Thu, Mar 12, 2015 at 2:50 PM, Adam Lewandowski <
adam.lewandow...@gmail.com> wrote:

> I'm trying to use the AWS SDK (v1.9.23) to connect to DynamoDB from within
> a Spark application. Spark 1.2.1 is assembled with HttpClient 4.2.6, but
> the AWS SDK depends on HttpClient 4.3.4 for its communication with
> DynamoDB. The end result is an error when the app tries to connect to
> DynamoDB and gets Spark's version instead:
> java.lang.NoClassDefFoundError: org/apache/http/client/methods/HttpPatch
> at com.amazonaws.http.AmazonHttpClient.<clinit>(AmazonHttpClient.java:129)
> at
> com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:120)
> at
> com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.<init>(AmazonDynamoDBClient.java:359)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.http.client.methods.HttpPatch
>
> Including HttpClient 4.3.4 as a user jar doesn't improve the situation
> much:
> java.lang.NoSuchMethodError:
> org.apache.http.params.HttpConnectionParams.setSoKeepalive(Lorg/apache/http/params/HttpParams;Z)V
> at
> com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:95)
>
> I've seen the documentation regarding the 'spark.files.userClassPathFirst'
> flag and have tried to use it thinking it would resolve this issue.
> However, when that flag is used I get a NoClassDefFoundError on
> 'scala.Serializable':
> java.lang.NoClassDefFoundError: scala/Serializable
> ...
> at
> org.apache.spark.executor.ChildExecutorURLClassLoader$userClassLoader$.findClass(ExecutorURLClassLoader.scala:46)
> ...
> Caused by: java.lang.ClassNotFoundException: scala.Serializable
>
> This seems odd to me, since scala.Serializable is included in the Spark
> assembly. I thought perhaps my app was compiled against a different Scala
> version than Spark uses, but eliminated that possibility by using the Scala
> compiler directly out of the Spark assembly jar with identical results.
>
> Has anyone else seen this issue, had any success with the
> "spark.files.userClassPathFirst" flag, or been able to use the AWS SDK?
> I was going to submit this as a Spark JIRA issue, but thought I would check
> here first.
>
> Thanks,
> Adam Lewandowski
>
>
>
