Just following up on this issue. When I ran the application on a YARN cluster (on AWS EMR), I was able to use the AWS SDK without issue, even without the 'spark.files.userClassPathFirst' flag set. I also learned that the entire 'child-first' classloader setup was changed in Spark 1.3.0 (released recently). The relevant configuration flag was renamed to 'spark.executor.userClassPathFirst', and it works the way I expected (unlike in v1.2.0).
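For anyone chasing a similar conflict, here is a minimal sketch of one way to check which jar a class is actually being loaded from. It is not part of the original report: the class name ClassOrigin and the method describe are hypothetical names chosen for illustration, and the two HttpClient class names are simply the ones that appear in the stack traces in the original message below. It uses only the standard Java API, and I'm assuming it is run with the same classpath as the failing code (for example, called from inside a task on an executor).

```java
import java.security.CodeSource;

// Hypothetical helper (illustration only): reports where a class is loaded from.
public class ClassOrigin {

    /** Returns "<class> -> <location>", or a note if the class or its source cannot be found. */
    public static String describe(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class<?> cls = Class.forName(
                    className, false, Thread.currentThread().getContextClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // JDK classes loaded by the bootstrap loader have no code source.
                return className + " -> (bootstrap or unknown source)";
            }
            return className + " -> " + src.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not found (" + e.getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack traces in the original message.
        System.out.println(describe("org.apache.http.client.methods.HttpPatch"));
        System.out.println(describe("org.apache.http.params.HttpConnectionParams"));
    }
}
```

If the reported location is the Spark assembly jar rather than the user jar, the parent classloader is presumably winning, which would be consistent with the errors described below.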
On Thu, Mar 12, 2015 at 2:50 PM, Adam Lewandowski <adam.lewandow...@gmail.com> wrote:

> I'm trying to use the AWS SDK (v1.9.23) to connect to DynamoDB from within
> a Spark application. Spark 1.2.1 is assembled with HttpClient 4.2.6, but
> the AWS SDK depends on HttpClient 4.3.4 for its communication with
> DynamoDB. The end result is an error when the app tries to connect to
> DynamoDB and gets Spark's version instead:
>
> java.lang.NoClassDefFoundError: org/apache/http/client/methods/HttpPatch
>   at com.amazonaws.http.AmazonHttpClient.<clinit>(AmazonHttpClient.java:129)
>   at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:120)
>   at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.<init>(AmazonDynamoDBClient.java:359)
> Caused by: java.lang.ClassNotFoundException: org.apache.http.client.methods.HttpPatch
>
> Including HttpClient 4.3.4 as a user jar doesn't improve the situation
> much:
>
> java.lang.NoSuchMethodError: org.apache.http.params.HttpConnectionParams.setSoKeepalive(Lorg/apache/http/params/HttpParams;Z)V
>   at com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:95)
>
> I've seen the documentation regarding the 'spark.files.userClassPathFirst'
> flag and have tried to use it, thinking it would resolve this issue.
> However, when that flag is used I get a NoClassDefFoundError on
> 'scala.Serializable':
>
> java.lang.NoClassDefFoundError: scala/Serializable
>   ...
>   at org.apache.spark.executor.ChildExecutorURLClassLoader$userClassLoader$.findClass(ExecutorURLClassLoader.scala:46)
>   ...
> Caused by: java.lang.ClassNotFoundException: scala.Serializable
>
> This seems odd to me, since scala.Serializable is included in the Spark
> assembly. I thought perhaps my app was compiled against a different Scala
> version than Spark uses, but I eliminated that possibility by using the
> Scala compiler directly out of the Spark assembly jar, with identical
> results.
>
> Has anyone else seen this issue, had any success with the
> 'spark.files.userClassPathFirst' flag, or been able to use the AWS SDK?
> I was going to submit this as a Spark JIRA issue, but thought I would check
> here first.
>
> Thanks,
> Adam Lewandowski