(For the benefit of other users)

The workaround appears to be building Spark against the exact Hadoop
version, then building the app with Spark as a provided dependency and
without hadoop-client as a direct dependency of the app. With that,
HDFS access works just fine.
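
A rough sketch of what that looks like on the app side, assuming Maven
(the spark-core coordinates are taken from the pom further down in this
thread; the -Dhadoop.version flag is how Spark itself can be built
against a specific Hadoop release, e.g. something like
"mvn -Dhadoop.version=0.20.2-cdh3u5 -DskipTests clean package"):

    <dependencies>
        <!-- Spark is marked provided so the uber jar doesn't bundle
             Spark or a mismatched hadoop-client -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.0.1</version>
            <scope>provided</scope>
        </dependency>
        <!-- Note: no direct hadoop-client dependency here; the Spark
             build above supplies the matching CDH3u5 HDFS client at
             runtime -->
    </dependencies>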


On Fri, Jul 25, 2014 at 11:50 PM, Bharath Ravi Kumar <reachb...@gmail.com>
wrote:

> That's right, I'm looking to depend on Spark in general and change only
> the Hadoop client deps. The Spark master and slaves use the
> spark-1.0.1-bin-hadoop1 binaries from the downloads page. The relevant
> snippet from the app's Maven pom is as follows:
>
>         <dependency>
>             <groupId>org.apache.spark</groupId>
>             <artifactId>spark-core_2.10</artifactId>
>             <version>1.0.1</version>
>             <scope>provided</scope>
>         </dependency>
>         <dependency>
>             <groupId>org.apache.hadoop</groupId>
>             <artifactId>hadoop-client</artifactId>
>             <version>0.20.2-cdh3u5</version>
>             <type>jar</type>
>         </dependency>
>     </dependencies>
>
>     <repositories>
>         <repository>
>             <id>Cloudera repository</id>
>             <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
>         </repository>
>         <repository>
>             <id>Akka repository</id>
>             <url>http://repo.akka.io/releases</url>
>         </repository>
>     </repositories>
>
>
> Thanks,
> Bharath
>
>
> On Fri, Jul 25, 2014 at 10:29 PM, Sean Owen <so...@cloudera.com> wrote:
>
>> If you link against the pre-built binary, that's for Hadoop 1.0.4. Can
>> you show your deps to clarify what you are depending on? Building
>> custom Spark and depending on it is a different thing from depending
>> on plain Spark and changing its deps. I think you want the latter.
>>
>> On Fri, Jul 25, 2014 at 5:46 PM, Bharath Ravi Kumar <reachb...@gmail.com>
>> wrote:
>> > Thanks for responding. I used the pre-built Spark binaries meant for
>> > Hadoop 1/CDH3u5. I do not intend to build Spark against a specific
>> > distribution. Irrespective of whether I build my app with the explicit
>> > CDH Hadoop client dependency, I get the same error message. I also
>> > verified that my app's uber jar had pulled in the CDH Hadoop client
>> > dependencies.
>> >
>> > On 25-Jul-2014 9:26 pm, "Sean Owen" <so...@cloudera.com> wrote:
>> >>
>> >> This indicates your app is not actually using the version of the HDFS
>> >> client you think. You built Spark from source with the right deps it
>> >> seems, but are you sure you linked to your build in your app?
>> >>
>> >> On Fri, Jul 25, 2014 at 4:32 PM, Bharath Ravi Kumar <reachb...@gmail.com>
>> >> wrote:
>> >> > Any suggestions to work around this issue? The pre-built Spark
>> >> > binaries don't appear to work against CDH as documented, unless
>> >> > there's a build issue, which seems unlikely.
>> >> >
>> >> > On 25-Jul-2014 3:42 pm, "Bharath Ravi Kumar" <reachb...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >>
>> >> >> I'm encountering a Hadoop client protocol mismatch trying to read
>> >> >> from HDFS (cdh3u5) using the pre-built Spark from the downloads page
>> >> >> (linked under "For Hadoop 1 (HDP1, CDH3)"). I've also followed the
>> >> >> instructions at
>> >> >> http://spark.apache.org/docs/latest/hadoop-third-party-distributions.html
>> >> >> (i.e. building the app against hadoop-client 0.20.2-cdh3u5), but
>> >> >> continue to see the following error regardless of whether I link the
>> >> >> app with the CDH client:
>> >> >>
>> >> >> 14/07/25 09:53:43 INFO client.AppClient$ClientActor: Executor updated: app-20140725095343-0016/1 is now RUNNING
>> >> >> 14/07/25 09:53:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> >> >> 14/07/25 09:53:43 WARN snappy.LoadSnappy: Snappy native library not loaded
>> >> >> Exception in thread "main" org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. (client = 61, server = 63)
>> >> >>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:401)
>> >> >>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>> >> >>
>> >> >>
>> >> >> While I can build Spark against the exact Hadoop distro version, I'd
>> >> >> rather work with the standard pre-built binaries, making additional
>> >> >> changes while building the app if necessary. Any
>> >> >> workarounds/recommendations?
>> >> >>
>> >> >> Thanks,
>> >> >> Bharath
>>
>
>
