Hi all,
I have built Shark-0.9.1 with sbt using the command below:
*SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.6.0 sbt/sbt assembly*
My Hadoop cluster is also running version 2.0.0-mr1-cdh4.6.0.
But when I run the commands below from the Spark shell, which read a
file from HDFS, I get an "IPC version mismatch" error (IPC version 7 on the
server versus IPC version 4 on the client) from the
org.apache.hadoop.hdfs.DFSClient class.
scala> val s = sc.textFile("hdfs://host:port/test.txt")
scala> s.count()
14/06/10 23:42:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/06/10 23:42:59 WARN snappy.LoadSnappy: Snappy native library not loaded
org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy9.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:176)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
Apparently this error is caused by a version mismatch of the hadoop-hdfs jar
between the client (the one referenced by Spark) and the server (the Hadoop
cluster). What I don't understand is why there is a mismatch, since I built
Spark with the correct Hadoop version.
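One way to narrow this down might be to check which jar the shell actually loads the Hadoop client classes from. The snippet below is only a sketch: jarOf is a hypothetical helper of my own (it is not part of Spark or Hadoop), and the class names are simply the ones that appear in the stack trace above.

```scala
import scala.util.Try

// Hypothetical helper (not part of Spark or Hadoop): returns the jar or
// directory a class was loaded from, or None if the class cannot be found
// or its code source is unknown (e.g. JDK bootstrap classes).
def jarOf(className: String): Option[String] =
  Try(Class.forName(className)).toOption
    .flatMap(c => Option(c.getProtectionDomain.getCodeSource))
    .flatMap(cs => Option(cs.getLocation))
    .map(_.toString)

// Class names taken from the stack trace above.
Seq("org.apache.hadoop.hdfs.DFSClient", "org.apache.hadoop.ipc.Client")
  .foreach { name =>
    println(s"$name -> ${jarOf(name).getOrElse("not found or unknown source")}")
  }
```

Pasted into the Spark shell, this should print the location each class was loaded from. If the path points at a jar other than the 2.0.0-mr1-cdh4.6.0 one, that would suggest an older Hadoop client is ending up on the classpath. Calling org.apache.hadoop.util.VersionInfo.getVersion from the same shell should also report the client-side Hadoop version.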
Any suggestions would be highly appreciated.
Thanks
Bijoy