@qingyang, Spark 0.9.0 works perfectly for me when reading and writing data on HDFS. BTW, if you look at pom.xml, you have to choose the yarn profile when compiling Spark so that protobuf 2.4.1 does not end up in your final jars. Here is the command line we use to compile Spark with Hadoop 2.2:
mvn -U -Dyarn.version=2.2.0 -Dhadoop.version=2.2.0 -Pyarn -DskipTests package

Thanks,
Shengzhe

On Wed, Mar 26, 2014 at 12:04 AM, qingyang li <liqingyang1...@gmail.com> wrote:

> Egor, I have run into the same problem you asked about in this thread:
>
> http://mail-archives.apache.org/mod_mbox/spark-user/201402.mbox/%3CCAMrx5DwJVJS0g_FE7_2qwMu4Xf0y5VfV=tlyauv2kh5v4k6...@mail.gmail.com%3E
>
> Have you fixed this problem?
>
> I am using Shark to read a table which I created on HDFS.
>
> I found two protobuf*.jar files in the Shark lib_managed directory:
>
> [root@bigdata001 shark-0.9.0]# find . -name "proto*.jar"
> ./lib_managed/jars/org.spark-project.protobuf/protobuf-java/protobuf-java-2.4.1-shaded.jar
> ./lib_managed/bundles/com.google.protobuf/protobuf-java/protobuf-java-2.5.0.jar
>
> My Hadoop is using protobuf-java-2.5.0.jar.
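
P.S. In case it helps, one quick way to see which protobuf artifacts Maven actually pulls in for that profile (just a rough check, run from the Spark source root; the flags simply mirror the build command above):

mvn -Pyarn -Dyarn.version=2.2.0 -Dhadoop.version=2.2.0 dependency:tree | grep -i protobuf

The output should show which protobuf group/artifact/version combinations end up on the classpath, so you can confirm what your build is really shipping.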