I have this issue building with spark-1.5 and hadoop-2.6.
This comes from clashing com.fasterxml.jackson dependencies: the one provided in the Spark assembly and the one pulled in by com.amazonaws:aws-java-sdk-s3.

The fix is to exclude fasterxml in the zeppelin-zengine pom.xml, or to manually remove the jars (rm zeppelin-server/target/lib/jackson-* and rm zeppelin-zengine/target/lib/jackson-*).

If you do this, I am not sure the sync to S3 will still work (I didn't test it).

If we want to PR this, we first need to validate that S3 sync still works.
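For that validation, a rough sketch of how I'd point notebook storage at S3 in conf/zeppelin-env.sh and then check a notebook round-trips. The bucket name below is a placeholder; the variable names are the ones documented for the S3 notebook repo, but treat the whole thing as untested:

```shell
# Sketch only: configure Zeppelin's S3 notebook storage, then restart and
# verify a saved note shows up under s3://<bucket>/<user>/notebook/.
export ZEPPELIN_NOTEBOOK_S3_BUCKET="my-test-bucket"   # placeholder bucket
export ZEPPELIN_NOTEBOOK_S3_USER="zeppelin"
export ZEPPELIN_NOTEBOOK_STORAGE="org.apache.zeppelin.notebook.repo.S3NotebookRepo"
```

If notes still save and load with the Jackson exclusion in place, the PR should be safe.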


  <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk-s3</artifactId>
      <version>1.10.1</version>
      <exclusions>
        <exclusion>
          <groupId>com.fasterxml.jackson.core</groupId>
          <artifactId>*</artifactId>
        </exclusion>
      </exclusions>
  </dependency>
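To check that the exclusion (or the manual jar removal) actually took effect, something like the following should work. The dependency:tree invocation is standard maven-dependency-plugin usage; the helper and the demo directory are illustrative only:

```shell
# If Maven is on the PATH, ask it where Jackson artifacts come from.
if command -v mvn >/dev/null 2>&1; then
  mvn dependency:tree -Dincludes=com.fasterxml.jackson.core
fi

# Helper to spot leftover or duplicate Jackson jars in a built lib dir,
# e.g. zeppelin-server/target/lib or zeppelin-zengine/target/lib.
list_jackson_jars() {
  find "$1" -name 'jackson-*.jar' 2>/dev/null | sort
}

# Demo against a scratch directory standing in for target/lib:
libdir=$(mktemp -d)
touch "$libdir/jackson-databind-2.4.4.jar" "$libdir/jackson-databind-2.5.3.jar"
list_jackson_jars "$libdir"   # two versions side by side = the clash
rm -rf "$libdir"
```

After the exclusion, only one Jackson version (the Spark assembly's) should remain in each lib directory.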


On 22/11/15 02:55, Hyung Sung Shim wrote:
Hello.

If you use CDH for Hadoop, can you try a build command like 'mvn clean
package -Pvendor-repo -DskipTests -Pspark-1.5 -Dspark.version=1.5.2
-Dhadoop.version=2.6.0-mr1-cdh5.4.8'?

I hope this helps.

2015-11-22 6:55 GMT+09:00 Timur Shenkao <[email protected]
<mailto:[email protected]>>:

    Hi!

    I use CentOS 6.7 + Spark 1.5.2 Standalone + Cloudera Hadoop 5.4.8 on
    the same cluster. I can't use Mesos or Spark on YARN.
    I decided to try Zeppelin. I tried to use binaries,  to build from
    sources with different parameters.
    At last, I built version 0.6.0 so:
    mvn clean package -DskipTests -Pspark-1.5 -Phadoop-2.6 -Pyarn
    -Ppyspark -Pbuild-distr

    But I constantly get this error:

    com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope) at [Source: {"id":"0","name":"parallelize"}; line: 1, column: 1]
      at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
      at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
      at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533)
      at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:220)
      at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:143)
      at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:409)
      at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:358)
      at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:265)
      at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:245)
      at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:143)
      at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:439)
      at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3666)
      at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3558)
      at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578)
      at org.apache.spark.rdd.RDDOperationScope$.fromJson(RDDOperationScope.scala:82)
      at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1603)
      at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1603)
      at scala.Option.map(Option.scala:145)
      at org.apache.spark.rdd.RDD.<init>(RDD.scala:1603)
      at org.apache.spark.rdd.ParallelCollectionRDD.<init>(ParallelCollectionRDD.scala:85)
      at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:725)
      at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:723)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
      at org.apache.spark.SparkContext.withScope(SparkContext.scala:709)
      at org.apache.spark.SparkContext.parallelize(SparkContext.scala:723)
      at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
      at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
      at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
      at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
      at $iwC$$iwC$$iwC$$iwC$$i
    ...
    and so on.

    My code is:
    %spark
    import org.apache.spark.sql._
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

    case class Contact(name: String, phone: String)
    case class Person(name: String, age: Int, contacts: Seq[Contact])

    val records = (1 to 100).map { i =>
      Person(s"name_$i", i, (0 to 1).map { m => Contact(s"contact_$m", s"phone_$m") })
    }

    Then, it fails after the following line:
    sc.parallelize(records).toDF().write.format("orc").save("people")

    In spark-shell, this code works perfectly, so the problem is in Zeppelin.

    By the way, your own tutorial gives the same error:

    // load bank data
    val bankText = sc.parallelize(
        IOUtils.toString(
            new URL("https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv"),
            Charset.forName("utf8")).split("\n"))

    case class Bank(age: Integer, job: String, marital: String,
    education: String, balance: Integer)

    val bank = bankText.map(s => s.split(";")).filter(s => s(0) !=
    "\"age\"").map(
         s => Bank(s(0).toInt,
                 s(1).replaceAll("\"", ""),
                 s(2).replaceAll("\"", ""),
                 s(3).replaceAll("\"", ""),
                 s(5).replaceAll("\"", "").toInt
             )
    ).toDF()
    bank.registerTempTable("bank")


    How can I fix it? Should I change some dependency in pom.xml?




--

NFLabs Inc.  |  Content Service Team  |  Team Lead Hyung Sung Shim

E. [email protected]

T. 02-3458-9650  M. 010-4282-1230

A. 2F Harim Building, 216-2 Nonhyeon-dong, Gangnam-gu, Seoul, NFLABS