Hello, I’m getting JsonMappingException errors whenever I try any cluster map/reduce operation in Zeppelin with Apache Spark running on Mesos. Could somebody provide guidance? As far as I can tell my configuration matches the documentation I’ve read, and I can’t figure out why the reduce commands are failing.
Here are three (Scala) commands I tried in a fresh Zeppelin notebook. The first two work fine; the third fails:

> sc.getConf.getAll
res0: Array[(String, String)] = Array((spark.submit.pyArchives,pyspark.zip:py4j-0.8.2.1-src.zip), (spark.home,/cluster/spark), (spark.executor.memory,512m), (spark.files,file:/cluster/spark/python/lib/pyspark.zip,file:/cluster/spark/python/lib/py4j-0.8.2.1-src.zip), (spark.repl.class.uri,http://<IP_HIDDEN>:<PORT_HIDDEN>), (args,""), (zeppelin.spark.concurrentSQL,false), (spark.fileserver.uri,http://<IP_HIDDEN>:<PORT_HIDDEN>), (zeppelin.pyspark.python,python), (spark.scheduler.mode,FAIR), (zeppelin.spark.maxResult,1000), (spark.executor.id,driver), (spark.driver.port,<PORT_HIDDEN>), (zeppelin.dep.localrepo,local-repo), (spark.app.id,20151007-143704-2255525248-5050-29909-0013), (spark.externalBlockStore.folderName,spark-e7edd394-1618-4c18-b76d-9...

> (1 to 10).reduce(_ + _)
res5: Int = 55

> sc.parallelize(1 to 10).reduce(_ + _)
com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
 at [Source: {"id":"1","name":"parallelize"}; line: 1, column: 1]
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
    at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:220)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:143)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:409)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:358)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:265)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:245)
    at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:143)
    at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:439)
    at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3666)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3558)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578)
    at org.apache.spark.rdd.RDDOperationScope$.fromJson(RDDOperationScope.scala:82)
    at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1490)
    at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1490)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.RDD.<init>(RDD.scala:1490)
    at org.apache.spark.rdd.ParallelCollectionRDD.<init>(ParallelCollectionRDD.scala:85)
    at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:697)
    at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:695)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.SparkContext.withScope(SparkContext.scala:681)
    at org.apache.spark.SparkContext.parallelize(SparkContext.scala:695)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
    at $iwC$$iwC$$iwC.<init>(<console>:37)
    at $iwC$$iwC.<init>(<console>:39)
    at $iwC.<init>(<console>:41)
    at <init>(<console>:43)
    at .<init>(<console>:47)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
    at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:610)
    at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:586)
    at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:579)
    at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

My configuration is as below:
--------------------------------
Spark 1.4.1
Mesos 0.21.0
Zeppelin 0.5.0-incubating
Hadoop 2.4.0
Cluster: 4 nodes, CentOS 6.7

zeppelin-env.sh:
-----------------
export MASTER=mesos://master:5050
export SPARK_MASTER=mesos://master:5050
export MESOS_NATIVE_JAVA_LIBRARY=/cluster/mesos/build/src/.libs/libmesos.so
export MESOS_NATIVE_LIBRARY=/cluster/mesos/build/src/.libs/libmesos.so
export SPARK_EXECUTOR_URI=hdfs:///spark/spark-1.4.1-bin-hadoop2.4.tgz
export ZEPPELIN_JAVA_OPTS="-Dspark.executor.uri=hdfs:///spark/spark-1.4.1-bin-hadoop2.4.tgz"
export SPARK_PID_DIR=/tmp
export SPARK_LOCAL_DIRS=/cluster/spark/spark_tmp
export HADOOP_CONF_DIR=/cluster/hadoop-2.4.0/etc/hadoop

Note: Zeppelin was built from source using:
mvn clean package -Pspark-1.4 -Dhadoop.version=2.4.0 -Phadoop-2.4 -DskipTests

I've put a couple of diagnostic checks I plan to try in a P.S. below, in case they help narrow this down.

Thanks very much,

---
Rishi Verma
NASA Jet Propulsion Laboratory
California Institute of Technology
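P.S. The stack trace makes me suspect a jackson-databind version mismatch on the Zeppelin interpreter's classpath (I believe Spark 1.4.1 was built against Jackson 2.4.x, and Zeppelin may be pulling in a different version). As a sanity check, here is a small snippet I plan to run in a notebook paragraph to see which Jackson jar the interpreter actually loaded; this is just a diagnostic sketch, and it assumes ObjectMapper is loaded from an ordinary jar on the classpath:

    import com.fasterxml.jackson.databind.ObjectMapper

    // Jar that supplied ObjectMapper to this interpreter
    // (getCodeSource can return null for bootstrap classes, hence the caveat above)
    println(classOf[ObjectMapper].getProtectionDomain.getCodeSource.getLocation)

    // Version string that this jackson-databind build reports about itself
    println(new ObjectMapper().version())

If the reported version differs from the 2.4.x that Spark ships, that would explain the deserialization failure in RDDOperationScope$.fromJson.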
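P.P.S. To isolate whether this is Zeppelin-specific rather than a Spark-on-Mesos problem, I can also try the same reduce from a plain spark-shell outside Zeppelin, using the /cluster/spark install from the conf dump above and the same executor URI:

    /cluster/spark/bin/spark-shell --master mesos://master:5050 \
      --conf spark.executor.uri=hdfs:///spark/spark-1.4.1-bin-hadoop2.4.tgz

    scala> sc.parallelize(1 to 10).reduce(_ + _)

If that returns 55, the problem presumably lives in the Zeppelin build/classpath rather than in the Spark or Mesos setup.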