Sherry Xue created ZEPPELIN-1560:
------------------------------------

             Summary: NPE in SparkInterpreter.java of Zeppelin 0.6.2 
interpreter when running Zeppelin 0.6.2 with Spark 2.0.1 or Spark 1.6.1
                 Key: ZEPPELIN-1560
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1560
             Project: Zeppelin
          Issue Type: Bug
    Affects Versions: 0.6.2
            Reporter: Sherry Xue


When using the most recently released Zeppelin 0.6.2 with Spark 2.0.1 or Spark 
1.6.1, we occasionally run into this NPE.
When running Spark 2.0.1 (built with Scala 2.11) with Zeppelin 0.6.2, the spark 
interpreter log shows:
INFO [2016-10-17 21:11:01,575] ({pool-2-thread-2} Logging.scala[logInfo]:54) - 
Registered BlockManager BlockManagerId(driver, 9.110.72.25, 52000)
 INFO [2016-10-17 21:11:01,876] ({pool-2-thread-2} 
ContextHandler.java[doStart]:744) - Started 
o.s.j.s.ServletContextHandler@-75284774{/metrics/json,null,AVAILABLE}
 INFO [2016-10-17 21:11:01,941] ({pool-2-thread-2} Logging.scala[logInfo]:54) - 
SchedulerBackend is ready for scheduling beginning after reached 
minRegisteredResourcesRatio: 0.0
 WARN [2016-10-17 21:11:01,974] ({pool-2-thread-2} 
Logging.scala[logWarning]:66) - Use an existing SparkContext, some 
configuration may not take effect.
 INFO [2016-10-17 21:11:01,996] ({pool-2-thread-2} 
ContextHandler.java[doStart]:744) - Started 
o.s.j.s.ServletContextHandler@478a3351{/SQL,null,AVAILABLE}
 INFO [2016-10-17 21:11:01,997] ({pool-2-thread-2} 
ContextHandler.java[doStart]:744) - Started 
o.s.j.s.ServletContextHandler@-50741f8a{/SQL/json,null,AVAILABLE}
 INFO [2016-10-17 21:11:01,998] ({pool-2-thread-2} 
ContextHandler.java[doStart]:744) - Started 
o.s.j.s.ServletContextHandler@-2a1e21c3{/SQL/execution,null,AVAILABLE}
 INFO [2016-10-17 21:11:01,999] ({pool-2-thread-2} 
ContextHandler.java[doStart]:744) - Started 
o.s.j.s.ServletContextHandler@1522c673{/SQL/execution/json,null,AVAILABLE}
 INFO [2016-10-17 21:11:02,001] ({pool-2-thread-2} 
ContextHandler.java[doStart]:744) - Started 
o.s.j.s.ServletContextHandler@667957f6{/static/sql,null,AVAILABLE}
 INFO [2016-10-17 21:11:02,020] ({pool-2-thread-2} Logging.scala[logInfo]:54) - 
Warehouse path is 
'/opt/zeppelin062/zeppelin062/zeppelin-0.6.2-bin-all/spark-warehouse'.
 INFO [2016-10-17 21:11:02,037] ({pool-2-thread-2} 
SparkInterpreter.java[createSparkSession]:338) - Created Spark session with 
Hive support
ERROR [2016-10-17 21:11:02,473] ({pool-2-thread-2} Job.java[run]:182) - Job 
failed
java.lang.NullPointerException
        at 
org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:843)
        at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
        at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
        at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
        at 
org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
        at java.util.concurrent.FutureTask.run(FutureTask.java:277)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:191)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.lang.Thread.run(Thread.java:785)
 INFO [2016-10-17 21:11:02,478] ({pool-2-thread-2} 
SchedulerFactory.java[jobFinished]:137) - Job remoteInterpretJob_1476709844602 
finished by scheduler org.apache.zeppelin.spark.SparkInterpreter-847950271
ERROR [2016-10-17 21:11:05,536] ({pool-1-thread-3} 
TThreadPoolServer.java[run]:296) - Error occurred during processing of message.
java.lang.NullPointerException

For Spark 1.6.1 (built with Scala 2.10), the spark interpreter log shows:
 INFO [2016-10-18 12:59:54,993] ({pool-2-thread-2} Log4JLogger.java[info]:77) - 
The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as 
"embedded-only" so does not have its own datastore table.
 INFO [2016-10-18 12:59:55,231] ({pool-2-thread-2} 
SessionState.java[createPath]:641) - Created local directory: 
/tmp/e8889df0-617d-46c0-b4fc-1aebda48ce56_resources
 INFO [2016-10-18 12:59:55,263] ({pool-2-thread-2} 
SessionState.java[createPath]:641) - Created HDFS directory: 
/tmp/hive/root/e8889df0-617d-46c0-b4fc-1aebda48ce56
 INFO [2016-10-18 12:59:55,290] ({pool-2-thread-2} 
SessionState.java[createPath]:641) - Created local directory: 
/tmp/root/e8889df0-617d-46c0-b4fc-1aebda48ce56
 INFO [2016-10-18 12:59:55,294] ({pool-2-thread-2} 
SessionState.java[createPath]:641) - Created HDFS directory: 
/tmp/hive/root/e8889df0-617d-46c0-b4fc-1aebda48ce56/_tmp_space.db
ERROR [2016-10-18 12:59:55,602] ({pool-2-thread-2} Job.java[run]:182) - Job 
failed
java.lang.NullPointerException
        at 
org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:843)
        at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
        at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
        at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
        at 
org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
        at java.util.concurrent.FutureTask.run(FutureTask.java:277)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:191)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.lang.Thread.run(Thread.java:785)
 INFO [2016-10-18 12:59:55,606] ({pool-2-thread-2} 
SchedulerFactory.java[jobFinished]:137) - Job remoteInterpretJob_1476766736647 
finished by scheduler org.apache.zeppelin.spark.SparkInterpreter-1949792648

Both stack traces point to the NPE at line 843 of SparkInterpreter.java, in this block:
  interpret("@transient val _binder = new java.util.HashMap[String, Object]()");
      Map<String, Object> binder;
      if (Utils.isScala2_10()) {
        binder = (Map<String, Object>) getValue("_binder");
      } else {
        binder = (Map<String, Object>) getLastObject();
      }
    binder.put("sc", sc);      <------line 843 
    binder.put("sqlc", sqlc);
    binder.put("z", z);
So is it possible that binder is null here, and that this is why the NPE is 
thrown?
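As a minimal sketch (not the actual Zeppelin code, and only a suggestion), a null 
check around the binder retrieval would at least turn the NPE into an explicit 
error message; the InterpreterException wrapping and the message text are 
assumptions for illustration, the surrounding names (interpret, getValue, 
getLastObject, Utils.isScala2_10, sc, sqlc, z) are taken from the excerpt above:

    // Hypothetical guard inside SparkInterpreter.open(), for illustration only.
    interpret("@transient val _binder = new java.util.HashMap[String, Object]()");
    Map<String, Object> binder;
    if (Utils.isScala2_10()) {
      binder = (Map<String, Object>) getValue("_binder");
    } else {
      binder = (Map<String, Object>) getLastObject();
    }
    if (binder == null) {
      // If the interpret() call above silently failed (e.g. intermittently under
      // the IBM JVM), fail with a clear message instead of an NPE at binder.put(...).
      throw new InterpreterException("Failed to retrieve _binder from the Scala interpreter");
    }
    binder.put("sc", sc);
    binder.put("sqlc", sqlc);
    binder.put("z", z);

This would not fix the underlying race, but it would make the failure mode easier 
to diagnose than the bare NullPointerException above.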
My configuration for spark-env.sh is as follows:
JAVA_HOME=/opt/ibm/platform1018/jre/3.5/linux-x86_64
For zeppelin-env.sh:
export JAVA_HOME=/opt/ibm/platform1018/jre/3.5/linux-x86_64
export MASTER=spark://rhel-25.cn.ibm.com:7077
export SPARK_HOME=/opt/zeppelin062/spark201/spark-2.0.1-bin-hadoop2.7
Note that one difference in our case is that we are using IBM Java. Also, the 
problem does not always occur: with the same configuration we sometimes hit it 
and sometimes do not, which is odd. When we do hit it, restarting the 
interpreter may work around it.
The Java version info is as follows:
IBM J9 VM (build 2.8, JRE 1.8.0 Linux amd64-64 Compressed References 
20160816_315341 (JIT enabled, AOT enabled)
J9VM - R28_20160816_1459_B315341
JIT  - tr.r14.java.green_20160726_121883
GC   - R28_20160816_1459_B315341_CMPRSS
J9CL - 20160816_315341)
JCL - 20160816_01 based on Oracle jdk8u101-b13


