I see - your HBase cluster is separate from the Mesos cluster. I somehow got the (incorrect) impression that the HBase cluster runs on Mesos.
On Tue, Nov 17, 2015 at 7:53 PM, 임정택 <kabh...@gmail.com> wrote:

> Ted,
>
> Could you elaborate, please?
>
> I maintain a separate HBase cluster and Mesos cluster for some reasons, and
> I can make it work via spark-submit, or via spark-shell / zeppelin with a
> newly initialized SparkContext.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> 2015-11-17 22:17 GMT+09:00 Ted Yu <yuzhih...@gmail.com>:
>
>> I am a bit curious:
>> HBase depends on HDFS.
>> Has HDFS support for Mesos been fully implemented?
>>
>> Last time I checked, there was still work to be done.
>>
>> Thanks
>>
>> On Nov 17, 2015, at 1:06 AM, 임정택 <kabh...@gmail.com> wrote:
>>
>> Oh, one thing I missed: I built the Spark 1.4.1 cluster on a 6-node
>> Mesos 0.22.1 H/A (via ZK) cluster.
>>
>> 2015-11-17 18:01 GMT+09:00 임정택 <kabh...@gmail.com>:
>>
>>> Hi all,
>>>
>>> I'm evaluating Zeppelin to run a driver which interacts with HBase.
>>> I use a fat jar to include the HBase dependencies, and see failures at
>>> the executor level.
>>> I thought it was a Zeppelin issue, but it fails on spark-shell, too.
>>>
>>> I loaded the fat jar via the --jars option,
>>>
>>> > ./bin/spark-shell --jars hbase-included-assembled.jar
>>>
>>> and ran the driver code using the provided SparkContext instance, and I
>>> see failures in the spark-shell console and the executor logs.
>>>
>>> Below are the stack traces:
>>>
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 55
>>> in stage 0.0 failed 4 times, most recent failure: Lost task 55.3 in stage
>>> 0.0 (TID 281, <svr hostname>): java.lang.NoClassDefFoundError: Could not
>>> initialize class org.apache.hadoop.hbase.client.HConnectionManager
>>> at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:197)
>>> at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:159)
>>> at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:101)
>>> at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:128)
>>> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:104)
>>> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:66)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
>>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>> at org.apache.spark.scheduler.Task.run(Task.scala:70)
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> Driver stacktrace:
>>> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
>>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
>>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
>>> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>> at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
>>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>>> at scala.Option.foreach(Option.scala:236)
>>> at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
>>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
>>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
>>> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>>
>>> 15/11/16 18:59:57 ERROR Executor: Exception in task 14.0 in stage 0.0 (TID 14)
>>> java.lang.ExceptionInInitializerError
>>> at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:197)
>>> at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:159)
>>> at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:101)
>>> at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:128)
>>> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:104)
>>> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:66)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
>>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>> at org.apache.spark.scheduler.Task.run(Task.scala:70)
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.lang.RuntimeException: hbase-default.xml file seems to be
>>> for and old version of HBase (null), this version is 0.98.6-cdh5.2.0
>>> at org.apache.hadoop.hbase.HBaseConfiguration.checkDefaultsVersion(HBaseConfiguration.java:73)
>>> at org.apache.hadoop.hbase.HBaseConfiguration.addHbaseResources(HBaseConfiguration.java:105)
>>> at org.apache.hadoop.hbase.HBaseConfiguration.create(HBaseConfiguration.java:116)
>>> at org.apache.hadoop.hbase.client.HConnectionManager.<clinit>(HConnectionManager.java:222)
>>> ... 18 more
>>>
>>> Please note that it runs smoothly via spark-submit.
>>>
>>> Btw, if the issue is that hbase-default.xml is not properly loaded (maybe
>>> because of the classloader), it seems to work properly at the driver level:
>>>
>>> import org.apache.hadoop.hbase.HBaseConfiguration
>>> val conf = HBaseConfiguration.create()
>>> println(conf.get("hbase.defaults.for.version"))
>>>
>>> It prints "0.98.6-cdh5.2.0".
>>>
>>> I'm using Spark 1.4.1 (hadoop-2.4 binary), Zeppelin 0.5.5, and HBase
>>> 0.98.6-CDH5.2.0.
>>>
>>> Thanks in advance!
>>>
>>> Best,
>>> Jungtaek Lim (HeartSaVioR)
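A note on the driver-level check quoted above: the same lookup can also be pushed down to the executors, where the failure actually happens. The sketch below is not from the original thread; it assumes the spark-shell-provided sc and only asks each task's context classloader whether it can see hbase-default.xml at all. A null URL on an executor would line up with the "(null)" defaults version in the exception, while the driver prints "0.98.6-cdh5.2.0".

// Assumed reproduction aid, not code from the thread: probe the executor
// classpath for the resource that HBaseConfiguration needs.
val visibility = sc.parallelize(1 to 4, 4).map { _ =>
  val host = java.net.InetAddress.getLocalHost.getHostName
  val url = Thread.currentThread().getContextClassLoader
    .getResource("hbase-default.xml")
  s"$host -> $url"  // prints "... -> null" if the fat jar's copy is not visible
}.collect()

visibility.foreach(println)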
>>
>> --
>> Name : 임 정택
>> Blog : http://www.heartsavior.net / http://dev.heartsavior.net
>> Twitter : http://twitter.com/heartsavior
>> LinkedIn : http://www.linkedin.com/in/heartsavior
>
> --
> Name : 임 정택
> Blog : http://www.heartsavior.net / http://dev.heartsavior.net
> Twitter : http://twitter.com/heartsavior
> LinkedIn : http://www.linkedin.com/in/heartsavior
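For anyone reconstructing the failing job: the executor stack traces (NewHadoopRDD -> TableInputFormat.setConf -> new HTable -> HConnectionManager.<clinit>) point at a read through the MapReduce TableInputFormat. The original driver code is not shown in the thread, so the following is only an assumed sketch of that pattern; the table name is made up, and sc is the SparkContext provided by spark-shell / Zeppelin.

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

// Built on the driver, where hbase-default.xml from the fat jar resolves fine.
val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableInputFormat.INPUT_TABLE, "some_table")  // assumed table name

// TableInputFormat.setConf runs again inside each executor task (per the stack
// traces); there, HConnectionManager's static initializer calls
// HBaseConfiguration.create(), which fails with the "(null)" defaults version
// when hbase-default.xml is not visible on the executor side.
val hbaseRdd = sc.newAPIHadoopRDD(
  hbaseConf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])

println(hbaseRdd.count())

If the executors really cannot see hbase-default.xml, HBase also has a hbase.defaults.for.version.skip=true setting that bypasses this particular version check, though that only masks the classloading problem rather than fixing it.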