I am running it in yarn-client mode, and I believe hbase-client is part of the spark-examples-1.2.0-cdh5.3.0-hadoop2.5.0-cdh5.3.0.jar that I am submitting at launch.

Adding another jstack taken during the hang: http://pastebin.com/QDQrBw70. This one is of the CoarseGrainedExecutorBackend process, and it does reference HBase and ZooKeeper.

Thanks,
Antony.
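One way to check the belief that the HBase classes ship inside the submitted jar is to look for them in the archive. This is a minimal sketch using only the Python standard library; the helper name `jar_has_class` is made up for illustration, and the class to look for would be the `org.apache.hadoop.hbase.mapreduce.TableOutputFormat` named in the job configuration in this thread.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar (a zip archive) contains the given fully qualified class.

    jar_has_class is a hypothetical helper, not part of Spark or HBase.
    """
    # Class files are stored under their package path, e.g. org/apache/.../Foo.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Calling `jar_has_class(path_to_jar, "org.apache.hadoop.hbase.mapreduce.TableOutputFormat")` only shows whether the class is packaged in that jar. It does not prove the jar ends up on the executor classpath.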
On Thursday, 25 December 2014, 1:38, Ted Yu <yuzhih...@gmail.com> wrote:

bq. "hbase.zookeeper.quorum": "localhost"

Are you running the HBase cluster in standalone mode? Is the hbase-client jar in the classpath?

Cheers

On Wed, Dec 24, 2014 at 4:11 PM, Antony Mayi <antonym...@yahoo.com> wrote:

I just run it by hand from the pyspark shell. Here are the steps:

pyspark --jars /usr/lib/spark/lib/spark-examples-1.2.0-cdh5.3.0-hadoop2.5.0-cdh5.3.0.jar
>>> conf = {"hbase.zookeeper.quorum": "localhost",
...         "hbase.mapred.outputtable": "test",
...         "mapreduce.outputformat.class": "org.apache.hadoop.hbase.mapreduce.TableOutputFormat",
...         "mapreduce.job.output.key.class": "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
...         "mapreduce.job.output.value.class": "org.apache.hadoop.io.Writable"}
>>> keyConv = "org.apache.spark.examples.pythonconverters.StringToImmutableBytesWritableConverter"
>>> valueConv = "org.apache.spark.examples.pythonconverters.StringListToPutConverter"
>>> sc.parallelize([['testkey', 'f1', 'testqual', 'testval']], 1).map(lambda x: (x[0], x)).saveAsNewAPIHadoopDataset(
...     conf=conf,
...     keyConverter=keyConv,
...     valueConverter=valueConv)

Then it prints a few INFO-level messages about submitting a task etc., but after that it just hangs. The very same code runs OK on Spark 1.1.0: the records get stored in HBase.

Thanks,
Antony.

On Thursday, 25 December 2014, 0:37, Ted Yu <yuzhih...@gmail.com> wrote:

I went over the jstack but didn't find any call related to HBase or ZooKeeper. Do you find anything important in the logs?

Looks like the container launcher was waiting for the script to return some result:

- at org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:715)
- at org.apache.hadoop.util.Shell.runCommand(Shell.java:524)

On Wed, Dec 24, 2014 at 3:11 PM, Antony Mayi <antonym...@yahoo.com> wrote:

This is it (jstack of the particular YARN container): http://pastebin.com/eAdiUYKK

Thanks,
Antony.
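Going over a jstack dump for HBase or ZooKeeper calls, as described in this thread, can be done mechanically. The sketch below assumes the usual jstack text layout (thread entries separated by blank lines, each starting with the quoted thread name, frames prefixed with "at"); `threads_mentioning` is a made-up helper name, and the keywords are simply the two packages discussed here.

```python
import re


def threads_mentioning(dump_text, keywords=("hbase", "zookeeper")):
    """Return the names of threads whose stack frames mention any keyword.

    Hypothetical helper; matching is case-insensitive and looks only at
    'at ...' frame lines, not at thread names or lock lines.
    """
    hits = []
    # Assumption: thread entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", dump_text):
        lines = block.strip().splitlines()
        # Skip headers and anything that does not start with a quoted thread name.
        if not lines or not lines[0].startswith('"'):
            continue
        name = lines[0].split('"')[1]
        frames = [l.strip().lower() for l in lines[1:] if l.strip().startswith("at ")]
        if any(k in f for f in frames for k in keywords):
            hits.append(name)
    return hits
```

Saving a pastebin dump to a file and passing its contents to `threads_mentioning` gives a short list of threads to read first. An empty list for the container launcher dump and a non-empty one for the executor dump would match what the two jstacks in this thread are reported to show.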
On Wednesday, 24 December 2014, 16:34, Ted Yu <yuzhih...@gmail.com> wrote:

bq. even when testing with the example from the stock hbase_outputformat.py

Can you take a jstack of the above and pastebin it?

Thanks

On Wed, Dec 24, 2014 at 4:49 AM, Antony Mayi <antonym...@yahoo.com.invalid> wrote:

Hi,

I have been using this without any issues with Spark 1.1.0, but after upgrading to 1.2.0, saving an RDD from pyspark into HBase using saveAsNewAPIHadoopDataset just hangs, even when testing with the example from the stock hbase_outputformat.py. Is anyone having the same issue (and able to solve it)? I am using HBase 0.98.6 and yarn-client mode.

Thanks,
Antony.