Re: saveAsNewAPIHadoopDataset against hbase hanging in pyspark 1.2.0

2014-12-24 Thread Antony Mayi
also hbase itself works ok:
hbase(main):006:0> scan 'test'
ROW                            COLUMN+CELL
 key1                          column=f1:asd, timestamp=1419463092904, value=456
1

Re: saveAsNewAPIHadoopDataset against hbase hanging in pyspark 1.2.0

2014-12-24 Thread Antony Mayi
I am running it in yarn-client mode and I believe hbase-client is part of the spark-examples-1.2.0-cdh5.3.0-hadoop2.5.0-cdh5.3.0.jar, which I am submitting at launch. Adding another jstack taken during the hang - http://pastebin.com/QDQrBw70 - this is of the CoarseGrainedExecutorBackend proces
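One way to test the belief that the hbase-client classes are bundled in the submitted jar is to list the jar's entries. The sketch below is only an illustration: the helper name `jar_has_classes` and the default package prefix are assumptions, not something from this thread, and it inspects the jar file on the local machine only, so it says nothing about what the YARN executors actually have on their classpath.

```python
import zipfile


def jar_has_classes(jar_path, package_prefix="org/apache/hadoop/hbase/client/"):
    """Return True if the jar holds at least one .class file under package_prefix.

    jar_path is a path to a local jar; package_prefix is a slash-separated
    package directory (hypothetical default chosen for the HBase client package).
    """
    with zipfile.ZipFile(jar_path) as jar:
        return any(
            name.startswith(package_prefix) and name.endswith(".class")
            for name in jar.namelist()
        )
```

It could be called with the jar path passed to `--jars`; a `False` result would suggest the HBase client classes have to come from somewhere else on the classpath.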

Re: saveAsNewAPIHadoopDataset against hbase hanging in pyspark 1.2.0

2014-12-24 Thread Ted Yu
bq. "hbase.zookeeper.quorum": "localhost"
You are running hbase cluster in standalone mode ? Is hbase-client jar in the classpath ?
Cheers
On Wed, Dec 24, 2014 at 4:11 PM, Antony Mayi wrote:
> I just run it by hand from pyspark shell. here is the steps:
>
> pyspark --jars
> /usr/lib/spark/lib/

Re: saveAsNewAPIHadoopDataset against hbase hanging in pyspark 1.2.0

2014-12-24 Thread Antony Mayi
I just run it by hand from pyspark shell. here is the steps:
pyspark --jars /usr/lib/spark/lib/spark-examples-1.2.0-cdh5.3.0-hadoop2.5.0-cdh5.3.0.jar
>>> conf = {"hbase.zookeeper.quorum": "localhost",
...         "hbase.mapred.outputtable": "test",
...         "mapreduce.outputformat.class": "org
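For readers without the stock hbase_outputformat.py at hand, here is a rough sketch of the kind of configuration that example builds. The message above is cut off, so the class names below are assumptions recalled from the Spark 1.2 example script and not taken from this thread; the helper names `hbase_output_conf` and `save_row` are made up for illustration. The converter classes are assumed to live in the spark-examples jar passed via `--jars`.

```python
def hbase_output_conf(quorum, table):
    """Build a Hadoop conf dict for writing to HBase via TableOutputFormat.

    Class names are assumed from the stock hbase_outputformat.py example.
    """
    return {
        "hbase.zookeeper.quorum": quorum,
        "hbase.mapred.outputtable": table,
        "mapreduce.outputformat.class":
            "org.apache.hadoop.hbase.mapreduce.TableOutputFormat",
        "mapreduce.job.output.key.class":
            "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
        "mapreduce.job.output.value.class":
            "org.apache.hadoop.io.Writable",
    }


# Converters assumed to be shipped in the spark-examples jar.
KEY_CONV = ("org.apache.spark.examples.pythonconverters."
            "StringToImmutableBytesWritableConverter")
VALUE_CONV = ("org.apache.spark.examples.pythonconverters."
              "StringListToPutConverter")


def save_row(sc, quorum, table, row, family, qualifier, value):
    """Write a single cell to HBase; sc is an existing SparkContext."""
    cells = [row, family, qualifier, value]
    # Key the record by row id; the value converter turns the list into a Put.
    sc.parallelize([cells]).map(lambda x: (x[0], x)).saveAsNewAPIHadoopDataset(
        conf=hbase_output_conf(quorum, table),
        keyConverter=KEY_CONV,
        valueConverter=VALUE_CONV)
```

In the pyspark shell this would be used along the lines of `save_row(sc, "localhost", "test", "key1", "f1", "asd", "456")`, matching the table and cell shown in the scan output elsewhere in the thread. It only reproduces the setup under discussion and does not address the hang itself.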

Re: saveAsNewAPIHadoopDataset against hbase hanging in pyspark 1.2.0

2014-12-24 Thread Ted Yu
I went over the jstack but didn't find any call related to hbase or zookeeper. Do you find anything important in the logs ?
Looks like container launcher was waiting for the script to return some result:
1. at org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell

Re: saveAsNewAPIHadoopDataset against hbase hanging in pyspark 1.2.0

2014-12-24 Thread Antony Mayi
this is it (jstack of particular yarn container) -> http://pastebin.com/eAdiUYKK
thanks, Antony.
On Wednesday, 24 December 2014, 16:34, Ted Yu wrote:
bq. even when testing with the example from the stock hbase_outputformat.py
Can you take jstack of the above and pastebin it ?
Thanks

Re: saveAsNewAPIHadoopDataset against hbase hanging in pyspark 1.2.0

2014-12-24 Thread Ted Yu
bq. even when testing with the example from the stock hbase_outputformat.py
Can you take jstack of the above and pastebin it ?
Thanks
On Wed, Dec 24, 2014 at 4:49 AM, Antony Mayi wrote:
> Hi,
>
> have been using this without any issues with spark 1.1.0 but after
> upgrading to 1.2.0 saving a R

saveAsNewAPIHadoopDataset against hbase hanging in pyspark 1.2.0

2014-12-24 Thread Antony Mayi
Hi, have been using this without any issues with spark 1.1.0 but after upgrading to 1.2.0 saving a RDD from pyspark using saveAsNewAPIHadoopDataset into HBase just hangs - even when testing with the example from the stock hbase_outputformat.py. anyone having same issue? (and able to solve?) using