Re: master attempted to re-register the worker and then took all workers as unregistered

2014-07-08 Thread Cheney Sun
16 AM, Nan Zhu wrote: > Hey, Cheney, > > The problem is still existing? > > Sorry for the delay, I’m starting to look at this issue, > > Best, > > -- > Nan Zhu > > On Tuesday, May 6, 2014 at 10:06 PM, Cheney Sun wrote: > > Hi Nan, > > In worker'

Re: master attempted to re-register the worker and then took all workers as unregistered

2014-07-08 Thread Cheney Sun
Yes, 0.9.1. On Tue, Jul 8, 2014 at 10:26 PM, Nan Zhu wrote: > Hi, Cheney, > > Thanks for the information > > which version are you using, 0.9.1? > > Best, > > -- > Nan Zhu > > On Tuesday, July 8, 2014 at 10:09 AM, Cheney Sun wrote: > > Hi Nan,

Re: error when spark access hdfs with Kerberos enable

2014-07-08 Thread Cheney Sun
Hi Sandy, We are also going to read data from a security-enabled (Kerberos) HDFS in our Spark application. Per your answer, we have to switch to Spark on YARN to achieve this. We plan to deploy a separate Hadoop cluster (with YARN) only to run Spark. Is it necessary to deploy YARN with security e

works disconnected with master but still keep alive

2014-05-04 Thread Cheney Sun
Hi experts, I set up a Spark cluster in standalone mode with 10 workers, and the version is 0.9.1. I chose that version on the assumption that the latest version is always the most stable one. However, when I unintentionally ran a problematic job (such as configuring SPARK_HOME with a wrong p

Re: works disconnected with master but still keep alive

2014-05-04 Thread Cheney Sun
No reply yet; maybe I didn't make it clear, so let me add more information. When the worker node attempts to launch a problematic executor, not only does the executor fail to launch, but the worker is also removed by the master. The worker then tries to re-register with the master but is rejected. In the master log, the

Re: master attempted to re-register the worker and then took all workers as unregistered

2014-05-04 Thread Cheney Sun
Hi Nan, Have you found a way to fix the issue? I'm now running into the same problem with version 0.9.1. Thanks, Cheney

Re: master attempted to re-register the worker and then took all workers as unregistered

2014-05-07 Thread Cheney Sun
Hi Nan, In the worker's log, I see the following exception thrown when it tries to launch an executor. (SPARK_HOME is wrongly specified on purpose, so there is no such file "/usr/local/spark1/bin/compute-classpath.sh".) After the exception was thrown several times, the worker was requested to kill the

Driver process succeed exiting but web UI shows FAILED

2014-05-11 Thread Cheney Sun
Hi, I'm running Spark 0.9.1 in standalone mode. I submitted one job and the driver ran successfully to the end; see the log messages below: 2014-05-12 10:34:14,358 - [INFO] (Logging.scala:50) - Finished TID 254 in 19 ms on spark-host007 (progress: 62/63) 2014-05-12 10:34:14,359 - [INFO] (Logging

Is there any problem on the spark mailing list?

2014-05-15 Thread Cheney Sun
I haven't received any spark-user mail since yesterday. Is anyone else receiving new mail? -- Cheney

too many temporary app files left after app finished

2014-05-27 Thread Cheney Sun
Hi, We use Spark 0.9.1 in standalone mode. We found that many temporary app files are not removed from each worker's local file system even after the job has finished. These folders have names such as "app-20140516120842-0203". The files take up so much disk space that we have to run a daemon sc
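
The kind of cleanup described in the post above could look roughly like the sketch below. It is only an illustration, not the poster's actual script: the function names, the age threshold, and the work directory passed in are all assumptions, and only the folder naming pattern ("app-20140516120842-0203") comes from the message. It defaults to a dry run, so nothing is deleted unless that is requested explicitly.

```python
import re
import shutil
import time
from pathlib import Path

# Folder names like "app-20140516120842-0203" (pattern taken from the post).
APP_DIR_PATTERN = re.compile(r"^app-\d{14}-\d{4,}$")


def find_stale_app_dirs(work_dir, max_age_seconds, now=None):
    """Return app directories under work_dir not modified within max_age_seconds.

    work_dir is a hypothetical worker directory; pass whatever path the
    worker actually writes its per-application folders to.
    """
    now = time.time() if now is None else now
    stale = []
    for entry in Path(work_dir).iterdir():
        # Skip anything that is not a directory named like an app folder.
        if not entry.is_dir() or not APP_DIR_PATTERN.match(entry.name):
            continue
        if now - entry.stat().st_mtime > max_age_seconds:
            stale.append(entry)
    return sorted(stale)


def remove_stale_app_dirs(work_dir, max_age_seconds, dry_run=True):
    """Delete stale app directories; with dry_run=True only report them."""
    stale = find_stale_app_dirs(work_dir, max_age_seconds)
    for path in stale:
        if not dry_run:
            shutil.rmtree(path)
    return stale
```

A script like this would typically be run periodically (for example from cron) on every worker host, with an age threshold long enough that folders of still-running applications are never touched.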