Dear all,

After studying the source code and my environment, I guess the problem is that the hostPort is wrong. On my machine, the hostname is exported into `blockManager.hostPort` as, e.g., wush-home:45678, but the slaves cannot resolve that hostname to an IP address correctly. I am trying to solve this hostname-resolution problem.
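What I am experimenting with is forcing the driver's address via Java system properties before the SparkContext is created. This is only a sketch: the property names `spark.driver.host`/`spark.driver.port` are my assumption for how 0.8.x reads its configuration, and `192.168.1.10` is a placeholder IP for illustration.

```java
// Sketch only (assumption): Spark 0.8.x reads configuration from Java
// system properties when the SparkContext is constructed, so setting them
// first should bind the driver to a routable IP instead of an
// unresolvable hostname. "192.168.1.10" is a placeholder IP.
public class DriverAddressWorkaround {
    public static void main(String[] args) {
        // Must be set BEFORE constructing the SparkContext.
        System.setProperty("spark.driver.host", "192.168.1.10");
        System.setProperty("spark.driver.port", "45678");

        // JavaSparkContext sc =
        //     new JavaSparkContext("mesos://master:5050", "my-app");
        System.out.println(System.getProperty("spark.driver.host") + ":"
                + System.getProperty("spark.driver.port"));
    }
}
```

If this works, the slaves would see `192.168.1.10:45678` instead of an unresolvable hostname, but I have not confirmed that this is the supported way to override `spark.hostPort`.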
On the other hand, is there a way to set the system property `spark.hostPort` so that I could export `192.168.xx.xx:45678` to `spark.hostPort`? Thanks.

2014-03-28 9:32 GMT+08:00 Wush Wu <w...@bridgewell.com>:

> Dear Rustagi,
>
> Thanks for your response.
>
> As far as I know, the DAG scheduler should be a part of Spark. Therefore,
> should I do something not mentioned in
> http://spark.incubator.apache.org/docs/0.8.1/running-on-mesos.html to
> launch the DAG scheduler?
>
> By the way, I also noticed that the user of the submission becomes the
> account on my machine. I suspect there is a permission issue, so the
> executor cannot copy the file from spark.execute.uri. After changing the
> account to the same username on the mesos cluster, the executor
> successfully copies the file from spark.execute.uri, but it reports the
> following error message:
>
> log4j:WARN No appenders could be found for logger
> (org.apache.spark.executor.MesosExecutorBackend).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
> org.apache.spark.SparkException: Error sending message to
> BlockManagerMaster [message =
> RegisterBlockManager(BlockManagerId(201403250945-3657629962-5050-10180-16,
> pc104, 42356, 0),339585269,Actor[akka://spark/user/BlockManagerActor1])]
>         at org.apache.spark.storage.BlockManagerMaster.askDriverWithReply(BlockManagerMaster.scala:174)
>         at org.apache.spark.storage.BlockManagerMaster.tell(BlockManagerMaster.scala:139)
>         at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:57)
>         at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:127)
>         at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:105)
>         at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:119)
>         at org.apache.spark.SparkEnv$.createFromSystemProperties(SparkEnv.scala:171)
>         at org.apache.spark.executor.Executor.<init>(Executor.scala:111)
>         at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:58)
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after
> [10000] milliseconds
>         at akka.dispatch.DefaultPromise.ready(Future.scala:870)
>         at akka.dispatch.DefaultPromise.result(Future.scala:874)
>         at akka.dispatch.Await$.result(Future.scala:74)
>         at org.apache.spark.storage.BlockManagerMaster.askDriverWithReply(BlockManagerMaster.scala:160)
>         ... 8 more
> Exception in thread "Thread-0"
>
> Is there any suggestion for handling the error above?
>
> Thanks,
> Wush
>
>
> 2014-03-28 4:09 GMT+08:00 Mayur Rustagi <mayur.rust...@gmail.com>:
>
>> Yes, but you have to maintain the connection of that machine to the master
>> cluster, as the driver with the DAG scheduler runs on that machine.
>> Regards,
>> Mayur
>>
>> Mayur Rustagi
>> Ph: +1 (760) 203 3257
>> http://www.sigmoidanalytics.com
>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>
>>
>> On Thu, Mar 27, 2014 at 4:09 AM, Wush Wu <w...@bridgewell.com> wrote:
>>
>>> Dear all,
>>>
>>> We have a Spark 0.8.1 cluster on Mesos 0.15. It works if I submit the
>>> job from the master of Mesos; that is, I spawn the Spark shell or
>>> launch the Scala application on the Mesos master.
>>>
>>> However, when I submit the job from another machine, the job is lost.
>>> The logs show that Mesos does not copy
>>> spark-0.8.1-incubating.tar.gz to the temporary working directory, so the
>>> job is lost immediately. Is it possible to submit the job from a machine
>>> that does not belong to the Mesos cluster?
>>>
>>> Thanks!
>>>
>>> Wush