Hi,

The issue was due to a Mesos version mismatch: I am running the latest Mesos, 0.17.0, but Spark builds against 0.13.0. I fixed it by updating SparkBuild.scala to the latest Mesos version. However, I am now facing errors on the Mesos workers. I tried again after upgrading Spark to 0.9.1, but the issue persists. Thanks for the upgrade info.

Please let me know if this is a Hadoop configuration issue. The configuration files are attached. I have configured HDFS as hdfs://master:9000.
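For reference, the change was only a version bump of the Mesos Java bindings in project/SparkBuild.scala. This is a rough sketch of the relevant dependency line; the surrounding settings in your checkout may be organised differently:

  // project/SparkBuild.scala -- sketch only; the dependency list in your
  // Spark checkout may look different. The Mesos Java bindings should
  // match the native Mesos version installed on the cluster.
  libraryDependencies ++= Seq(
    // was: "org.apache.mesos" % "mesos" % "0.13.0",
    "org.apache.mesos" % "mesos" % "0.17.0"
  )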
The error log in the per-task folder is as follows:

Failed to copy from HDFS: hadoop fs -copyToLocal 'hdfs://master/user/hduser/spark-0.9.1.tar.gz' './spark-0.9.1.tar.gz'
copyToLocal: Call From slave03/192.168.0.100 to master:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Failed to fetch executors
Fetching resources into '/tmp/mesos/slaves/201404021638-2315299008-5050-6539-1/frameworks/201404041508-2315299008-5050-19945-0000/executors/201404021638-2315299008-5050-6539-1/runs/efd1fc4c-ec01-470c-9dd0-c34cc8052ffe'
Fetching resource 'hdfs://master/user/hduser/spark-0.9.1.tar.gz'
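One thing I notice in the log: the tarball URI has no port (hdfs://master/...), so the Hadoop client on the slave seems to fall back to the default NameNode port 8020, while my NameNode listens on 9000, which would explain the "Connection refused". I plan to retry with a fully qualified URI. A minimal sketch of how I would set it from the driver (spark.executor.uri is the property described in the Spark-on-Mesos docs; the hostname and path are just from my setup):

  import org.apache.spark.{SparkConf, SparkContext}

  // Sketch only: point the Mesos executors at the tarball with an explicit
  // NameNode port so the fetcher does not try the default port 8020.
  object ExecutorUriCheck {
    def main(args: Array[String]): Unit = {
      val conf = new SparkConf()
        .setMaster("mesos://master:5050")
        .setAppName("ExecutorUriCheck")
        .set("spark.executor.uri",
             "hdfs://master:9000/user/hduser/spark-0.9.1.tar.gz")
      val sc = new SparkContext(conf)
      // trivial job, just to confirm executors come up on the slaves
      println(sc.parallelize(1 to 1000).count())
      sc.stop()
    }
  }

For spark-shell, the SPARK_EXECUTOR_URI environment variable should serve the same purpose.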
The worker output is as follows:

I0403 20:21:11.007439 24158 slave.cpp:837] Launching task 158 for framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:11.009099 24158 slave.cpp:947] Queuing task '158' for executor 201404021638-2315299008-5050-6539-1 of framework '201404032011-2315299008-5050-13793-0000
I0403 20:21:11.009193 24158 slave.cpp:728] Got assigned task 159 for framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:11.009204 24161 process_isolator.cpp:100] Launching 201404021638-2315299008-5050-6539-1 (cd spark-0*; ./sbin/spark-executor) in /tmp/mesos/slaves/201404021638-2315299008-5050-6539-1/frameworks/201404032011-2315299008-5050-13793-0000/executors/201404021638-2315299008-5050-6539-1/runs/99e767c5-5de5-43ba-aebc-86c119b4878f with resources mem(*):512; cpus(*):1 for framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:11.009348 24158 slave.cpp:837] Launching task 159 for framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:11.009418 24158 slave.cpp:947] Queuing task '159' for executor 201404021638-2315299008-5050-6539-1 of framework '201404032011-2315299008-5050-13793-0000
I0403 20:21:11.010208 24161 process_isolator.cpp:163] Forked executor at 25380
I0403 20:21:11.013329 24159 slave.cpp:2090] Monitoring executor 201404021638-2315299008-5050-6539-1 of framework 201404032011-2315299008-5050-13793-0000 forked at pid 25380
I0403 20:21:12.994786 24161 process_isolator.cpp:482] Telling slave of terminated executor '201404021638-2315299008-5050-6539-1' of framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:12.998713 24163 slave.cpp:2146] Executor '201404021638-2315299008-5050-6539-1' of framework 201404032011-2315299008-5050-13793-0000 has exited with status 255
I0403 20:21:13.000233 24163 slave.cpp:1757] Handling status update TASK_LOST (UUID: 2bdef9d2-c323-4d21-9b33-fedf7f8e9729) for task 158 of framework 201404032011-2315299008-5050-13793-0000 from @0.0.0.0:0
I0403 20:21:13.000493 24162 status_update_manager.cpp:314] Received status update TASK_LOST (UUID: 2bdef9d2-c323-4d21-9b33-fedf7f8e9729) for task 158 of framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:13.000701 24162 status_update_manager.cpp:367] Forwarding status update TASK_LOST (UUID: 2bdef9d2-c323-4d21-9b33-fedf7f8e9729) for task 158 of framework 201404032011-2315299008-5050-13793-0000 to master@192.168.0.138:5050
I0403 20:21:13.001829 24163 slave.cpp:1757] Handling status update TASK_LOST (UUID: 4b9b4f38-a83d-40df-8c1c-a7d857c0ce86) for task 159 of framework 201404032011-2315299008-5050-13793-0000 from @0.0.0.0:0
I0403 20:21:13.002003 24162 status_update_manager.cpp:314] Received status update TASK_LOST (UUID: 4b9b4f38-a83d-40df-8c1c-a7d857c0ce86) for task 159 of framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:13.002125 24162 status_update_manager.cpp:367] Forwarding status update TASK_LOST (UUID: 4b9b4f38-a83d-40df-8c1c-a7d857c0ce86) for task 159 of framework 201404032011-2315299008-5050-13793-0000 to master@192.168.0.138:5050
I0403 20:21:13.006757 24157 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 2bdef9d2-c323-4d21-9b33-fedf7f8e9729) for task 158 of framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:13.006875 24157 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 4b9b4f38-a83d-40df-8c1c-a7d857c0ce86) for task 159 of framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:13.006990 24156 slave.cpp:2281] Cleaning up executor '201404021638-2315299008-5050-6539-1' of framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:13.007307 24162 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201404021638-2315299008-5050-6539-1/frameworks/201404032011-2315299008-5050-13793-0000/executors/201404021638-2315299008-5050-6539-1/runs/99e767c5-5de5-43ba-aebc-86c119b4878f' for gc 6.99999991645926days in the future
I0403 20:21:13.007310 24156 slave.cpp:2352] Cleaning up framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:13.007407 24162 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201404021638-2315299008-5050-6539-1/frameworks/201404032011-2315299008-5050-13793-0000/executors/201404021638-2315299008-5050-6539-1' for gc 6.99999991568296days in the future
I0403 20:21:13.007586 24157 status_update_manager.cpp:276] Closing status update streams for framework 201404032011-2315299008-5050-13793-0000
W0403 20:21:13.008318 24161 process_isolator.cpp:268] Failed to kill the process tree rooted at pid 25380: Failed to find process 25380
I0403 20:21:13.008379 24161 process_isolator.cpp:301] Asked to update resources for an unknown/killed executor '201404021638-2315299008-5050-6539-1' of framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:13.008420 24161 process_isolator.cpp:301] Asked to update resources for an unknown/killed executor '201404021638-2315299008-5050-6539-1' of framework 201404032011-2315299008-5050-13793-0000
I0403 20:21:13.007647 24156 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201404021638-2315299008-5050-6539-1/frameworks/201404032011-2315299008-5050-13793-0000' for gc 6.99999991207111days in the future

Thanks and regards,
Gino Mathews K

From: felix [mailto:cnwe...@gmail.com]
Sent: Thursday, April 03, 2014 4:17 PM
To: u...@spark.incubator.apache.org
Subject: Re: Error when run Spark on mesos

You can download this tarball to replace the 0.9.0 one:
wget https://github.com/apache/spark/archive/v0.9.1-rc3.tar.gz
Just compile it and test it!

2014-04-03 18:41 GMT+08:00 Gino Mathews [via Apache Spark User List] <[hidden email]>:

Subject: RE: Error when run Spark on mesos

Hi,

I have installed Spark 0.9.0 on Ubuntu 12.04 LTS with Hadoop 2.2.0 and am able to successfully run a few apps in a 1+2 standalone configuration. I tried both standalone apps and spark-shell. But with every Mesos version I tried, from 0.13.0 to 0.17.0, both the standalone app and spark-shell fail with a segmentation fault during initialization. I haven't seen the Spark app trying to connect to Mesos at all. The invocation log of spark-shell is pasted below. Btw, is there any documentation on how to upgrade Spark to 0.9.1?
Thanks and regards,
Gino Mathews K

From: panfei [mailto:cnwe...@gmail.com]
Sent: Thursday, April 03, 2014 11:37 AM
To: user@spark.apache.org
Subject: Re: Error when run Spark on mesos

After upgrading to 0.9.1, everything goes well now. Thanks for the reply.
Attachments:
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml