Thanks for the response. I'll admit I'm rather new to Mesos. I can't use the Mesos web portal effectively with my setup: I'm not connected by VPN, so although I SSH tunnelled the mesos-master dashboard, its links to local network addresses aren't working.
Anyway, I was able to dig up some logs for a failed job (framework?) run on one of my slaves "20150322-040336-606645514-5050-2744-0037":

$ cat mesos-slave.INFO | grep 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.004115 2524 slave.cpp:1083] Got assigned task 1 for framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.004812 2524 slave.cpp:1193] Launching task 1 for framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.005879 2524 slave.cpp:3997] Launching executor 1 of framework 20150322-040336-606645514-5050-2744-0037 in work directory '/tmp/mesos/slaves/20150322-040336-606645514-5050-2744-S1/frameworks/20150322-040336-606645514-5050-2744-0037/executors/1/runs/79cf96ba-bf58-45cd-927b-f6c864f6e44b'
I0329 20:34:26.006145 2524 slave.cpp:1316] Queuing task '1' for executor 1 of framework '20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.006722 2531 containerizer.cpp:424] Starting container '79cf96ba-bf58-45cd-927b-f6c864f6e44b' for executor '1' of framework '20150322-040336-606645514-5050-2744-0037'
I0329 20:34:26.089171 2529 slave.cpp:2840] Monitoring executor '1' of framework '20150322-040336-606645514-5050-2744-0037' in container '79cf96ba-bf58-45cd-927b-f6c864f6e44b'
I0329 20:34:26.108610 2529 slave.cpp:1860] Got registration for executor '1' of framework 20150322-040336-606645514-5050-2744-0037 from executor(1)@10.217.7.180:52410
I0329 20:34:26.109136 2529 slave.cpp:1979] Flushing queued task 1 for executor '1' of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.112584 2527 slave.cpp:2215] Handling status update TASK_RUNNING (UUID: 61e6f703-ae25-4e31-88a7-0464b8bd8249) for task 1 of framework 20150322-040336-606645514-5050-2744-0037 from executor(1)@10.217.7.180:52410
I0329 20:34:26.112751 2527 status_update_manager.cpp:317] Received status update TASK_RUNNING (UUID: 61e6f703-ae25-4e31-88a7-0464b8bd8249) for task 1 of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.113052 2527 slave.cpp:2458] Forwarding the update TASK_RUNNING (UUID: 61e6f703-ae25-4e31-88a7-0464b8bd8249) for task 1 of framework 20150322-040336-606645514-5050-2744-0037 to master@10.173.40.36:5050
I0329 20:34:26.113131 2527 slave.cpp:2391] Sending acknowledgement for status update TASK_RUNNING (UUID: 61e6f703-ae25-4e31-88a7-0464b8bd8249) for task 1 of framework 20150322-040336-606645514-5050-2744-0037 to executor(1)@10.217.7.180:52410
I0329 20:34:26.115972 2527 status_update_manager.cpp:389] Received status update acknowledgement (UUID: 61e6f703-ae25-4e31-88a7-0464b8bd8249) for task 1 of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.214292 2530 slave.cpp:2215] Handling status update TASK_FAILED (UUID: f91beb0f-3099-4313-97b7-25f7ff69913c) for task 1 of framework 20150322-040336-606645514-5050-2744-0037 from executor(1)@10.217.7.180:52410
I0329 20:34:26.215005 2526 status_update_manager.cpp:317] Received status update TASK_FAILED (UUID: f91beb0f-3099-4313-97b7-25f7ff69913c) for task 1 of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.215144 2526 slave.cpp:2458] Forwarding the update TASK_FAILED (UUID: f91beb0f-3099-4313-97b7-25f7ff69913c) for task 1 of framework 20150322-040336-606645514-5050-2744-0037 to master@10.173.40.36:5050
I0329 20:34:26.215277 2526 slave.cpp:2391] Sending acknowledgement for status update TASK_FAILED (UUID: f91beb0f-3099-4313-97b7-25f7ff69913c) for task 1 of framework 20150322-040336-606645514-5050-2744-0037 to executor(1)@10.217.7.180:52410
I0329 20:34:26.222218 2524 status_update_manager.cpp:389] Received status update acknowledgement (UUID: f91beb0f-3099-4313-97b7-25f7ff69913c) for task 1 of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.239357 2524 slave.cpp:1083] Got assigned task 4 for framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.239853 2524 slave.cpp:1193] Launching task 4 for framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.240880 2524 slave.cpp:3997] Launching executor 4 of framework 20150322-040336-606645514-5050-2744-0037 in work directory '/tmp/mesos/slaves/20150322-040336-606645514-5050-2744-S1/frameworks/20150322-040336-606645514-5050-2744-0037/executors/4/runs/e3cf195d-525b-4148-aa38-1789d378a948'
I0329 20:34:26.241065 2524 slave.cpp:1316] Queuing task '4' for executor 4 of framework '20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.241554 2528 containerizer.cpp:424] Starting container 'e3cf195d-525b-4148-aa38-1789d378a948' for executor '4' of framework '20150322-040336-606645514-5050-2744-0037'
I0329 20:34:26.292538 2527 slave.cpp:2840] Monitoring executor '4' of framework '20150322-040336-606645514-5050-2744-0037' in container 'e3cf195d-525b-4148-aa38-1789d378a948'
I0329 20:34:26.313694 2527 slave.cpp:1860] Got registration for executor '4' of framework 20150322-040336-606645514-5050-2744-0037 from executor(1)@10.217.7.180:55646
I0329 20:34:26.314398 2527 slave.cpp:1979] Flushing queued task 4 for executor '4' of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.324579 2531 slave.cpp:2215] Handling status update TASK_RUNNING (UUID: 0a6624b9-74a2-44df-b1e9-007d89602e68) for task 4 of framework 20150322-040336-606645514-5050-2744-0037 from executor(1)@10.217.7.180:55646
I0329 20:34:26.324774 2527 status_update_manager.cpp:317] Received status update TASK_RUNNING (UUID: 0a6624b9-74a2-44df-b1e9-007d89602e68) for task 4 of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.325001 2531 slave.cpp:2458] Forwarding the update TASK_RUNNING (UUID: 0a6624b9-74a2-44df-b1e9-007d89602e68) for task 4 of framework 20150322-040336-606645514-5050-2744-0037 to master@10.173.40.36:5050
I0329 20:34:26.325150 2531 slave.cpp:2391] Sending acknowledgement for status update TASK_RUNNING (UUID: 0a6624b9-74a2-44df-b1e9-007d89602e68) for task 4 of framework 20150322-040336-606645514-5050-2744-0037 to executor(1)@10.217.7.180:55646
I0329 20:34:26.328096 2529 status_update_manager.cpp:389] Received status update acknowledgement (UUID: 0a6624b9-74a2-44df-b1e9-007d89602e68) for task 4 of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.425070 2526 slave.cpp:2215] Handling status update TASK_FAILED (UUID: e4a656cb-1be6-4875-b2bd-e2d756c78c11) for task 4 of framework 20150322-040336-606645514-5050-2744-0037 from executor(1)@10.217.7.180:55646
I0329 20:34:26.425870 2526 status_update_manager.cpp:317] Received status update TASK_FAILED (UUID: e4a656cb-1be6-4875-b2bd-e2d756c78c11) for task 4 of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:26.426008 2526 slave.cpp:2458] Forwarding the update TASK_FAILED (UUID: e4a656cb-1be6-4875-b2bd-e2d756c78c11) for task 4 of framework 20150322-040336-606645514-5050-2744-0037 to master@10.173.40.36:5050
I0329 20:34:26.426118 2526 slave.cpp:2391] Sending acknowledgement for status update TASK_FAILED (UUID: e4a656cb-1be6-4875-b2bd-e2d756c78c11) for task 4 of framework 20150322-040336-606645514-5050-2744-0037 to executor(1)@10.217.7.180:55646
I0329 20:34:26.429636 2528 status_update_manager.cpp:389] Received status update acknowledgement (UUID: e4a656cb-1be6-4875-b2bd-e2d756c78c11) for task 4 of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:27.306196 2528 slave.cpp:2898] Executor '1' of framework 20150322-040336-606645514-5050-2744-0037 exited with status 0
I0329 20:34:27.306296 2528 slave.cpp:3007] Cleaning up executor '1' of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:27.306550 2531 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150322-040336-606645514-5050-2744-S1/frameworks/20150322-040336-606645514-5050-2744-0037/executors/1/runs/79cf96ba-bf58-45cd-927b-f6c864f6e44b' for gc 6.99999645247704days in the future
I0329 20:34:27.306653 2531 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150322-040336-606645514-5050-2744-S1/frameworks/20150322-040336-606645514-5050-2744-0037/executors/1' for gc 6.99999645160889days in the future
I0329 20:34:27.503298 2524 slave.cpp:2898] Executor '4' of framework 20150322-040336-606645514-5050-2744-0037 exited with status 0
I0329 20:34:27.503384 2524 slave.cpp:3007] Cleaning up executor '4' of framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:27.503510 2526 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150322-040336-606645514-5050-2744-S1/frameworks/20150322-040336-606645514-5050-2744-0037/executors/4/runs/e3cf195d-525b-4148-aa38-1789d378a948' for gc 6.99999417290667days in the future
I0329 20:34:27.503553 2524 slave.cpp:3084] Cleaning up framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:27.503566 2526 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150322-040336-606645514-5050-2744-S1/frameworks/20150322-040336-606645514-5050-2744-0037/executors/4' for gc 6.99999417236148days in the future
I0329 20:34:27.503608 2526 status_update_manager.cpp:279] Closing status update streams for framework 20150322-040336-606645514-5050-2744-0037
I0329 20:34:27.503638 2524 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150322-040336-606645514-5050-2744-S1/frameworks/20150322-040336-606645514-5050-2744-0037' for gc 6.99999417116741days in the future
I0329 20:35:50.453316 2526 slave.cpp:1533] Asked to shut down framework 20150322-040336-606645514-5050-2744-0037 by master@10.173.40.36:5050
W0329 20:35:50.453419 2526 slave.cpp:1548] Cannot shut down unknown framework 20150322-040336-606645514-5050-2744-0037
I0329 20:39:26.006376 2530 slave.cpp:3237] Framework 20150322-040336-606645514-5050-2744-0037 seems to have exited. Ignoring registration timeout for executor '1'
I0329 20:39:26.241459 2524 slave.cpp:3237] Framework 20150322-040336-606645514-5050-2744-0037 seems to have exited. Ignoring registration timeout for executor '4'

$ cat mesos-slave.WARNING | grep 20150322-040336-606645514-5050-2744-0037
W0329 20:35:50.453419 2526 slave.cpp:1548] Cannot shut down unknown framework 20150322-040336-606645514-5050-2744-0037

There's nothing in mesos-slave.ERROR for this framework ID.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-t-run-spark-submit-with-an-application-jar-on-a-Mesos-cluster-tp22277p22282.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org