Hi Spark users,

Could someone help me out?

My company has a fully functioning Spark cluster with Shark running on
top of it (as part of the same cluster, on the same LAN). I'm
interested in running raw Spark code against it but am running into
the following issue -- it seems the machine hosting the driver
program needs to be reachable by the worker nodes (in my case the
workers cannot route to the machine hosting the driver). Below is a
snippet from my worker log:

14/03/03 20:45:28 INFO executor.StandaloneExecutorBackend: Connecting
to driver: akka://spark@driver_ip:49081/user/StandaloneScheduler
14/03/03 20:45:29 ERROR executor.StandaloneExecutorBackend: Driver
terminated or disconnected! Shutting down.

Does this sound right? It's not clear to me why a worker would try
to establish a connection back to the driver -- the driver already
connected successfully, as I see the program listed in the log. Why
is that connection not sufficient?
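For context on what I've tried: as I understand it, in standalone mode the driver hosts the scheduler (the StandaloneScheduler Akka endpoint in the log above), and each executor opens its own connection back to it to fetch tasks, so the workers do need a route to the driver. A rough sketch of pinning the driver to a routable address, assuming the driver machine has one -- "routable_ip" here is a placeholder, not a real address:

```shell
# spark-env.sh on the machine running the driver
# (routable_ip is a placeholder for an address the workers can route to)
export SPARK_LOCAL_IP=routable_ip

# or, as JVM system properties when launching the driver program:
# -Dspark.driver.host=routable_ip -Dspark.driver.port=49081
```

This only helps if the driver has some address the workers can reach; it doesn't solve the case where no route exists at all.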

If you use Amazon EC2, can you run the driver from your personal
machine, or do you have to install an IDE on one of the Amazon
machines in order to debug code? I am not too excited about the EC2
option as our data is proprietary... but if that's the shortest path
to success, it would at least get me started on some toy examples. At
the moment I'm not sure what my options are, other than running a VM
cluster or EC2.

Any help/insight would be greatly appreciated.
