I am trying to run Spark applications with the driver running locally and interacting with a firewalled remote cluster via a SOCKS proxy.
I have to modify the hadoop configuration on the *local machine* to try to make this work, adding

    <property>
      <name>hadoop.rpc.socket.factory.class.default</name>
      <value>org.apache.hadoop.net.SocksSocketFactory</value>
    </property>
    <property>
      <name>hadoop.socks.server</name>
      <value>localhost:9998</value>
    </property>

and on the *remote cluster* side

    <property>
      <name>hadoop.rpc.socket.factory.class.default</name>
      <value>org.apache.hadoop.net.StandardSocketFactory</value>
      <final>true</final>
    </property>

With this setup, and running "ssh -D 9998 gateway.host" to start the proxy connection, MapReduce jobs started on the local machine execute fine on the remote cluster.

However, trying to launch a Spark job fails with the nodes of the cluster apparently unable to communicate with one another:

    java.io.IOException: Failed on local exception: java.net.SocketException: Connection refused; Host Details : local host is: "node3/10.211.55.103"; destination host is: "node1":8030;

Looking at the packets being sent to node1 from node3, it's clear that no requests are made on port 8030, hinting that the connection is somehow being proxied. Is it possible that the Spark job is not honoring the socket.factory settings on the *cluster* side for some reason?

Note that Spark JIRA 5004 <https://issues.apache.org/jira/browse/SPARK-5004> addresses a similar problem, though it looks like they are actually not the same (since in that case it sounds like a standalone cluster is being used).
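As a sanity check on the cluster-side setting, here is a minimal Python sketch that reads a Hadoop-style configuration document and reports the value of a property and whether it is marked final. The function name read_property and the embedded CLUSTER_SITE sample are my own illustration, not part of Hadoop; the sample simply mirrors the cluster-side snippet above. It only inspects one XML document, so it does not reproduce how Hadoop merges several resources, and it cannot show what configuration a running Spark container actually received.

```python
import xml.etree.ElementTree as ET

KEY = "hadoop.rpc.socket.factory.class.default"

# Sample input mirroring the cluster-side snippet above (illustrative only).
CLUSTER_SITE = """<configuration>
  <property>
    <name>hadoop.rpc.socket.factory.class.default</name>
    <value>org.apache.hadoop.net.StandardSocketFactory</value>
    <final>true</final>
  </property>
</configuration>"""


def read_property(xml_text, name):
    """Return (value, is_final) for `name`, or None if it is not declared.

    Simplification: within this single document, a later definition
    replaces an earlier one unless the earlier one was marked final.
    """
    root = ET.fromstring(xml_text)
    result = None
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() != name:
            continue
        value = (prop.findtext("value") or "").strip()
        is_final = (prop.findtext("final") or "").strip().lower() == "true"
        result = (value, is_final)
        if is_final:
            # A final definition is not overridden by later ones.
            break
    return result


if __name__ == "__main__":
    print(KEY, "->", read_property(CLUSTER_SITE, KEY))
```

To use it against a real node, replace CLUSTER_SITE with the text of that node's core-site.xml (wherever it lives on your installation) and compare the result with what you expect the cluster side to use.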