Hello Akhil,

Thanks for the response. I will have to figure this out.

Sincerely,
Ashish

On Thu, Jul 9, 2015 at 3:40 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> On Wed, Jul 8, 2015 at 7:31 PM, Ashish Dutt <ashish.du...@gmail.com> wrote:
>
>> Hi,
>>
>> We have a cluster with 4 nodes. The cluster uses CDH 5.4. For the past two
>> days I have been trying to connect my laptop to the server using spark
>> <master ip:port>, but it has been unsuccessful.
>> The server contains data that needs to be cleaned and analysed.
>> The cluster and the nodes run in a Linux environment.
>> To connect to the nodes I am using SSH.
>>
>> Question: Would it be better if I work directly on the nodes rather than
>> trying to connect my laptop to them?
>
> -> You will be able to connect to the master machine in the cloud from your
> laptop, but you need to make sure that the master is able to connect back to
> your laptop (this may require port forwarding on your router, firewalls, etc.).
>
>> Question 2: If yes, then can you suggest any Python and R IDE that I can
>> install on the nodes to make it work?
>
> -> Once the master machine is able to connect to your laptop's public IP,
> you can set the spark.driver.host and spark.driver.port properties and
> your job will get executed on the cluster.
>
>> Thanks for your help
>>
>> Sincerely,
>> Ashish Dutt
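
For reference, the suggestion above about spark.driver.host and spark.driver.port could be sketched roughly as follows. This is only a minimal sketch under assumptions: the helper name driver_settings, the host name master.example.com, the address 203.0.113.10 and the port 51000 are all placeholders that do not come from this thread, and 7077 is assumed as the standalone master port (Spark's default). The helper only builds the property pairs, so it runs without Spark installed.

```python
def driver_settings(master_host, driver_host, driver_port, master_port=7077):
    """Build Spark properties for a driver that runs outside the cluster.

    All arguments are placeholders for illustration; 7077 is assumed
    to be the standalone master port.
    """
    driver_port = int(driver_port)
    if not 0 < driver_port < 65536:
        raise ValueError("driver_port must be between 1 and 65535")
    return [
        # URL of the standalone master the laptop connects to
        ("spark.master", "spark://%s:%d" % (master_host, master_port)),
        # Address the cluster uses to connect back to the laptop
        ("spark.driver.host", driver_host),
        # Fixed driver port, so it can be forwarded through a router or firewall
        ("spark.driver.port", str(driver_port)),
    ]


if __name__ == "__main__":
    # Hypothetical values; replace with the real master and laptop addresses.
    settings = driver_settings("master.example.com", "203.0.113.10", 51000)
    for key, value in settings:
        print("%s=%s" % (key, value))
```

With PySpark installed on the laptop, the pairs could then be applied with something like SparkConf().setAppName("laptop-driver").setAll(settings) before creating the SparkContext. Fixing the driver port, rather than letting Spark pick a random one, is what makes the port forwarding mentioned above practical.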