Hi,

We have a cluster with 4 nodes running CDH 5.4. For the past two
days I have been trying to connect my laptop to the server using spark
<master ip:port>, but it has been unsuccessful.
The server contains data that needs to be cleaned and analysed.
The cluster and its nodes run Linux.
To connect to the nodes I am using SSH.

Question 1: Would it be better to work directly on the nodes rather than
trying to connect my laptop to them?
Question 2: If yes, can you suggest any Python and R IDEs that I can
install on the nodes for this?

Thanks for your help


Sincerely,
Ashish Dutt
