Hi Prasanna Santhanam,

As far as I know, Zeppelin does not provide a cluster-ssh interpreter. (If I am wrong, someone please let me know.)
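Without such an interpreter, installing a package everywhere comes down to one SSH call per node. A minimal sketch of that idea in Python, assuming passwordless SSH to every node is already set up. The host names and the helper `build_pip_commands` are hypothetical, and the commands are only printed here, not executed:

```python
import shlex


def build_pip_commands(hosts, packages, user=None):
    """Return one ssh command (as an argv list) per host that pip-installs the packages."""
    # Quote each package name so the remote shell sees it as a single word.
    remote = "pip install " + " ".join(shlex.quote(p) for p in packages)
    commands = []
    for host in hosts:
        target = f"{user}@{host}" if user else host
        commands.append(["ssh", target, remote])
    return commands


if __name__ == "__main__":
    hosts = ["node1", "node2"]  # hypothetical worker host names
    for cmd in build_pip_commands(hosts, ["numpy", "scipy"]):
        # Print only; to really run a command, pass the list to
        # subprocess.run(cmd, check=True).
        print(" ".join(shlex.quote(part) for part in cmd))
```

This is roughly what tools like cssh and pssh automate, so it is only meant to show the shape of the task.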
In my case, I use *clusterssh (cssh)*. The screenshot below shows it (copied from the Internet). There is another tool called parallel-ssh (pssh), but I prefer cssh because I can watch every node's output.

Or maybe you can consider setting up *NFS (Network File System)* so that every node has the same Python environment.

But actually, both of the solutions above take a lot of work to set up. Is there any other way that uses only PySpark features? If anyone knows, please help.

By the way, I think a cluster-ssh interpreter would be a cool feature.

On Sat, Oct 22, 2016 at 12:31 PM, Prasanna Santhanam <t...@apache.org> wrote:

> Hello All,
>
> I've been using Apache Zeppelin against Apache Spark clusters and with
> PySpark. One of the things I often tend to do is install libraries and
> packages on my cluster. For instance I would like numpy, scipy and other
> data science libraries present on my cluster for data analysis. However,
> the %sh interpreter only works on my Zeppelin host for any pip install
> commands.
>
> - How are other users tackling this problem?
> - Do you have a base set of libraries always installed?
> - Is there a clustered shell interpreter over SSH that Apache Zeppelin
> provides?
> *(*I looked but didn't find any issues/pull requests related to this ask*)*
>
> Thanks,
>

--
Taejun Kim
Data Mining Lab.
School of Electrical and Computer Engineering
University of Seoul