https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

You don't have to rely on a single NN. You can specify a kind of "NN HA
alias" (an HDFS nameservice), and the underlying HDFS client will connect
to whichever NN is active right now. Thanks for pointing out
HADOOP_CONF_DIR, it seems like that's the thing I need.
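For anyone who hits this later, here is a minimal sketch (Scala, Spark 2.x)
of what the nameservice setup looks like when set programmatically on the
Hadoop configuration. The nameservice name "clusterA", the host names, and
the paths are made up for illustration; normally these keys would live in
hdfs-site.xml under HADOOP_CONF_DIR:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("ha-alias-sketch").getOrCreate()
    val conf = spark.sparkContext.hadoopConfiguration

    // Declare the logical nameservice and its two NameNodes
    // (illustrative names and hosts, not from this thread).
    conf.set("dfs.nameservices", "clusterA")
    conf.set("dfs.ha.namenodes.clusterA", "nn1,nn2")
    conf.set("dfs.namenode.rpc-address.clusterA.nn1", "nn1.cluster-a.example.com:8020")
    conf.set("dfs.namenode.rpc-address.clusterA.nn2", "nn2.cluster-a.example.com:8020")
    // Let the client fail over to whichever NameNode is currently active.
    conf.set("dfs.client.failover.proxy.provider.clusterA",
      "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider")

    // Read via the logical alias; no physical NameNode host in the URI.
    val df = spark.read.csv("hdfs://clusterA/data/input.csv")

Because each nameservice gets its own key suffix, you can declare several
of them in one configuration and address every cluster by its alias.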
2017-03-26 14:31 GMT+02:00 Jianfeng (Jeff) Zhang <[email protected]>:

> What do you mean by non-reliable? If you want to read/write two Hadoop
> clusters in one program, I am afraid this is the only way. It is
> impossible to specify multiple HADOOP_CONF_DIR entries under one JVM
> classpath. Only one default configuration will be used.
>
> Best Regards,
> Jeff Zhang
>
> From: Serega Sheypak <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Sunday, March 26, 2017 at 7:47 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: Setting Zeppelin to work with multiple Hadoop clusters when
> running Spark.
>
> I know it, thanks, but it's a non-reliable solution.
>
> 2017-03-26 5:23 GMT+02:00 Jianfeng (Jeff) Zhang <[email protected]>:
>
>> You can try to specify the namenode address for the HDFS file, e.g.
>>
>> spark.read.csv("hdfs://localhost:9009/file")
>>
>> Best Regards,
>> Jeff Zhang
>>
>> From: Serega Sheypak <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Sunday, March 26, 2017 at 2:47 AM
>> To: "[email protected]" <[email protected]>
>> Subject: Setting Zeppelin to work with multiple Hadoop clusters when
>> running Spark.
>>
>> Hi, I have three Hadoop clusters. Each cluster has its own NN HA
>> configured and YARN.
>> I want to allow users to read from any cluster and write to any cluster.
>> Users should also be able to choose where to run the Spark job.
>> What is the right way to configure this in Zeppelin?
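And a sketch of the earlier suggestion in this thread: reading from one
cluster and writing to another in the same job by fully qualifying the
NameNode in each URI (hosts, ports, and paths are illustrative):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("cross-cluster-sketch").getOrCreate()

    // Read from cluster A by naming its NameNode explicitly...
    val df = spark.read.csv("hdfs://nn.cluster-a.example.com:8020/data/events.csv")

    // ...and write to cluster B the same way.
    df.write.parquet("hdfs://nn.cluster-b.example.com:8020/output/events")

The catch, as noted above, is that a hard-coded NameNode host stops working
on failover, which is why the HA alias approach is the more reliable route.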
