Re: Providing hdfs name node IP for streaming file sink

2020-03-03 Thread Vishwas Siravara
Thanks Yang. Going with setting HADOOP_CONF_DIR in the Flink application; it integrates neatly with Flink. Best, Nick. On Mon, Mar 2, 2020 at 7:42 PM Yang Wang wrote: > It may work. However, you need to set your own retry policy (similar to > `ConfiguredFailoverProxyProvider` in Hadoop). > A…
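[Editor's note] The HADOOP_CONF_DIR approach settled on above can be sketched roughly as follows; the config path and job jar name are examples, not from the thread:

```shell
# Point Flink's Hadoop integration at the cluster's config directory
# (example path; use wherever core-site.xml / hdfs-site.xml actually live).
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Flink picks the name node (or HA nameservice) up from that config,
# so the sink path no longer needs a hard-coded host:port.
flink run my-streaming-job.jar
```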

Re: Providing hdfs name node IP for streaming file sink

2020-03-02 Thread Yang Wang
It may work. However, you need to set your own retry policy (similar to `ConfiguredFailoverProxyProvider` in Hadoop). Also, if you directly use the namenode address and do not load the HDFS configuration, some HDFS client configuration (e.g. dfs.client.*) will not take effect. Best, Yang Nick Bendtner 于2…
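[Editor's note] The `ConfiguredFailoverProxyProvider` Yang mentions is what the stock HDFS client uses when HA is configured. A minimal sketch of the relevant hdfs-site.xml entries — the nameservice name `mycluster` and the host names are placeholders, not from the thread:

```xml
<!-- Logical nameservice; clients address hdfs://mycluster instead of one host -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<!-- Client-side failover between nn1 and nn2 -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```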

Re: Providing hdfs name node IP for streaming file sink

2020-03-02 Thread Nick Bendtner
Thanks a lot Yang. What are your thoughts on catching the exception when a name node is down and retrying with the secondary name node? Best, Nick. On Sun, Mar 1, 2020 at 9:05 PM Yang Wang wrote: > Hi Nick, > > Certainly you could directly use "namenode:port" as the scheme of your HDFS > path.…
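[Editor's note] The manual catch-and-retry idea Nick floats here (which the thread ultimately steers away from, in favour of loading the HDFS config) would amount to something like the loop below. Everything in it is hypothetical: the addresses, and the `writer` callback standing in for the real HDFS client call.

```java
import java.util.List;
import java.util.function.Function;

public class FailoverWrite {

    // Attempt one write against one name node; treat any runtime
    // failure as "this name node is down" and report false.
    static boolean tryWrite(String namenode, Function<String, Boolean> writer) {
        try {
            return writer.apply(namenode);
        } catch (RuntimeException e) {
            return false;
        }
    }

    // Try each name node in order until one accepts the write;
    // returns the address that succeeded.
    static String writeWithFailover(List<String> namenodes,
                                    Function<String, Boolean> writer) {
        for (String nn : namenodes) {
            if (tryWrite(nn, writer)) {
                return nn;
            }
        }
        throw new IllegalStateException("all namenodes failed");
    }

    public static void main(String[] args) {
        List<String> nns = List.of("nn1:8020", "nn2:8020");
        // Simulate nn1 being down: only nn2 accepts the write.
        String used = writeWithFailover(nns, nn -> {
            if (nn.startsWith("nn1")) throw new RuntimeException("nn1 down");
            return true;
        });
        System.out.println("wrote via " + used); // wrote via nn2:8020
    }
}
```

Note this is exactly the wheel `ConfiguredFailoverProxyProvider` already implements, which is why the thread recommends the HA configuration instead.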

Re: Providing hdfs name node IP for streaming file sink

2020-03-01 Thread Yang Wang
Hi Nick, Certainly you could directly use "namenode:port" as the scheme of your HDFS path. Then the Hadoop configs (e.g. core-site.xml, hdfs-site.xml) will not be necessary. However, that also means you could not benefit from HDFS high availability[1]. If your HDFS cluster is HA configured, I strongly…
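[Editor's note] To make the distinction concrete: with a direct address the authority of the path is a host:port, while with HA you address a logical nameservice and give no port at all. A small sketch using plain `java.net.URI` (host names are made up):

```java
import java.net.URI;

public class HdfsPaths {
    public static void main(String[] args) {
        // Direct name node address: tied to one host, no client-side failover.
        URI direct = URI.create("hdfs://namenode-host:8020/data/out");
        System.out.println(direct.getHost() + ":" + direct.getPort());

        // HA nameservice: a logical name resolved via hdfs-site.xml, so no port.
        URI ha = URI.create("hdfs://mycluster/data/out");
        System.out.println(ha.getHost() + " port=" + ha.getPort()); // port=-1
    }
}
```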

Re: Providing hdfs name node IP for streaming file sink

2020-02-28 Thread Nick Bendtner
To add to this question, do I need to set up env.hadoop.conf.dir to point to the Hadoop config, for instance env.hadoop.conf.dir=/etc/hadoop/, for the JVM? Or is it possible to write to HDFS without any external Hadoop config like core-site.xml, hdfs-site.xml? Best, Nick. On Fri, Feb 28, 2020 at…
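[Editor's note] `env.hadoop.conf.dir` is a Flink configuration key, so it can go in flink-conf.yaml rather than the shell environment; the path below is an example:

```yaml
# flink-conf.yaml: where Flink should look for core-site.xml / hdfs-site.xml
env.hadoop.conf.dir: /etc/hadoop/conf
```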

Providing hdfs name node IP for streaming file sink

2020-02-28 Thread Nick Bendtner
Hi guys, I am trying to write to HDFS from a streaming file sink. Where should I provide the IP address of the name node? Can I provide it as part of the flink-conf.yaml file, or should I provide it like this: final StreamingFileSink sink = StreamingFileSink .forBulkFormat(hdfs://nameno…
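[Editor's note] The truncated snippet above is presumably heading toward something like the following sketch against the Flink 1.x `StreamingFileSink` API. The host, port, and choice of row format (the original uses `forBulkFormat`, which additionally needs a `BulkWriter.Factory`) are assumptions; with HADOOP_CONF_DIR set, the path could equally be a bare `hdfs:///some/dir` or an HA nameservice:

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

// The name node address goes into the sink's target Path, not flink-conf.yaml.
final StreamingFileSink<String> sink = StreamingFileSink
        .forRowFormat(new Path("hdfs://namenode-host:8020/data/out"),
                      new SimpleStringEncoder<String>("UTF-8"))
        .build();

// stream.addSink(sink);  // attach to a DataStream<String>
```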