Thanks a lot Yang. What are your thoughts on catching the exception when a
namenode is down and retrying with the secondary namenode?

Best,
Nick.

On Sun, Mar 1, 2020 at 9:05 PM Yang Wang <danrtsey...@gmail.com> wrote:

> Hi Nick,
>
> Certainly you could directly use "namenode:port" as the authority of your
> HDFS path. Then the Hadoop configs (e.g. core-site.xml, hdfs-site.xml) will
> not be necessary. However, that also means you could not benefit from HDFS
> high-availability[1].
>
> If your HDFS cluster is configured for HA, I strongly suggest you set
> "HADOOP_CONF_DIR" for your Flink application. It needs to be set on both the
> client and the cluster (JM/TM) side. Then your HDFS path could be specified
> like "hdfs://myhdfs/flink/test", where "myhdfs" is the name service
> configured in hdfs-site.xml.
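>
> A minimal sketch of what that looks like on the sink side (this assumes the
> StreamingFileSink with Parquet/Avro bulk output from your snippet below; the
> "schema" variable and the bucket path are placeholders, and "myhdfs" is the
> name service configured in hdfs-site.xml as described in the HA doc [1]):
>
> import org.apache.avro.generic.GenericRecord;
> import org.apache.flink.core.fs.Path;
> import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
> import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
>
> // Export HADOOP_CONF_DIR (the directory holding core-site.xml and
> // hdfs-site.xml) on both the client and the JM/TM before submitting, then
> // reference the HA name service instead of a single namenode:port.
> final StreamingFileSink<GenericRecord> sink = StreamingFileSink
>     .forBulkFormat(
>         new Path("hdfs://myhdfs/flink/test"),            // "myhdfs" = configured name service
>         ParquetAvroWriters.forGenericRecord(schema))     // schema: your Avro Schema (placeholder)
>     .build();
>
> With the name service in the path, the HDFS client resolves the active
> namenode from hdfs-site.xml, so the job does not hard-code a single
> namenode address.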
>
>
> Best,
> Yang
>
>
>
> [1].
> http://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
>
> Nick Bendtner <buggi...@gmail.com> wrote on Sat, Feb 29, 2020 at 6:00 AM:
>
>> To add to this question, do I need to set up env.hadoop.conf.dir to point
>> to the Hadoop config, for instance env.hadoop.conf.dir=/etc/hadoop/, for
>> the JVM? Or is it possible to write to HDFS without any external Hadoop
>> config like core-site.xml and hdfs-site.xml?
>>
>> Best,
>> Nick.
>>
>>
>>
>> On Fri, Feb 28, 2020 at 12:56 PM Nick Bendtner <buggi...@gmail.com>
>> wrote:
>>
>>> Hi guys,
>>> I am trying to write to hdfs from streaming file sink. Where should I
>>> provide the IP address of the name node ? Can I provide it as a part of the
>>> flink-config.yaml file or should I provide it like this :
>>>
>>> final StreamingFileSink<GenericRecord> sink = StreamingFileSink
>>>     .forBulkFormat(new Path("hdfs://namenode:8020/flink/test"),
>>>         ParquetAvroWriters.forGenericRecord(schema))
>>>     .build();
>>>
>>>
>>> Best,
>>> Nick
>>>
>>>
>>>
