Re: Support for multiple HDFS

2017-09-24 Thread Haohui Mai
You can definitely use absolute URIs to access two clusters. The configuration just has to be the union of the configurations of the multiple HDFS clusters (e.g., the NameNode lists). Accessing both secure and non-secure clusters is fairly tricky, but it can be done. AFAIK, isolating the Hadoop configuration will require a
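
To illustrate the "union of configurations" idea, here is a minimal sketch, not something taken from the thread. It assumes each cluster's client settings are available as a plain key/value map, and that the clusters are named through the standard `dfs.nameservices` property. The nameservice IDs (`clusterA`, `clusterB`), the host names, and the helper `union_hdfs_conf` are all hypothetical. The sketch refuses to merge keys that disagree, which is one way to see why mixing secure and non-secure clusters in a single shared configuration is tricky.

```python
def union_hdfs_conf(conf_a: dict, conf_b: dict) -> dict:
    """Merge two HDFS client configs into one (hypothetical helper).

    Per-cluster keys are copied over as-is; the comma-separated
    dfs.nameservices lists are concatenated without duplicates.
    A key present in both configs with different values is an error.
    """
    merged = dict(conf_a)
    for key, value in conf_b.items():
        if key == "dfs.nameservices":
            continue  # combined separately below
        if key in merged and merged[key] != value:
            raise ValueError(f"conflicting values for {key!r}")
        merged[key] = value

    # Build the union of nameservice IDs, preserving order.
    names = []
    for conf in (conf_a, conf_b):
        for name in conf.get("dfs.nameservices", "").split(","):
            name = name.strip()
            if name and name not in names:
                names.append(name)
    if names:
        merged["dfs.nameservices"] = ",".join(names)
    return merged


# Hypothetical settings for two independent clusters.
CLUSTER_A = {
    "dfs.nameservices": "clusterA",
    "dfs.ha.namenodes.clusterA": "nn1,nn2",
    "dfs.namenode.rpc-address.clusterA.nn1": "a-nn1.example.com:8020",
    "dfs.namenode.rpc-address.clusterA.nn2": "a-nn2.example.com:8020",
}
CLUSTER_B = {
    "dfs.nameservices": "clusterB",
    "dfs.ha.namenodes.clusterB": "nn1,nn2",
    "dfs.namenode.rpc-address.clusterB.nn1": "b-nn1.example.com:8020",
    "dfs.namenode.rpc-address.clusterB.nn2": "b-nn2.example.com:8020",
}

merged = union_hdfs_conf(CLUSTER_A, CLUSTER_B)
```

With a merged configuration like this, paths would then be addressed per cluster, for example `hdfs://clusterA/...` and `hdfs://clusterB/...`. Global settings such as `hadoop.security.authentication` cannot be merged this way when the two clusters disagree, which is where configuration isolation comes in.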

Re: Support for multiple HDFS

2017-08-24 Thread Vijay Srinivasaraghavan
I think it may not work in a scenario where Hadoop security is enabled and each HCFS setup is configured differently, unless there is a way to isolate the Hadoop configurations used in this case? Regards, Vijay > On Aug 24, 2017, at 2:51 AM, Stephan Ewen wrote: > > Hi! >

Re: Support for multiple HDFS

2017-08-24 Thread Stephan Ewen
Hi! I think it can work if you fully qualify the URIs. For the checkpoint configuration, specify one namenode (in the flink-config.yml or in the constructor of the state backend). Example: statebackend.fs.checkpoint.dir: hdfs://dfsOneNamenode:port/flink/checkpoints For the result (for example
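
As a small illustration of what "fully qualify the URIs" means, the sketch below checks that a path names its cluster explicitly (scheme plus NameNode authority) instead of relying on the default file system. It is only a sketch under assumptions: the port `8020`, the second host `dfsTwoNamenode`, the result path, and the helper `cluster_of` are hypothetical and do not come from the thread. Also note that the configuration key above is reproduced as written in the message; as far as I know, Flink releases of that period spelled it `state.backend.fs.checkpointdir` in `flink-conf.yaml`, so check the documentation for your version.

```python
from urllib.parse import urlsplit

# Hypothetical fully qualified locations on two separate clusters.
CHECKPOINT_DIR = "hdfs://dfsOneNamenode:8020/flink/checkpoints"
RESULT_DIR = "hdfs://dfsTwoNamenode:8020/archive/results"


def cluster_of(uri: str) -> str:
    """Return 'scheme://authority' for a fully qualified URI.

    Raises ValueError if the scheme or the authority is missing,
    i.e. if the path would fall back to the default file system.
    """
    parts = urlsplit(uri)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not a fully qualified URI: {uri!r}")
    return f"{parts.scheme}://{parts.netloc}"


checkpoint_cluster = cluster_of(CHECKPOINT_DIR)
result_cluster = cluster_of(RESULT_DIR)
```

Because both locations carry their own authority, the checkpoint data and the job results can be directed to different NameNodes without either path depending on the cluster-wide default.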

Re: Support for multiple HDFS

2017-08-24 Thread Stefan Richter
Hi, I don’t think that this is currently supported. If you see a use case for this (over creating different root directories for checkpoint data and result data) then I suggest that you open a JIRA issue with a new feature request. Best, Stefan > Am 23.08.2017 um 20:17 schrieb Vijay Srinivasar

Re: Support for multiple HDFS

2017-08-23 Thread Vijay Srinivasaraghavan
Hi Ted, I believe HDFS-6584 is more of an HDFS feature supporting an archive use case through some policy configurations. My ask is that I have two distinct HCFS file systems which are independent, but the Flink job will decide which one to use for the sink while the Flink infrastructure is by default c

Re: Support for multiple HDFS

2017-08-23 Thread Ted Yu
Would HDFS-6584 help with your use case? On Wed, Aug 23, 2017 at 11:00 AM, Vijay Srinivasaraghavan < vijikar...@yahoo.com.invalid> wrote: > Hello, > Is it possible for a Flink cluster to use multiple HDFS repositories (HDFS-1 > for managing Flink state backend, HDFS-2 for syncing results from user

Support for multiple HDFS

2017-08-23 Thread Vijay Srinivasaraghavan
Hello, Is it possible for a Flink cluster to use multiple HDFS repositories (HDFS-1 for managing the Flink state backend, HDFS-2 for syncing results from the user job)? The scenario can be viewed in the context of running some jobs that are meant to push the results to an archive repository (cold storage)