I think it may not work in a scenario where Hadoop security is enabled and
each HCFS setup is configured differently, unless there is a way to isolate
the Hadoop configurations used in that case?
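
To illustrate what I mean by isolating the configurations: outside of Flink,
the rough, untested sketch below (plain Hadoop APIs; the config paths, ports,
principals, and keytabs are just placeholders) is the kind of thing I have in
mind, but UserGroupInformation's security setup is process-wide, which is
exactly the part I am not sure can be isolated.

    import java.net.URI;
    import java.security.PrivilegedExceptionAction;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class TwoSecureHcfsSketch {
        public static void main(String[] args) throws Exception {
            // One Configuration per cluster, loaded from separate config
            // directories (placeholder paths). loadDefaults=false so the
            // default cluster's config on the classpath is not picked up.
            Configuration confOne = new Configuration(false);
            confOne.addResource(new Path("/etc/hadoop-one/core-site.xml"));
            confOne.addResource(new Path("/etc/hadoop-one/hdfs-site.xml"));

            Configuration confTwo = new Configuration(false);
            confTwo.addResource(new Path("/etc/hadoop-two/core-site.xml"));
            confTwo.addResource(new Path("/etc/hadoop-two/hdfs-site.xml"));

            // UGI's security configuration is static (process-wide), so the
            // second cluster has to live with whatever is set here.
            UserGroupInformation.setConfiguration(confOne);

            // Separate Kerberos logins (placeholder principals/keytabs).
            UserGroupInformation ugiOne = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                    "flink/host@REALM.ONE", "/etc/security/keytabs/flink-one.keytab");
            UserGroupInformation ugiTwo = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                    "flink/host@REALM.TWO", "/etc/security/keytabs/flink-two.keytab");

            // Touch each file system under its own login context.
            ugiOne.doAs((PrivilegedExceptionAction<Void>) () -> {
                FileSystem fsOne = FileSystem.get(URI.create("hdfs://dfsOneNamenode:8020/"), confOne);
                System.out.println(fsOne.exists(new Path("/flink/checkpoints")));
                return null;
            });
            ugiTwo.doAs((PrivilegedExceptionAction<Void>) () -> {
                FileSystem fsTwo = FileSystem.get(URI.create("hdfs://dfsTwoNamenode:8020/"), confTwo);
                System.out.println(fsTwo.exists(new Path("/flink/results")));
                return null;
            });
        }
    }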

Regards,
Vijay

Sent from my iPhone

> On Aug 24, 2017, at 2:51 AM, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi!
> 
> I think it can work if you fully qualify the URIs.
> 
> For the checkpoint configuration, specify one namenode (in
> flink-conf.yaml or in the constructor of the state backend).
> Example:   state.backend.fs.checkpointdir:
> hdfs://dfsOneNamenode:port/flink/checkpoints
> 
> For the results (for example, a rolling sink), configure it with
> hdfs://dfsTwoNamenode:otherport/flink/result
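> 
> Roughly, in the job that could look like the following untested sketch
> (Flink 1.3-era APIs; host names and ports are placeholders):
> 
>     import org.apache.flink.runtime.state.filesystem.FsStateBackend;
>     import org.apache.flink.streaming.api.datastream.DataStream;
>     import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>     import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
> 
>     public class TwoHdfsJobSketch {
>         public static void main(String[] args) throws Exception {
>             StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
> 
>             // Checkpoints go to the first HDFS (fully qualified URI).
>             env.setStateBackend(new FsStateBackend("hdfs://dfsOneNamenode:8020/flink/checkpoints"));
>             env.enableCheckpointing(60000);
> 
>             // Placeholder source; in a real job this is the actual input.
>             DataStream<String> results = env.fromElements("a", "b", "c");
> 
>             // Results go to the second HDFS (again a fully qualified URI).
>             results.addSink(new BucketingSink<String>("hdfs://dfsTwoNamenode:8020/flink/result"));
> 
>             env.execute("two-hdfs-sketch");
>         }
>     }
> 
> (BucketingSink is in the flink-connector-filesystem module; the RollingSink
> constructor takes a base path in the same way.)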
> 
> Is that what you are looking for?
> 
> Stephan
> 
> 
> On Thu, Aug 24, 2017 at 11:47 AM, Stefan Richter <
> s.rich...@data-artisans.com> wrote:
> 
>> Hi,
>> 
>> I don’t think that this is currently supported. If you see a use case for
>> this (as opposed to creating different root directories for checkpoint data
>> and result data under the same file system), then I suggest that you open a
>> JIRA issue with a new feature request.
>> 
>> Best,
>> Stefan
>> 
>>> Am 23.08.2017 um 20:17 schrieb Vijay Srinivasaraghavan <
>> vijikar...@yahoo.com>:
>>> 
>>> Hi Ted,
>>> 
>>> I believe HDFS-6584 is more of an HDFS feature that supports archival
>>> storage use cases through storage policy configurations.
>>> 
>>> My ask is this: I have two distinct, independent HCFS file systems, and
>>> the Flink job will decide which one to use for its sink, while the Flink
>>> infrastructure is by default configured with one of these HCFS instances
>>> as the state backend store.
>>> 
>>> Hope this helps.
>>> 
>>> Regards
>>> Vijay
>>> 
>>> 
>>> On Wednesday, August 23, 2017 11:06 AM, Ted Yu <yuzhih...@gmail.com>
>> wrote:
>>> 
>>> 
>>> Would HDFS-6584 help with your use case ?
>>> 
>>> On Wed, Aug 23, 2017 at 11:00 AM, Vijay Srinivasaraghavan <
>>> vijikar...@yahoo.com.invalid>
>> wrote:
>>> 
>>>> Hello,
>>>> Is it possible for a Flink cluster to use multiple HDFS repositories
>>>> (HDFS-1 for managing the Flink state backend, HDFS-2 for syncing results
>>>> from the user job)?
>>>> The scenario can be viewed in the context of running some jobs that are
>>>> meant to push the results to an archive repository (cold storage).
>>>> Since the Hadoop configuration is static, I am thinking it is hard to
>>>> achieve this, but I could be wrong.
>>>> Please share any thoughts.
>>>> Regards
>>>> Vijay
>>> 
>>> 
>> 
>> 
