Hi Richard,

Good suggestion. I have just created a Jira ticket and will find time this week
to update the docs.
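
For reference, option 1 from the original question would look roughly like
this (the paths below are placeholders, not tested values): ship the folder
containing your custom core-site.xml with the YARN CLI's ship option, e.g.

flink run -m yarn-cluster -yt /path/to/conf-dir your-job.jar

As noted earlier in the thread, fs.hdfs.hadoopconf should not need to be set
explicitly, since every file in the shipped folder is registered as a YARN
local resource for the containers.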

Best Regards
Peter Huang

On Wed, Sep 25, 2019 at 8:05 AM Richard Deurwaarder <rich...@xeli.eu> wrote:

> Hi Peter and Jiayi,
>
> Thanks for the answers this worked perfectly, I just added
>
> containerized.master.env.GOOGLE_APPLICATION_CREDENTIALS=xyz
> and
> containerized.taskmanager.env.GOOGLE_APPLICATION_CREDENTIALS=xyz
>
> to my flink config and they got picked up.
>
> Do you know why this is missing from the docs? If it's not intentional it
> might be nice to add it.
>
> Richard
>
> On Tue, Sep 24, 2019 at 5:53 PM Peter Huang <huangzhenqiu0...@gmail.com>
> wrote:
>
>> Hi Richard,
>>
>> For the first question, I don't think you need to explicitly specify
>> fs.hdfs.hadoopconf, as each file in the ship folder is copied as a YARN
>> local resource for the containers. The configuration path is
>> overridden internally in Flink.
>>
>> For the second question of setting TM environment variables, please use
>> these two configurations in your flink conf.
>>
>> /**
>>  * Prefix for passing custom environment variables to Flink's master process.
>>  * For example for passing LD_LIBRARY_PATH as an env variable to the AppMaster, set:
>>  * containerized.master.env.LD_LIBRARY_PATH: "/usr/lib/native"
>>  * in the flink-conf.yaml.
>>  */
>> public static final String CONTAINERIZED_MASTER_ENV_PREFIX = "containerized.master.env.";
>>
>> /**
>>  * Similar to the {@see CONTAINERIZED_MASTER_ENV_PREFIX}, this configuration prefix allows
>>  * setting custom environment variables for the workers (TaskManagers).
>>  */
>> public static final String CONTAINERIZED_TASK_MANAGER_ENV_PREFIX = "containerized.taskmanager.env.";
>>
>>
>>
>> Best Regards
>>
>> Peter Huang
>>
>>
>>
>>
>> On Tue, Sep 24, 2019 at 8:02 AM Richard Deurwaarder <rich...@xeli.eu>
>> wrote:
>>
>>> Hello,
>>>
>>> We have our Flink job (1.8.0) running on our Hadoop 2.7 cluster with
>>> YARN. We would like to add the GCS connector to use GCS rather than HDFS.
>>> Following the documentation of the GCS connector[1] we have to specify
>>> which credentials we want to use and there are two ways of doing this:
>>>   * Edit core-site.xml
>>>   * Set an environment variable: GOOGLE_APPLICATION_CREDENTIALS
>>>
>>> Because we're on a company shared hadoop cluster we do not want to
>>> change the cluster wide core-site.xml.
>>>
>>> This leaves me with two options:
>>>
>>> 1. Create a custom core-site.xml and use --yarnship to send it to all
>>> the taskmanager containers. If I do this, to what value should I set
>>> fs.hdfs.hadoopconf[2] in flink-conf?
>>> 2. The second option would be to set an environment variable, however
>>> because the taskmanagers are started via yarn I'm having trouble figuring
>>> out how to make sure this environment variable is set for each yarn
>>> container / taskmanager.
>>>
>>> I would appreciate any help you can provide.
>>>
>>> Thank you,
>>>
>>> Richard
>>>
>>> [1]
>>> https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcs/INSTALL.md#configure-hadoop
>>> [2]
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/config.html#hdfs
>>>
>>