Ping, any comments? Thanks a lot.

Shuyi
On Wed, Jan 3, 2018 at 3:43 PM, Shuyi Chen <suez1...@gmail.com> wrote:

> Thanks a lot for the clarification, Eron. That's very helpful. Currently,
> we are more concerned about 1) data access, but will get to 2) and 3)
> eventually.
>
> I was thinking of doing the following:
> 1) extend the current HadoopModule to use and refresh DTs as suggested in
> the YARN Application Security docs.
> 2) I found the current SecurityModule interface might be enough for
> supporting other security mechanisms. However, the loading of security
> modules is hard-coded, not configuration-based. I think we can extend
> SecurityUtils to load modules from configuration. That way we can
> implement our own security mechanism in our internal repo and have Flink
> jobs load it at runtime.
>
> Please let me know your comments. Thanks a lot.
>
> On Fri, Dec 22, 2017 at 3:05 PM, Eron Wright <eronwri...@gmail.com> wrote:
>
>> I agree that it is reasonable to use Hadoop DTs as you describe. That
>> approach is even recommended in YARN's documentation (see "Securing
>> Long-lived YARN Services" on the YARN Application Security page). But
>> one of the goals of Kerberos integration is to support Kerberized data
>> access for connectors other than HDFS, such as Kafka, Cassandra, and
>> Elasticsearch. So your second point makes sense too, suggesting a
>> general architecture for managing secrets (DTs, keytabs, certificates,
>> OAuth tokens, etc.) within the cluster.
>>
>> There are quite a few aspects to Flink security, including:
>> 1. data access (e.g. how a connector authenticates to a data source)
>> 2. service authorization and network security (e.g. how a Flink cluster
>> protects itself from unauthorized access)
>> 3. multi-user support (e.g. multi-user Flink clusters, RBAC)
>>
>> I mention these aspects to clarify your point about AuthN, which I took
>> to be related to (1). Do tell if I misunderstood.
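[Editor's note] The configuration-based module loading described above (extending SecurityUtils so modules are discovered from configuration rather than hard-coded) could be sketched roughly as below. This is a minimal, self-contained illustration, not Flink's actual API: the `SecurityModule` interface here is a stand-in for Flink's, and the comma-separated class-name config value is a hypothetical format.

```java
import java.util.ArrayList;
import java.util.List;

public class SecurityModuleLoader {

    /** Minimal stand-in for Flink's SecurityModule interface (illustrative only). */
    public interface SecurityModule {
        void install() throws Exception;
    }

    /** Example module; in practice this could be HadoopModule or a custom AuthN module. */
    public static class NoOpModule implements SecurityModule {
        public boolean installed = false;
        @Override
        public void install() {
            installed = true;
        }
    }

    /**
     * Load module implementations named in a (hypothetical) config value such as
     * "security.modules: com.example.MyAuthNModule,org.apache.flink...HadoopModule",
     * via reflection instead of a hard-coded list. A class shipped in the user's
     * jar can then be picked up at runtime through dynamic class loading.
     */
    public static List<SecurityModule> loadModules(String commaSeparatedClassNames)
            throws Exception {
        List<SecurityModule> modules = new ArrayList<>();
        for (String className : commaSeparatedClassNames.split(",")) {
            Class<?> clazz = Class.forName(className.trim());
            modules.add((SecurityModule) clazz.getDeclaredConstructor().newInstance());
        }
        return modules;
    }

    public static void main(String[] args) throws Exception {
        // Load the example module by name, as if read from configuration.
        List<SecurityModule> modules =
                loadModules(SecurityModuleLoader.class.getName() + "$NoOpModule");
        for (SecurityModule m : modules) {
            m.install();
        }
        System.out.println("Installed " + modules.size() + " module(s)");
    }
}
```

A real implementation would additionally need the module's context/configuration passed to a factory, and error handling for missing or incompatible classes.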
>>
>> Eron
>>
>> On Wed, Dec 20, 2017 at 11:21 AM, Shuyi Chen <suez1...@gmail.com> wrote:
>>
>> > Hi community,
>> >
>> > We are working on secure Flink on YARN. The current Flink-YARN-Kerberos
>> > integration requires each container of a job to log in to Kerberos via
>> > keytab every, say, 24 hours, and does not use any Hadoop delegation
>> > token mechanism except when localizing the container. As I fixed the
>> > current Flink-YARN-Kerberos integration (FLINK-8275) and tried to add
>> > more features (FLINK-7860), I developed some concerns about the current
>> > implementation. It can pose a scalability issue for the KDC, e.g., if
>> > the YARN cluster is restarted and tens of thousands of containers
>> > suddenly DDoS the KDC.
>> >
>> > I would like to propose improving the current Flink-YARN-Kerberos
>> > integration along the following lines:
>> > 1) The AppMaster (JobManager) periodically authenticates to the KDC and
>> > gets all required DTs for the job.
>> > 2) All other TM or TE containers periodically retrieve new DTs from the
>> > AppMaster (either through a secure HDFS folder or a secure Akka
>> > channel).
>> >
>> > Also, we want to extend Flink to support a pluggable AuthN mechanism,
>> > because we have our own internal AuthN mechanism. We would like to add
>> > support in Flink to authenticate periodically to our internal AuthN
>> > service as well through, e.g., dynamic class loading, and use a similar
>> > mechanism to distribute the credentials from the AppMaster to
>> > containers.
>> >
>> > I would like to get comments and feedback. I can also write a design
>> > doc or create a FLIP if needed. Thanks a lot.
>> >
>> > Shuyi
>> >
>> > --
>> > "So you have to trust that the dots will somehow connect in your
>> > future."
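[Editor's note] The refresh loop proposed in the quoted thread (the AppMaster periodically fetches fresh DTs; TMs/TEs then pull the latest credentials) could be sketched roughly as follows. All names here (`TokenRefresher`, `Credentials`, the fetcher callback) are illustrative stand-ins, not Flink or Hadoop APIs; in a real implementation the fetcher would authenticate to the KDC (or a custom AuthN service) and the published credentials would be written to a secure HDFS folder or pushed over a secure channel.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class TokenRefresher {

    /** Hypothetical stand-in for a bundle of delegation tokens. */
    public static class Credentials {
        public final String token;
        public Credentials(String token) {
            this.token = token;
        }
    }

    private final AtomicReference<Credentials> latest = new AtomicReference<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /**
     * Periodically invoke the fetcher for fresh tokens. In the AppMaster this
     * would log in via keytab and obtain new DTs well before expiry.
     */
    public void start(Callable<Credentials> fetcher, long periodMillis) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                latest.set(fetcher.call());
            } catch (Exception e) {
                // A real implementation would log and retry with backoff,
                // keeping the previous (still-valid) tokens in the meantime.
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** TM/TE containers would read the latest credentials from here. */
    public Credentials current() {
        return latest.get();
    }

    public void stop() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        TokenRefresher refresher = new TokenRefresher();
        // Simulated fetch; a real fetcher would contact the KDC.
        refresher.start(() -> new Credentials("DT-" + System.currentTimeMillis()), 50);
        Thread.sleep(200);
        System.out.println("have token: " + (refresher.current() != null));
        refresher.stop();
    }
}
```

The key scalability property of the proposal is visible in the shape of this sketch: only one process (the AppMaster) runs the fetch loop against the KDC, while all other containers read from the published location, so a full cluster restart does not multiply KDC traffic by the container count.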