Re: [DISCUSS] FLIP-211: Kerberos delegation token framework

Gabor Somogyi Mon, 31 Jan 2022 02:29:20 -0800

Not sure whether the mentioned write permission has already been granted, but I still
don't see any edit button.


G


On Fri, Jan 28, 2022 at 5:08 PM Gabor Somogyi <[email protected]>
wrote:

> Hi Robert,
>
> That would be awesome.
>
> My cwiki username: gaborgsomogyi
>
> G
>
>
> On Fri, Jan 28, 2022 at 5:06 PM Robert Metzger <[email protected]>
> wrote:
>
>> Hey Gabor,
>>
>> let me know your cwiki username, and I can give you write permissions.
>>
>>
>> On Fri, Jan 28, 2022 at 4:05 PM Gabor Somogyi <[email protected]>
>> wrote:
>>
>> > Thanks for making the design better! Nothing further to discuss from my
>> > side.
>> >
>> > I've started to reflect the agreement in the FLIP doc.
>> > Since I don't have access to the wiki, I need to ask Marci to do that,
>> > which may take some time.
>> >
>> > G
>> >
>> >
>> > On Fri, Jan 28, 2022 at 3:52 PM David Morávek <[email protected]> wrote:
>> >
>> > > Hi,
>> > >
>> > > AFAIU an under registration TM is not added to the registered TMs map
>> > until
>> > > > RegistrationResponse ..
>> > > >
>> > >
>> > > I think you're right, with a careful design around threading (delegating
>> > > update broadcasts to the main thread) + synchronous initial update (that
>> > > would be nice to avoid) this should be doable.
>> > >
>> > > Not sure what you mean "we can't register the TM without providing it
>> > with
>> > > > token" but in unsecure configuration registration must happen w/o
>> > tokens.
>> > > >
>> > >
>> > > Exactly as you describe it, this was meant only for the "kerberized /
>> > > secured" cluster case, in other cases we wouldn't enforce a non-null
>> > > token in the response
>> > >
>> > > I think this is a good idea in general.
>> > > >
>> > >
>> > > +1
>> > >
>> > > If you don't have any more thoughts on the RPC / lifecycle part, can you
>> > > please reflect it in the FLIP?
>> > >
>> > > D.
>> > >
>> > > On Fri, Jan 28, 2022 at 3:16 PM Gabor Somogyi <
>> [email protected]
>> > >
>> > > wrote:
>> > >
>> > > > > - Make sure DTs issued by single DTMs are monotonically increasing
>> > (can
>> > > > be
>> > > > sorted on TM side)
>> > > >
>> > > > AFAIU a TM that is still registering is not added to the registered TMs
>> > > > map until the RegistrationResponse, which would contain the initial
>> > > > tokens, is processed. If that's true, then how is it possible to have a
>> > > > race with a DTM update, which works on the registered TMs list?
>> > > > To be more specific, "taskExecutors" is the registered map of TMs to
>> > > > which DTM can send updated tokens, but this doesn't contain the
>> > > > registering TM while the RegistrationResponse is not processed, right?
>> > > >
>> > > > Of course, if DTM can update while the RegistrationResponse is being
>> > > > processed, then some kind of sorting would be required, and in that case
>> > > > I would agree.
>> > > >
>> > > > - Scope DT updates by the RM ID and ensure that TM only accepts update
>> > > > from the current leader
>> > > >
>> > > > I initially planned it the mentioned way, so agreed.
>> > > >
>> > > > - Return initial token with the RegistrationResponse, which should make
>> > > > the RPC contract bit clearer (ensure that we can't register the TM
>> > > > without providing it with token)
>> > > >
>> > > > I think this is a good idea in general. Not sure what you mean by "we
>> > > > can't register the TM without providing it with token", but in an
>> > > > unsecure configuration registration must happen w/o tokens.
>> > > > All in all, the newly added tokens field must be somehow optional.
>> > > >
>> > > > G
>> > > >
>> > > >
>> > > > On Fri, Jan 28, 2022 at 2:22 PM David Morávek <[email protected]>
>> wrote:
>> > > >
>> > > > > We had a long discussion with Chesnay about the possible edge cases and
>> > > > > it basically boils down to the following two scenarios:
>> > > > >
>> > > > > 1) There is a possible race condition between TM registration (the
>> > > > > first DT update) and token refresh if they happen simultaneously. Then
>> > > > > the registration might beat the refreshed token. This could be easily
>> > > > > addressed if DTs could be sorted (e.g. by the expiration time) on the TM
>> > > > > side. In other words, if there are multiple updates at the same time we
>> > > > > need to make sure that we have a deterministic way of choosing the
>> > > > > latest one.
>> > > > >
>> > > > > One idea by Chesnay that popped up during this discussion was whether
>> > > > > we could simply return the initial token with the RegistrationResponse
>> > > > > to avoid making an extra call during the TM registration.
>> > > > >
>> > > > > 2) When the RM leadership changes (e.g. because the zookeeper session
>> > > > > times out) there might be a race condition where the old RM is shutting
>> > > > > down and updates the tokens, which might again beat the registration
>> > > > > token of the new RM. This could be avoided if we scope the token by
>> > > > > _ResourceManagerId_ and only accept updates for the current leader
>> > > > > (basically we'd have an extra parameter to the _updateDelegationToken_
>> > > > > method).
>> > > > >
>> > > > > -
>> > > > >
>> > > > > DTM is way simpler than, for example, slot management, which could
>> > > > > receive updates from the JobMaster that RM might not know about.
>> > > > >
>> > > > > So if you want to go down the path you're describing, it should be
>> > > > > doable and we'd propose the following to cover all cases:
>> > > > >
>> > > > > - Make sure DTs issued by single DTMs are monotonically increasing
>> > > > > (can be sorted on TM side)
>> > > > > - Scope DT updates by the RM ID and ensure that TM only accepts updates
>> > > > > from the current leader
>> > > > > - Return initial token with the RegistrationResponse, which should make
>> > > > > the RPC contract a bit clearer (ensure that we can't register the TM
>> > > > > without providing it with a token)
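
A minimal Java sketch of how the three safeguards listed above could fit together on the TM side. Everything here is an assumption for illustration: the class `TokenUpdateGuard`, its methods, the use of a `UUID` as the RM ID and a plain `long` sequence number are hypothetical and are not taken from Flink or from the FLIP.

```java
import java.util.Objects;
import java.util.UUID;

/** Hypothetical TM-side holder for delegation tokens (illustrative names only). */
final class TokenUpdateGuard {
    private UUID currentLeaderId;   // RM the TM is currently registered with
    private long lastSequence = -1; // highest token sequence number applied so far
    private byte[] tokens;          // serialized tokens; may stay null when unsecured

    /** Registration response processed: remember the leader and the initial tokens. */
    synchronized void onRegistered(UUID leaderId, long sequence, byte[] initialTokens) {
        currentLeaderId = Objects.requireNonNull(leaderId);
        lastSequence = sequence;
        tokens = initialTokens; // null is allowed for the non-secured case
    }

    /** Returns true if the update was applied, false if it was rejected. */
    synchronized boolean onUpdate(UUID leaderId, long sequence, byte[] newTokens) {
        if (!leaderId.equals(currentLeaderId)) {
            return false; // sender is not the current leader (e.g. an RM shutting down)
        }
        if (sequence <= lastSequence) {
            return false; // not newer than what is already held
        }
        lastSequence = sequence;
        tokens = newTokens;
        return true;
    }

    synchronized long lastSequence() {
        return lastSequence;
    }
}
```

With this shape, an update from an old leader is dropped regardless of its sequence number, and a duplicate or delayed update from the current leader is dropped because its sequence number is not higher than the stored one.
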
>> > > > >
>> > > > > Any thoughts?
>> > > > >
>> > > > >
>> > > > > On Fri, Jan 28, 2022 at 10:53 AM Gabor Somogyi <
>> > > > [email protected]>
>> > > > > wrote:
>> > > > >
>> > > > > > Thanks for investing your time!
>> > > > > >
>> > > > > > The first 2 bullet points are clear.
>> > > > > > If there is a chance that a TM can go to an inconsistent state then I
>> > > > > > agree with the 3rd bullet point.
>> > > > > > Just before we agree on that I would like to learn something new and
>> > > > > > understand how it is possible that a TM gets corrupted. (In Spark I've
>> > > > > > never seen such a thing, and there is no mechanism to fix it, but Flink
>> > > > > > is definitely not Spark.)
>> > > > > >
>> > > > > > Here is my understanding:
>> > > > > > * DTM pushes newly obtained DTs to TMs and if any exception occurs then
>> > > > > > a retry after "security.kerberos.tokens.retry-wait" happens. This means
>> > > > > > DTM keeps retrying as long as it's not possible to send new DTs to all
>> > > > > > registered TMs.
>> > > > > > * New TM registration must fail if "updateDelegationToken" fails
>> > > > > > * "updateDelegationToken" fails consistently like a DB (at least I plan
>> > > > > > to implement it that way).
>> > > > > > If DTs are arriving on the TM side then a single
>> > > > > > "UserGroupInformation.getCurrentUser.addCredentials"
>> > > > > > will be called, which I've never seen fail.
>> > > > > > * I hope all other code parts are not touching existing DTs within the
>> > > > > > JVM
>> > > > > >
>> > > > > > I would like to emphasize that I'm not against adding it, I just want
>> > > > > > to see what kind of problems we are facing.
>> > > > > > It would make it easier to catch bugs earlier and help with maintenance.
>> > > > > >
>> > > > > > All in all I would buy the idea to add the 3rd bullet if we foresee the
>> > > > > > need.
>> > > > > >
>> > > > > > G
>> > > > > >
>> > > > > >
>> > > > > > On Fri, Jan 28, 2022 at 10:07 AM David Morávek <[email protected]
>> >
>> > > > wrote:
>> > > > > >
>> > > > > > > Hi Gabor,
>> > > > > > >
>> > > > > > > This is definitely headed in the right direction +1.
>> > > > > > >
>> > > > > > > I think we still need to have a safeguard in case some of the TMs get
>> > > > > > > into an inconsistent state though, which will also eliminate the need
>> > > > > > > for implementing a custom retry mechanism (when the
>> > > > > > > _updateDelegationToken_ call fails for some reason).
>> > > > > > >
>> > > > > > > We already have this safeguard in place for the slot pool (in case
>> > > > > > > there are some slots in an inconsistent state - e.g. we haven't freed
>> > > > > > > them for some reason) and for the partition tracker, which could be
>> > > > > > > simply enhanced. This is done via a periodic heartbeat from
>> > > > > > > TaskManagers to the ResourceManager that contains a report about the
>> > > > > > > state of these two components (from the TM perspective) so the RM can
>> > > > > > > reconcile their state if necessary.
>> > > > > > >
>> > > > > > > I don't think adding an additional field to
>> > > > > > > _TaskExecutorHeartbeatPayload_ should be a concern as we only heartbeat
>> > > > > > > every ~ 10s by default and the new field would be small compared to the
>> > > > > > > rest of the existing payload. Also the heartbeat doesn't need to
>> > > > > > > contain the whole DT, but just some identifier which signals whether it
>> > > > > > > uses the right one, which could be significantly smaller.
>> > > > > > >
>> > > > > > > This is still a PUSH based approach as the RM would again call the
>> > > > > > > newly introduced _updateDelegationToken_ when it encounters an
>> > > > > > > inconsistency (e.g. due to a temporary network partition / a race
>> > > > > > > condition we didn't test for / some other scenario we didn't think
>> > > > > > > about). In practice these inconsistencies are super hard to avoid and
>> > > > > > > reason about (and unfortunately yes, we see them happen from time to
>> > > > > > > time), so reusing the existing mechanism that is designed for this
>> > > > > > > exact problem simplifies things.
>> > > > > > >
>> > > > > > > To sum this up we'd have three code paths for calling
>> > > > > > > _updateDelegationToken_:
>> > > > > > > 1) When the TM registers, we push the token (if DTM already has it) to
>> > > > > > > it
>> > > > > > > 2) When DTM obtains a new token it broadcasts it to all currently
>> > > > > > > connected TMs
>> > > > > > > 3) When a TM gets out of sync, DTM would reconcile its state
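
The three code paths above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: `TokenReconciler`, `TokenSink`, the string TM IDs and the numeric token identifier carried in the heartbeat are all hypothetical names, and the push is modelled as a synchronous callback rather than a real RPC.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical RM-side reconciler; none of these names are Flink APIs. */
final class TokenReconciler {
    /** Stand-in for the push RPC towards a TM. */
    interface TokenSink {
        void updateDelegationToken(String tmId, long tokenId);
    }

    private final TokenSink sink;
    private final Set<String> registeredTms = new LinkedHashSet<>();
    private long currentTokenId = -1; // -1 means the DTM has no token yet

    TokenReconciler(TokenSink sink) {
        this.sink = sink;
    }

    /** Path 1: a TM registers, push the token if one is already available. */
    void onTaskManagerRegistered(String tmId) {
        registeredTms.add(tmId);
        if (currentTokenId >= 0) {
            sink.updateDelegationToken(tmId, currentTokenId);
        }
    }

    /** Path 2: a new token was obtained, broadcast it to all registered TMs. */
    void onNewToken(long tokenId) {
        currentTokenId = tokenId;
        for (String tmId : registeredTms) {
            sink.updateDelegationToken(tmId, tokenId);
        }
    }

    /** Path 3: the heartbeat reports a token identifier; re-push on mismatch. */
    void onHeartbeat(String tmId, long reportedTokenId) {
        if (!registeredTms.contains(tmId) || currentTokenId < 0) {
            return; // unknown TM or nothing to reconcile against
        }
        if (reportedTokenId != currentTokenId) {
            sink.updateDelegationToken(tmId, currentTokenId);
        }
    }
}
```

The heartbeat only carries the small identifier, so the full token travels again only when a mismatch is detected.
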
>> > > > > > >
>> > > > > > > WDYT?
>> > > > > > >
>> > > > > > > Best,
>> > > > > > > D.
>> > > > > > >
>> > > > > > >
>> > > > > > > On Wed, Jan 26, 2022 at 9:03 PM David Morávek <
>> [email protected]>
>> > > > wrote:
>> > > > > > >
>> > > > > > > > Thanks for the update, I'll go over it tomorrow.
>> > > > > > > >
>> > > > > > > > On Wed, Jan 26, 2022 at 5:33 PM Gabor Somogyi <
>> > > > > > [email protected]
>> > > > > > > >
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > >> Hi All,
>> > > > > > > >>
>> > > > > > > >> Since it has turned out that DTM can't be added as a member of JobMaster
>> > > > > > > >> <https://github.com/gaborgsomogyi/flink/blob/8ab75e46013f159778ccfce52463e7bc63e395a9/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java#L176>,
>> > > > > > > >> I've come up with a better proposal.
>> > > > > > > >> David, thanks for pointing this out, you've caught a bug in the early
>> > > > > > > >> phase!
>> > > > > > > >>
>> > > > > > > >> Namely, ResourceManager
>> > > > > > > >> <https://github.com/apache/flink/blob/674bc96662285b25e395fd3dddf9291a602fc183/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java#L124>
>> > > > > > > >> is a single instance class where DTM can be added as a member variable.
>> > > > > > > >> It has a list of all already registered TMs, and new TM registration
>> > > > > > > >> also happens here.
>> > > > > > > >> To be more specific, the following can be added from a logic
>> > > > > > > >> perspective:
>> > > > > > > >> * Create a new DTM instance in ResourceManager
>> > > > > > > >> <https://github.com/apache/flink/blob/674bc96662285b25e395fd3dddf9291a602fc183/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java#L124>
>> > > > > > > >> and start it (re-occurring thread to obtain new tokens)
>> > > > > > > >> * Add a new function named "updateDelegationTokens" to
>> > > > > > > >> TaskExecutorGateway
>> > > > > > > >> <https://github.com/apache/flink/blob/674bc96662285b25e395fd3dddf9291a602fc183/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskExecutorGateway.java#L54>
>> > > > > > > >> * Call "updateDelegationTokens" on all registered TMs to propagate new
>> > > > > > > >> DTs
>> > > > > > > >> * In case of new TM registration call "updateDelegationTokens" before
>> > > > > > > >> registration succeeds to set up the new TM properly
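
A rough Java sketch of the broadcast step from the list above, in which the caller learns about a failure if at least one TM did not accept the tokens. The `TokenReceiver` interface is a hypothetical stand-in for the proposed `updateDelegationTokens` method on TaskExecutorGateway; its signature and the `TokenBroadcast` helper are assumptions, not the actual Flink API.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper showing an all-or-error broadcast of serialized tokens. */
final class TokenBroadcast {
    /** Stand-in for the proposed gateway method; the signature is assumed. */
    interface TokenReceiver {
        CompletableFuture<Void> updateDelegationTokens(byte[] tokens);
    }

    /**
     * Sends the tokens to every TM. The returned future completes normally only
     * if all TMs acknowledged, and exceptionally if at least one of them failed.
     */
    static CompletableFuture<Void> broadcast(
            List<? extends TokenReceiver> taskManagers, byte[] tokens) {
        CompletableFuture<?>[] acks = taskManagers.stream()
                .map(tm -> tm.updateDelegationTokens(tokens))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(acks);
    }
}
```

A caller could attach a retry to the exceptional completion of the returned future, which is one way to get the "reschedule after retry-wait" behavior discussed in this thread.
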
>> > > > > > > >>
>> > > > > > > >> This way:
>> > > > > > > >> * only a single DTM would live within a cluster, which is the expected
>> > > > > > > >> behavior
>> > > > > > > >> * DTM is going to be added to a central place where all deployment
>> > > > > > > >> targets can make use of it
>> > > > > > > >> * DTs are going to be pushed to TMs, which would generate less network
>> > > > > > > >> traffic than a pull based approach
>> > > > > > > >> (please see my previous mail where I've described both approaches)
>> > > > > > > >> * HA scenario is going to be consistent because such
>> > > > > > > >> <https://github.com/apache/flink/blob/674bc96662285b25e395fd3dddf9291a602fc183/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskExecutor.java#L1069>
>> > > > > > > >> a solution can be added to "updateDelegationTokens"
>> > > > > > > >>
>> > > > > > > >> @David or all others, please share whether you agree with this or
>> > > > > > > >> have a better idea/suggestion.
>> > > > > > > >>
>> > > > > > > >> BR,
>> > > > > > > >> G
>> > > > > > > >>
>> > > > > > > >>
>> > > > > > > >> On Tue, Jan 25, 2022 at 11:00 AM Gabor Somogyi <
>> > > > > > > [email protected]
>> > > > > > > >> >
>> > > > > > > >> wrote:
>> > > > > > > >>
>> > > > > > > >> > First of all thanks for investing your time and helping
>> me
>> > > out.
>> > > > > As I
>> > > > > > > see
>> > > > > > > >> > you have pretty solid knowledge in the RPC area.
>> > > > > > > >> > I would like to rely on your knowledge since I'm learning
>> > this
>> > > > > part.
>> > > > > > > >> >
>> > > > > > > >> > > - Do we need to introduce a new RPC method or can we
>> for
>> > > > example
>> > > > > > > >> > piggyback
>> > > > > > > >> > on heartbeats?
>> > > > > > > >> >
>> > > > > > > >> > I'm fine with either solution but one thing is important conceptually.
>> > > > > > > >> > There are fundamentally 2 ways how tokens can be updated:
>> > > > > > > >> > - Push way: When there are new DTs then JM JVM pushes DTs to TM JVMs.
>> > > > > > > >> > This is the preferred one since only a tiny amount of control logic is
>> > > > > > > >> > needed.
>> > > > > > > >> > - Pull way: Each TM polls the JM from time to time whether there are
>> > > > > > > >> > new tokens, and each TM decides alone whether DTs need to be updated
>> > > > > > > >> > or not.
>> > > > > > > >> > As you've mentioned here some ID needs to be generated, and it would
>> > > > > > > >> > generate quite some additional network traffic which can definitely be
>> > > > > > > >> > avoided.
>> > > > > > > >> > As a final thought, in Spark we've had this way of DT propagation
>> > > > > > > >> > logic and we've had major issues with it.
>> > > > > > > >> >
>> > > > > > > >> > So all in all DTM needs to obtain new tokens and there must be a way
>> > > > > > > >> > to send this data to all TMs from JM.
>> > > > > > > >> >
>> > > > > > > >> > > - What delivery semantics are we looking for? (what if
>> > we're
>> > > > > only
>> > > > > > > >> able to
>> > > > > > > >> > update subset of TMs / what happens if we exhaust
>> retries /
>> > > > should
>> > > > > > we
>> > > > > > > >> even
>> > > > > > > >> > have the retry mechanism whatsoever) - I have a feeling
>> that
>> > > > > somehow
>> > > > > > > >> > leveraging the existing heartbeat mechanism could help to
>> > > answer
>> > > > > > these
>> > > > > > > >> > questions
>> > > > > > > >> >
>> > > > > > > >> > Let's go through these questions one by one.
>> > > > > > > >> > > What delivery semantics are we looking for?
>> > > > > > > >> >
>> > > > > > > >> > DTM must receive an exception when at least one TM was
>> not
>> > > able
>> > > > to
>> > > > > > get
>> > > > > > > >> DTs.
>> > > > > > > >> >
>> > > > > > > >> > > what if we're only able to update subset of TMs?
>> > > > > > > >> >
>> > > > > > > >> > In such a case DTM will reschedule obtaining tokens after the
>> > > > > > > >> > "security.kerberos.tokens.retry-wait" time.
>> > > > > > > >> >
>> > > > > > > >> > > what happens if we exhaust retries?
>> > > > > > > >> >
>> > > > > > > >> > There is no number of retries. In the default configuration tokens
>> > > > > > > >> > need to be re-obtained after one day.
>> > > > > > > >> > DTM tries to obtain new tokens after 1 day * 0.75
>> > > > > > > >> > (security.kerberos.tokens.renewal-ratio) = 18 hours.
>> > > > > > > >> > When it fails it retries after "security.kerberos.tokens.retry-wait",
>> > > > > > > >> > which is 1 hour by default.
>> > > > > > > >> > If it never succeeds then an authentication error is going to happen
>> > > > > > > >> > on the TM side and the workload is going to stop.
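
The timing described above can be written out as a small Java calculation. The numbers (one-day token lifetime, 0.75 renewal ratio, one-hour retry wait) are the defaults quoted in this mail; the `RenewalSchedule` class and `renewalDelay` method are hypothetical names used only for this sketch.

```java
import java.time.Duration;

/** Hypothetical helper that reproduces the renewal arithmetic from the mail. */
final class RenewalSchedule {
    /** Delay before the next obtain attempt: token lifetime scaled by the ratio. */
    static Duration renewalDelay(Duration tokenLifetime, double renewalRatio) {
        return Duration.ofMillis((long) (tokenLifetime.toMillis() * renewalRatio));
    }

    public static void main(String[] args) {
        Duration delay = renewalDelay(Duration.ofDays(1), 0.75); // 18 hours
        Duration retryWait = Duration.ofHours(1);                // retry-wait default
        System.out.println("next obtain in " + delay.toHours()
                + "h, retry every " + retryWait.toHours() + "h on failure");
    }
}
```
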
>> > > > > > > >> >
>> > > > > > > >> > > should we even have the retry mechanism whatsoever?
>> > > > > > > >> >
>> > > > > > > >> > Yes, because there are always temporary cluster issues.
>> > > > > > > >> >
>> > > > > > > >> > > What does it mean for the running application (how does
>> > this
>> > > > > look
>> > > > > > > like
>> > > > > > > >> > from
>> > > > > > > >> > the user perspective)? As far as I remember the logs are
>> > only
>> > > > > > > collected
>> > > > > > > >> > ("aggregated") after the container is stopped, is that
>> > > correct?
>> > > > > > > >> >
>> > > > > > > >> > With default config it works like that but it can be forced to
>> > > > > > > >> > aggregate at specific intervals.
>> > > > > > > >> > A useful feature is forcing YARN to aggregate logs while the job is
>> > > > > > > >> > still running.
>> > > > > > > >> > For long-running jobs such as streaming jobs, this is invaluable. To
>> > > > > > > >> > do this,
>> > > > > > > >> > yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds
>> > > > > > > >> > must be set to a non-negative value.
>> > > > > > > >> > When this is set, a timer will be set for the given duration, and
>> > > > > > > >> > whenever that timer goes off, log aggregation will run on new files.
>> > > > > > > >> >
>> > > > > > > >> > > I think
>> > > > > > > >> > this topic should get its own section in the FLIP (having
>> > some
>> > > > > cross
>> > > > > > > >> > reference to YARN ticket would be really useful, but I'm
>> not
>> > > > sure
>> > > > > if
>> > > > > > > >> there
>> > > > > > > >> > are any).
>> > > > > > > >> >
>> > > > > > > >> > I think this is important knowledge but this FLIP is not touching the
>> > > > > > > >> > already existing behavior.
>> > > > > > > >> > DTs set on the AM container are renewed by YARN until that is no
>> > > > > > > >> > longer possible.
>> > > > > > > >> > Any kind of new code is not going to change this limitation. BTW,
>> > > > > > > >> > there is no jira for this.
>> > > > > > > >> > If you think it's worth writing this down then I think the right
>> > > > > > > >> > place is the official security doc area, as a caveat.
>> > > > > > > >> >
>> > > > > > > >> > > If we split the FLIP into two parts / sections that
>> I've
>> > > > > > suggested,
>> > > > > > > I
>> > > > > > > >> > don't
>> > > > > > > >> > really think that you need to explicitly test for each
>> > > > deployment
>> > > > > > > >> scenario
>> > > > > > > >> > / cluster framework, because the DTM part is completely
>> > > > > independent
>> > > > > > of
>> > > > > > > >> the
>> > > > > > > >> > deployment target. Basically this is what I'm aiming for
>> > with
>> > > > > > "making
>> > > > > > > it
>> > > > > > > >> > work with the standalone" (as simple as starting a new
>> java
>> > > > > process)
>> > > > > > > >> Flink
>> > > > > > > >> > first (which is also how most people deploy streaming
>> > > > application
>> > > > > on
>> > > > > > > k8s
>> > > > > > > >> > and the direction we're pushing forward with the
>> > auto-scaling
>> > > /
>> > > > > > > reactive
>> > > > > > > >> > mode initiatives).
>> > > > > > > >> >
>> > > > > > > >> > I see your point and agree with the main direction. k8s is the
>> > > > > > > >> > megatrend which most people will use sooner or later. Not 100% sure
>> > > > > > > >> > what kind of split you suggest, but in my view the main target is to
>> > > > > > > >> > add this feature and I'm open to any logical work ordering.
>> > > > > > > >> > Please share the specific details and we'll work it out...
>> > > > > > > >> >
>> > > > > > > >> > G
>> > > > > > > >> >
>> > > > > > > >> >
>> > > > > > > >> > On Mon, Jan 24, 2022 at 3:04 PM David Morávek <
>> > > [email protected]>
>> > > > > > > wrote:
>> > > > > > > >> >
>> > > > > > > >> >> >
>> > > > > > > >> >> > Could you point to a code where you think it could be
>> > added
>> > > > > > > exactly?
>> > > > > > > >> A
>> > > > > > > >> >> > helping hand is welcome here 🙂
>> > > > > > > >> >> >
>> > > > > > > >> >>
>> > > > > > > >> >> I think you can take a look at _ResourceManagerPartitionTracker_ [1],
>> > > > > > > >> >> which seems to have somewhat similar properties to the DTM.
>> > > > > > > >> >>
>> > > > > > > >> >> One topic that needs to be addressed there is what the RPC with the
>> > > > > > > >> >> _TaskExecutorGateway_ should look like.
>> > > > > > > >> >> - Do we need to introduce a new RPC method or can we for example
>> > > > > > > >> >> piggyback on heartbeats?
>> > > > > > > >> >> - What delivery semantics are we looking for? (what if we're only able
>> > > > > > > >> >> to update a subset of TMs / what happens if we exhaust retries / should
>> > > > > > > >> >> we even have the retry mechanism whatsoever) - I have a feeling that
>> > > > > > > >> >> somehow leveraging the existing heartbeat mechanism could help to
>> > > > > > > >> >> answer these questions
>> > > > > > > >> >>
>> > > > > > > >> >> In short, after DT reaches it's max lifetime then log
>> > > > aggregation
>> > > > > > > stops
>> > > > > > > >> >> >
>> > > > > > > >> >>
>> > > > > > > >> >> What does it mean for the running application (what does this look
>> > > > > > > >> >> like from the user perspective)? As far as I remember the logs are only
>> > > > > > > >> >> collected ("aggregated") after the container is stopped, is that
>> > > > > > > >> >> correct? I think this topic should get its own section in the FLIP
>> > > > > > > >> >> (having some cross reference to a YARN ticket would be really useful,
>> > > > > > > >> >> but I'm not sure if there are any).
>> > > > > > > >> >>
>> > > > > > > >> >> All deployment modes (per-job, per-app, ...) are
>> planned to
>> > > be
>> > > > > > tested
>> > > > > > > >> and
>> > > > > > > >> >> > expect to work with the initial implementation however
>> > not
>> > > > all
>> > > > > > > >> >> deployment
>> > > > > > > >> >> > targets (k8s, local, ...
>> > > > > > > >> >> >
>> > > > > > > >> >>
>> > > > > > > >> >> If we split the FLIP into the two parts / sections that I've suggested,
>> > > > > > > >> >> I don't really think that you need to explicitly test for each
>> > > > > > > >> >> deployment scenario / cluster framework, because the DTM part is
>> > > > > > > >> >> completely independent of the deployment target. Basically this is what
>> > > > > > > >> >> I'm aiming for with "making it work with the standalone" (as simple as
>> > > > > > > >> >> starting a new java process) Flink first (which is also how most people
>> > > > > > > >> >> deploy streaming applications on k8s and the direction we're pushing
>> > > > > > > >> >> forward with the auto-scaling / reactive mode initiatives).
>> > > > > > > >> >>
>> > > > > > > >> >> The whole integration with YARN (let's forget about log aggregation for
>> > > > > > > >> >> a moment) / k8s-native only boils down to how we make the keytab file
>> > > > > > > >> >> local to the JobManager so the DTM can read it, so it's basically built
>> > > > > > > >> >> on top of that. The only special thing that needs to be tested there is
>> > > > > > > >> >> the "keytab distribution" code path.
>> > > > > > > >> >>
>> > > > > > > >> >> [1]
>> > > > > > > >> >> https://github.com/apache/flink/blob/release-1.14.3/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/ResourceManagerPartitionTracker.java
>> > > > > > > >> >>
>> > > > > > > >> >> Best,
>> > > > > > > >> >> D.
>> > > > > > > >> >>
>> > > > > > > >> >> On Mon, Jan 24, 2022 at 12:35 PM Gabor Somogyi <
>> > > > > > > >> [email protected]
>> > > > > > > >> >> >
>> > > > > > > >> >> wrote:
>> > > > > > > >> >>
>> > > > > > > >> >> > > There is a separate JobMaster for each job
>> > > > > > > >> >> > within a Flink cluster and each JobMaster only has a
>> > > partial
>> > > > > view
>> > > > > > > of
>> > > > > > > >> the
>> > > > > > > >> >> > task managers
>> > > > > > > >> >> >
>> > > > > > > >> >> > Good point! I've had a deeper look and you're right.
>> We
>> > > > > > definitely
>> > > > > > > >> need
>> > > > > > > >> >> to
>> > > > > > > >> >> > find another place.
>> > > > > > > >> >> >
>> > > > > > > >> >> > > Related per-cluster or per-job keytab:
>> > > > > > > >> >> >
>> > > > > > > >> >> > In the current code per-cluster keytab is implemented and I intend to
>> > > > > > > >> >> > keep it like this within this FLIP. The reason is simple: tokens on
>> > > > > > > >> >> > the TM side can be stored within the UserGroupInformation (UGI)
>> > > > > > > >> >> > structure, which is global. I'm not saying it's impossible to change
>> > > > > > > >> >> > that, but I think this is a level of complexity which the initial
>> > > > > > > >> >> > implementation is not required to contain. Additionally we've not
>> > > > > > > >> >> > seen such a need from the user side. If the need arises later on,
>> > > > > > > >> >> > then another FLIP on this topic can be created and discussed. Proper
>> > > > > > > >> >> > multi-UGI handling within a single JVM is a topic where several
>> > > > > > > >> >> > rounds of deep dives with the Hadoop/YARN guys are required.
>> > > > > > > >> >> >
>> > > > > > > >> >> > > single DTM instance embedded with
>> > > > > > > >> >> > the ResourceManager (the Flink component)
>> > > > > > > >> >> >
>> > > > > > > >> >> > Could you point to a code where you think it could be
>> > added
>> > > > > > > exactly?
>> > > > > > > >> A
>> > > > > > > >> >> > helping hand is welcome here🙂
>> > > > > > > >> >> >
>> > > > > > > >> >> > > Then the single (initial) implementation should work
>> > with
>> > > > all
>> > > > > > the
>> > > > > > > >> >> > deployments modes out of the box (which is not what
>> the
>> > > FLIP
>> > > > > > > >> suggests).
>> > > > > > > >> >> Is
>> > > > > > > >> >> > that correct?
>> > > > > > > >> >> >
>> > > > > > > >> >> > All deployment modes (per-job, per-app, ...) are planned to be tested
>> > > > > > > >> >> > and are expected to work with the initial implementation; however,
>> > > > > > > >> >> > not all deployment targets (k8s, local, ...) are intended to be
>> > > > > > > >> >> > tested. Per deployment target a new jira needs to be created, where I
>> > > > > > > >> >> > expect that a small amount of code needs to be added and a relatively
>> > > > > > > >> >> > expensive testing effort is required.
>> > > > > > > >> >> >
>> > > > > > > > >> >> > > I've taken a look into the prototype and in the "YarnClusterDescriptor"
>> > > > > > > > >> >> > you're injecting a delegation token into the AM [1] (that's obtained
>> > > > > > > > >> >> > using the provided keytab). If I understand this correctly from previous
>> > > > > > > > >> >> > discussion / FLIP, this is to support log aggregation and DT has a
>> > > > > > > > >> >> > limited validity. How is this DT going to be renewed?
>> > > > > > > >> >> >
>> > > > > > > > >> >> > You're clever and touched on a limitation which Spark has too. In
>> > > > > > > > >> >> > short, after the DT reaches its max lifetime, log aggregation stops.
>> > > > > > > > >> >> > I had several deep-dive rounds with the YARN guys during my Spark years
>> > > > > > > > >> >> > because I wanted to fill this gap. They can't provide us any way to
>> > > > > > > > >> >> > re-inject the newly obtained DT, so in the end I gave up on this.
>> > > > > > > >> >> >
>> > > > > > > >> >> > BR,
>> > > > > > > >> >> > G
>> > > > > > > >> >> >
>> > > > > > > >> >> >
>> > > > > > > > >> >> > On Mon, 24 Jan 2022, 11:00 David Morávek, <[email protected]> wrote:
>> > > > > > > >> >> >
>> > > > > > > >> >> > > Hi Gabor,
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > There is actually a huge difference between JobManager (process) and
>> > > > > > > > >> >> > > JobMaster (job coordinator). The naming is unfortunately a bit
>> > > > > > > > >> >> > > misleading here for historical reasons. There is a separate JobMaster
>> > > > > > > > >> >> > > for each job within a Flink cluster and each JobMaster only has a
>> > > > > > > > >> >> > > partial view of the task managers (depending on where the slots for a
>> > > > > > > > >> >> > > particular job are allocated). This means that you'll end up with N
>> > > > > > > > >> >> > > "DelegationTokenManagers" competing with each other (N = number of
>> > > > > > > > >> >> > > running jobs in the cluster).
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > This makes me think we're mixing two abstraction levels here:
>> > > > > > > > >> >> > >
>> > > > > > > > >> >> > > a) Per-cluster delegation tokens
>> > > > > > > > >> >> > > - Simpler approach, it would involve a single DTM instance embedded
>> > > > > > > > >> >> > > with the ResourceManager (the Flink component)
>> > > > > > > > >> >> > > b) Per-job delegation tokens
>> > > > > > > > >> >> > > - More complex approach, but could be more flexible from the user side
>> > > > > > > > >> >> > > of things.
>> > > > > > > > >> >> > > - Multiple DTM instances that are bound to the JobMaster lifecycle.
>> > > > > > > > >> >> > > Delegation tokens are attached to the particular slots that are
>> > > > > > > > >> >> > > executing the job tasks instead of the whole task manager (a TM could
>> > > > > > > > >> >> > > be executing multiple jobs with different tokens).
>> > > > > > > > >> >> > > - The question is which keytab should be used for the clustering
>> > > > > > > > >> >> > > framework, to support log aggregation on YARN (an extra keytab, or the
>> > > > > > > > >> >> > > keytab that comes with the first job?)
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > I think these are the things that need to be clarified in the FLIP
>> > > > > > > > >> >> > > before proceeding.
>> > > > > > > > >> >> > >
>> > > > > > > > >> >> > > A follow-up question for getting a better understanding of where this
>> > > > > > > > >> >> > > should be headed: Are there any use cases where a user may want to use
>> > > > > > > > >> >> > > different keytabs with each job, or are we fine with using a
>> > > > > > > > >> >> > > cluster-wide keytab? If we go with per-cluster keytabs, is it OK that
>> > > > > > > > >> >> > > all jobs submitted into this cluster can access it (even the future
>> > > > > > > > >> >> > > ones)? Should this be a security concern?
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > Presume you thought I would implement a new class with JobManager
>> > > > > > > > >> >> > > > name. The plan is not that.
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > >
>> > > > > > > >> >> > > I've never suggested such thing.
>> > > > > > > >> >> > >
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > > No. As said earlier, DT handling is planned to be done completely in
>> > > > > > > > >> >> > > > Flink. DTM has a renewal thread which re-obtains tokens at the proper
>> > > > > > > > >> >> > > > time when needed.
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > Then the single (initial) implementation should work with all the
>> > > > > > > > >> >> > > deployment modes out of the box (which is not what the FLIP suggests).
>> > > > > > > > >> >> > > Is that correct?
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > If the cluster framework also requires a delegation token for its
>> > > > > > > > >> >> > > inner workings (this IMO only applies to YARN), it might need an extra
>> > > > > > > > >> >> > > step (injecting the token into the application master container).
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > Separating the individual layers (actual Flink cluster - basically
>> > > > > > > > >> >> > > making this work with a standalone deployment / "cluster framework" -
>> > > > > > > > >> >> > > support for YARN log aggregation) in the FLIP would be useful.
>> > > > > > > >> >> > >
>> > > > > > > >> >> > > Reading the linked Spark readme could be useful.
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > I've read that, but please be patient with the questions. Kerberos is
>> > > > > > > > >> >> > > not an easy topic to get into and I've had very little contact with it
>> > > > > > > > >> >> > > in the past.
>> > > > > > > >> >> > >
>> > > > > > > >> >> > >
>> > > > > > > >> >> > >
>> > > > > > > >> >> >
>> > > > > > > >> >>
>> > > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://github.com/gaborgsomogyi/flink/blob/8ab75e46013f159778ccfce52463e7bc63e395a9/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java#L176
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > I've taken a look into the prototype and in the "YarnClusterDescriptor"
>> > > > > > > > >> >> > > you're injecting a delegation token into the AM [1] (that's obtained
>> > > > > > > > >> >> > > using the provided keytab). If I understand this correctly from
>> > > > > > > > >> >> > > previous discussion / FLIP, this is to support log aggregation and DT
>> > > > > > > > >> >> > > has a limited validity. How is this DT going to be renewed?
>> > > > > > > >> >> > >
>> > > > > > > >> >> > > [1]
>> > > > > > > >> >> > >
>> > > > > > > >> >> > >
>> > > > > > > >> >> >
>> > > > > > > >> >>
>> > > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://github.com/gaborgsomogyi/flink/commit/8ab75e46013f159778ccfce52463e7bc63e395a9#diff-02416e2d6ca99e1456f9c3949f3d7c2ac523d3fe25378620c09632e4aac34e4eR1261
>> > > > > > > >> >> > >
>> > > > > > > >> >> > > Best,
>> > > > > > > >> >> > > D.
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > On Fri, Jan 21, 2022 at 9:35 PM Gabor Somogyi <[email protected]> wrote:
>> > > > > > > >> >> > >
>> > > > > > > > >> >> > > > Here is the exact class, I'm on mobile so I have not had a look at
>> > > > > > > > >> >> > > > the exact class name:
>> > > > > > > > >> >> > > >
>> > > > > > > > >> >> > > > https://github.com/gaborgsomogyi/flink/blob/8ab75e46013f159778ccfce52463e7bc63e395a9/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java#L176
>> > > > > > > > >> >> > > > That keeps track of the TMs that the tokens can be sent to.
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > > > > My feeling would be that we shouldn't really
>> > > introduce
>> > > > a
>> > > > > > new
>> > > > > > > >> >> > component
>> > > > > > > >> >> > > > with
>> > > > > > > >> >> > > > a custom lifecycle, but rather we should try to
>> > > > incorporate
>> > > > > > > this
>> > > > > > > >> >> into
>> > > > > > > >> >> > > > existing ones.
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > > > Can you be more specific? Presume you though I
>> would
>> > > > > > implement
>> > > > > > > a
>> > > > > > > >> new
>> > > > > > > >> >> > > class
>> > > > > > > >> >> > > > with JobManager name. The plan is not that.
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > > > > If I understand this correctly, this means that
>> we
>> > > then
>> > > > > > push
>> > > > > > > >> the
>> > > > > > > >> >> > token
>> > > > > > > >> >> > > > renewal logic to YARN.
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > > > No. That said earlier DT handling is planned to be
>> > done
>> > > > > > > >> completely
>> > > > > > > >> >> in
>> > > > > > > >> >> > > > Flink. DTM has a renewal thread which re-obtains
>> > tokens
>> > > > in
>> > > > > > the
>> > > > > > > >> >> proper
>> > > > > > > >> >> > > time
>> > > > > > > >> >> > > > when needed. YARN log aggregation is a totally
>> > > different
>> > > > > > > feature,
>> > > > > > > >> >> where
>> > > > > > > >> >> > > > YARN does the renewal. Log aggregation was an
>> example
>> > > why
>> > > > > the
>> > > > > > > >> code
>> > > > > > > >> >> > can't
>> > > > > > > >> >> > > be
>> > > > > > > >> >> > > > 100% reusable for all resource managers. Reading
>> the
>> > > > linked
>> > > > > > > Spark
>> > > > > > > >> >> > readme
>> > > > > > > >> >> > > > could be useful.
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > > > G
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > > > On Fri, 21 Jan 2022, 21:05 David Morávek, <
>> > > > [email protected]
>> > > > > >
>> > > > > > > >> wrote:
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > > > > >
>> > > > > > > >> >> > > > > > JobManager is the Flink class.
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > There is no such class in Flink. The closest
>> thing
>> > to
>> > > > the
>> > > > > > > >> >> JobManager
>> > > > > > > >> >> > > is a
>> > > > > > > >> >> > > > > ClusterEntrypoint. The cluster entrypoint spawns
>> > new
>> > > RM
>> > > > > > > Runner
>> > > > > > > >> &
>> > > > > > > >> >> > > > Dispatcher
>> > > > > > > >> >> > > > > Runner that start participating in the leader
>> > > election.
>> > > > > > Once
>> > > > > > > >> they
>> > > > > > > >> >> > gain
>> > > > > > > >> >> > > > > leadership they spawn the actual underlying
>> > instances
>> > > > of
>> > > > > > > these
>> > > > > > > >> two
>> > > > > > > >> >> > > "main
>> > > > > > > >> >> > > > > components".
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > My feeling would be that we shouldn't really
>> > > introduce
>> > > > a
>> > > > > > new
>> > > > > > > >> >> > component
>> > > > > > > >> >> > > > with
>> > > > > > > >> >> > > > > a custom lifecycle, but rather we should try to
>> > > > > incorporate
>> > > > > > > >> this
>> > > > > > > >> >> into
>> > > > > > > >> >> > > > > existing ones.
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > My biggest concerns would be:
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > - How would the lifecycle of the new component
>> look
>> > > > like
>> > > > > > with
>> > > > > > > >> >> regards
>> > > > > > > >> >> > > to
>> > > > > > > >> >> > > > HA
>> > > > > > > >> >> > > > > setups. If we really try to decide to introduce
>> a
>> > > > > > completely
>> > > > > > > >> new
>> > > > > > > >> >> > > > component,
>> > > > > > > >> >> > > > > how should this work in case of multiple
>> JobManager
>> > > > > > > instances?
>> > > > > > > >> >> > > > > - Which components does it talk to / how? For
>> > example
>> > > > how
>> > > > > > > does
>> > > > > > > >> the
>> > > > > > > >> >> > > > > broadcast of new token to task managers
>> > > > > > (TaskManagerGateway)
>> > > > > > > >> look
>> > > > > > > >> >> > like?
>> > > > > > > >> >> > > > Do
>> > > > > > > >> >> > > > > we simply introduce a new RPC on the
>> > > > > ResourceManagerGateway
>> > > > > > > >> that
>> > > > > > > >> >> > > > broadcasts
>> > > > > > > >> >> > > > > it or does the new component need to do some
>> kind
>> > of
>> > > > > > > >> bookkeeping
>> > > > > > > >> >> of
>> > > > > > > >> >> > > task
>> > > > > > > >> >> > > > > managers that it needs to notify?
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > YARN based HDFS log aggregation would not work
>> by
>> > > > > dropping
>> > > > > > > that
>> > > > > > > >> >> code.
>> > > > > > > >> >> > > > Just
>> > > > > > > >> >> > > > > > to be crystal clear, the actual implementation
>> > > > contains
>> > > > > > > this
>> > > > > > > >> fir
>> > > > > > > >> >> > > > exactly
>> > > > > > > >> >> > > > > > this reason.
>> > > > > > > >> >> > > > > >
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > This is the missing part +1. If I understand
>> this
>> > > > > > correctly,
>> > > > > > > >> this
>> > > > > > > >> >> > means
>> > > > > > > >> >> > > > > that we then push the token renewal logic to
>> YARN.
>> > > How
>> > > > do
>> > > > > > you
>> > > > > > > >> >> plan to
>> > > > > > > >> >> > > > > implement the renewal logic on k8s?
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > D.
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > On Fri, Jan 21, 2022 at 8:37 PM Gabor Somogyi <
>> > > > > > > >> >> > > [email protected]
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > wrote:
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > > > > > I think we might both mean something
>> different
>> > by
>> > > > the
>> > > > > > RM.
>> > > > > > > >> >> > > > > >
>> > > > > > > >> >> > > > > > You feel it well, I've not specified these
>> terms
>> > > well
>> > > > > in
>> > > > > > > the
>> > > > > > > >> >> > > > explanation.
>> > > > > > > >> >> > > > > > RM I meant resource management framework.
>> > > JobManager
>> > > > is
>> > > > > > the
>> > > > > > > >> >> Flink
>> > > > > > > >> >> > > > class.
>> > > > > > > >> >> > > > > > This means that inside JM instance there will
>> be
>> > a
>> > > > DTM
>> > > > > > > >> >> instance, so
>> > > > > > > >> >> > > > they
>> > > > > > > >> >> > > > > > would have the same lifecycle. Hope I've
>> answered
>> > > the
>> > > > > > > >> question.
>> > > > > > > >> >> > > > > >
>> > > > > > > >> >> > > > > > > If we have tokens available on the client
>> side,
>> > > why
>> > > > > do
>> > > > > > we
>> > > > > > > >> >> need to
>> > > > > > > >> >> > > set
>> > > > > > > >> >> > > > > > them
>> > > > > > > >> >> > > > > > into the AM (yarn specific concept) launch
>> > context?
>> > > > > > > >> >> > > > > >
>> > > > > > > >> >> > > > > > YARN based HDFS log aggregation would not
>> work by
>> > > > > > dropping
>> > > > > > > >> that
>> > > > > > > >> >> > code.
>> > > > > > > >> >> > > > > Just
>> > > > > > > >> >> > > > > > to be crystal clear, the actual implementation
>> > > > contains
>> > > > > > > this
>> > > > > > > >> fir
>> > > > > > > >> >> > > > exactly
>> > > > > > > >> >> > > > > > this reason.
>> > > > > > > >> >> > > > > >
>> > > > > > > >> >> > > > > > G
>> > > > > > > >> >> > > > > >
>> > > > > > > >> >> > > > > > On Fri, 21 Jan 2022, 20:12 David Morávek, <
>> > > > > > [email protected]
>> > > > > > > >
>> > > > > > > >> >> wrote:
>> > > > > > > >> >> > > > > >
>> > > > > > > >> >> > > > > > > Hi Gabor,
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > 1. One thing is important, token management
>> is
>> > > > > planned
>> > > > > > to
>> > > > > > > >> be
>> > > > > > > >> >> done
>> > > > > > > >> >> > > > > > > > generically within Flink and not
>> scattered in
>> > > RM
>> > > > > > > specific
>> > > > > > > >> >> code.
>> > > > > > > >> >> > > > > > > JobManager
>> > > > > > > >> >> > > > > > > > has a DelegationTokenManager which obtains
>> > > tokens
>> > > > > > > >> >> time-to-time
>> > > > > > > >> >> > > (if
>> > > > > > > >> >> > > > > > > > configured properly). JM knows which
>> > > TaskManagers
>> > > > > are
>> > > > > > > in
>> > > > > > > >> >> place
>> > > > > > > >> >> > so
>> > > > > > > >> >> > > > it
>> > > > > > > >> >> > > > > > can
>> > > > > > > >> >> > > > > > > > distribute it to all TMs. That's it
>> > basically.
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > I think we might both mean something
>> different
>> > by
>> > > > the
>> > > > > > RM.
>> > > > > > > >> >> > > JobManager
>> > > > > > > >> >> > > > is
>> > > > > > > >> >> > > > > > > basically just a process encapsulating
>> multiple
>> > > > > > > components,
>> > > > > > > >> >> one
>> > > > > > > >> >> > of
>> > > > > > > >> >> > > > > which
>> > > > > > > >> >> > > > > > is
>> > > > > > > >> >> > > > > > > a ResourceManager, which is the component
>> that
>> > > > > manages
>> > > > > > > task
>> > > > > > > >> >> > manager
>> > > > > > > >> >> > > > > > > registrations [1]. There is more or less a
>> > single
>> > > > > > > >> >> implementation
>> > > > > > > >> >> > of
>> > > > > > > >> >> > > > the
>> > > > > > > >> >> > > > > > RM
>> > > > > > > >> >> > > > > > > with plugable drivers for the active
>> > integrations
>> > > > > > (yarn,
>> > > > > > > >> k8s).
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > It would be great if you could share more
>> > details
>> > > > of
>> > > > > > how
>> > > > > > > >> >> exactly
>> > > > > > > >> >> > > the
>> > > > > > > >> >> > > > > DTM
>> > > > > > > >> >> > > > > > is
>> > > > > > > >> >> > > > > > > going to fit in the current JM architecture.
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > 2. 99.9% of the code is generic but each RM
>> > > handles
>> > > > > > > tokens
>> > > > > > > >> >> > > > > differently. A
>> > > > > > > >> >> > > > > > > > good example is YARN obtains tokens on
>> client
>> > > > side
>> > > > > > and
>> > > > > > > >> then
>> > > > > > > >> >> > sets
>> > > > > > > >> >> > > > them
>> > > > > > > >> >> > > > > > on
>> > > > > > > >> >> > > > > > > > the newly created AM container launch
>> > context.
>> > > > This
>> > > > > > is
>> > > > > > > >> >> purely
>> > > > > > > >> >> > > YARN
>> > > > > > > >> >> > > > > > > specific
>> > > > > > > >> >> > > > > > > > and cant't be spared. With my actual plans
>> > > > > standalone
>> > > > > > > >> can be
>> > > > > > > >> >> > > > changed
>> > > > > > > >> >> > > > > to
>> > > > > > > >> >> > > > > > > use
>> > > > > > > >> >> > > > > > > > the framework. By using it I mean no RM
>> > > specific
>> > > > > DTM
>> > > > > > or
>> > > > > > > >> >> > > whatsoever
>> > > > > > > >> >> > > > is
>> > > > > > > >> >> > > > > > > > needed.
>> > > > > > > >> >> > > > > > > >
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > If we have tokens available on the client
>> side,
>> > > why
>> > > > > do
>> > > > > > we
>> > > > > > > >> >> need to
>> > > > > > > >> >> > > set
>> > > > > > > >> >> > > > > > them
>> > > > > > > >> >> > > > > > > into the AM (yarn specific concept) launch
>> > > context?
>> > > > > Why
>> > > > > > > >> can't
>> > > > > > > >> >> we
>> > > > > > > >> >> > > > simply
>> > > > > > > >> >> > > > > > > send them to the JM, eg. as a parameter of
>> the
>> > > job
>> > > > > > > >> submission
>> > > > > > > >> >> /
>> > > > > > > >> >> > via
>> > > > > > > >> >> > > > > > > separate RPC call? There might be something
>> I'm
>> > > > > missing
>> > > > > > > >> due to
>> > > > > > > >> >> > > > limited
>> > > > > > > >> >> > > > > > > knowledge, but handling the token on the
>> > "cluster
>> > > > > > > >> framework"
>> > > > > > > >> >> > level
>> > > > > > > >> >> > > > > > doesn't
>> > > > > > > >> >> > > > > > > seem necessary.
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > [1]
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > >
>> > > > > > > >> >> > > > >
>> > > > > > > >> >> > > >
>> > > > > > > >> >> > >
>> > > > > > > >> >> >
>> > > > > > > >> >>
>> > > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/concepts/flink-architecture/#jobmanager
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > Best,
>> > > > > > > >> >> > > > > > > D.
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > On Fri, Jan 21, 2022 at 7:48 PM Gabor
>> Somogyi <
>> > > > > > > >> >> > > > > [email protected]
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > wrote:
>> > > > > > > >> >> > > > > > >
>> > > > > > > >> >> > > > > > > > Oh and one more thing. I'm planning to add
>> > this
>> > > > > > feature
>> > > > > > > >> in
>> > > > > > > >> >> > small
>> > > > > > > >> >> > > > > chunk
>> > > > > > > >> >> > > > > > of
>> > > > > > > >> >> > > > > > > > PRs because security is super hairy area.
>> > That
>> > > > way
>> > > > > > > >> reviewers
>> > > > > > > >> >> > can
>> > > > > > > >> >> > > be
>> > > > > > > >> >> > > > > > more
>> > > > > > > >> >> > > > > > > > easily obtains the concept.
>> > > > > > > >> >> > > > > > > >
>> > > > > > > >> >> > > > > > > > On Fri, 21 Jan 2022, 18:03 David Morávek,
>> <
>> > > > > > > >> [email protected]>
>> > > > > > > >> >> > > wrote:
>> > > > > > > >> >> > > > > > > >
>> > > > > > > >> >> > > > > > > > > Hi Gabor,
>> > > > > > > >> >> > > > > > > > >
>> > > > > > > >> >> > > > > > > > > thanks for drafting the FLIP, I think
>> > having
>> > > a
>> > > > > > solid
>> > > > > > > >> >> Kerberos
>> > > > > > > >> >> > > > > support
>> > > > > > > >> >> > > > > > > is
>> > > > > > > >> >> > > > > > > > > crucial for many enterprise deployments.
>> > > > > > > >> >> > > > > > > > >
>> > > > > > > >> >> > > > > > > > > I have multiple questions regarding the
>> > > > > > > implementation
>> > > > > > > >> >> (note
>> > > > > > > >> >> > > > that I
>> > > > > > > >> >> > > > > > > have
>> > > > > > > >> >> > > > > > > > > very limited knowledge of Kerberos):
>> > > > > > > >> >> > > > > > > > >
>> > > > > > > >> >> > > > > > > > > 1) If I understand it correctly, we'll
>> only
>> > > > > obtain
>> > > > > > > >> tokens
>> > > > > > > >> >> in
>> > > > > > > >> >> > > the
>> > > > > > > >> >> > > > > job
>> > > > > > > >> >> > > > > > > > > manager and then we'll distribute them
>> via
>> > > RPC
>> > > > > > (needs
>> > > > > > > >> to
>> > > > > > > >> >> be
>> > > > > > > >> >> > > > > secured).
>> > > > > > > >> >> > > > > > > > >
>> > > > > > > >> >> > > > > > > > > Can you please outline how the
>> > communication
>> > > > will
>> > > > > > > look
>> > > > > > > >> >> like?
>> > > > > > > >> >> > Is
>> > > > > > > >> >> > > > the
>> > > > > > > >> >> > > > > > > > > DelegationTokenManager going to be a
>> part
>> > of
>> > > > the
>> > > > > > > >> >> > > ResourceManager?
>> > > > > > > >> >> > > > > Can
>> > > > > > > >> >> > > > > > > you
>> > > > > > > >> >> > > > > > > > > outline it's lifecycle / how it's going
>> to
>> > be
>> > > > > > > >> integrated
>> > > > > > > >> >> > there?
>> > > > > > > >> >> > > > > > > > >
>> > > > > > > >> >> > > > > > > > > 2) Do we really need a YARN / k8s
>> specific
>> > > > > > > >> >> implementations?
>> > > > > > > >> >> > Is
>> > > > > > > >> >> > > it
>> > > > > > > >> >> > > > > > > > possible
>> > > > > > > >> >> > > > > > > > > to obtain / renew a token in a generic
>> way?
>> > > > Maybe
>> > > > > > to
>> > > > > > > >> >> rephrase
>> > > > > > > >> >> > > > that,
>> > > > > > > >> >> > > > > > is
>> > > > > > > >> >> > > > > > > it
>> > > > > > > >> >> > > > > > > > > possible to implement
>> > DelegationTokenManager
>> > > > for
>> > > > > > the
>> > > > > > > >> >> > standalone
>> > > > > > > >> >> > > > > > Flink?
>> > > > > > > >> >> > > > > > > If
>> > > > > > > >> >> > > > > > > > > we're able to solve this point, it
>> could be
>> > > > > > possible
>> > > > > > > to
>> > > > > > > >> >> > target
>> > > > > > > >> >> > > > all
>> > > > > > > >> >> > > > > > > > > deployment scenarios with a single
>> > > > > implementation.
>> > > > > > > >> >> > > > > > > > >
>> > > > > > > >> >> > > > > > > > > Best,
>> > > > > > > >> >> > > > > > > > > D.
>> > > > > > > >> >> > > > > > > > >
>> > > > > > > >> >> > > > > > > > > On Fri, Jan 14, 2022 at 3:47 AM Junfan
>> > Zhang
>> > > <
>> > > > > > > >> >> > > > > > [email protected]>
>> > > > > > > >> >> > > > > > > > > wrote:
>> > > > > > > >> >> > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > Hi G
>> > > > > > > >> >> > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > Thanks for your explain in detail. I
>> have
>> > > > > gotten
>> > > > > > > your
>> > > > > > > >> >> > > thoughts,
>> > > > > > > >> >> > > > > and
>> > > > > > > >> >> > > > > > > any
>> > > > > > > >> >> > > > > > > > > > way this proposal
>> > > > > > > >> >> > > > > > > > > > is a great improvement.
>> > > > > > > >> >> > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > Looking forward to your implementation
>> > and
>> > > i
>> > > > > will
>> > > > > > > >> keep
>> > > > > > > >> >> > focus
>> > > > > > > >> >> > > on
>> > > > > > > >> >> > > > > it.
>> > > > > > > >> >> > > > > > > > > > Thanks again.
>> > > > > > > >> >> > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > Best
>> > > > > > > >> >> > > > > > > > > > JunFan.
>> > > > > > > >> >> > > > > > > > > > On Jan 13, 2022, 9:20 PM +0800, Gabor
>> > > > Somogyi <
>> > > > > > > >> >> > > > > > > > [email protected]
>> > > > > > > >> >> > > > > > > > > >,
>> > > > > > > >> >> > > > > > > > > > wrote:
>> > > > > > > >> >> > > > > > > > > > > Just to confirm keeping
>> > > > > > > >> >> > > > > > "security.kerberos.fetch.delegation-token"
>> > > > > > > >> >> > > > > > > is
>> > > > > > > >> >> > > > > > > > > > added
>> > > > > > > >> >> > > > > > > > > > > to the doc.
>> > > > > > > >> >> > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > BR,
>> > > > > > > >> >> > > > > > > > > > > G
>> > > > > > > >> >> > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > On Thu, Jan 13, 2022 at 1:34 PM
>> Gabor
>> > > > > Somogyi <
>> > > > > > > >> >> > > > > > > > > [email protected]
>> > > > > > > >> >> > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > wrote:
>> > > > > > > >> >> > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > Hi JunFan,
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > > By the way, maybe this should be
>> > > added
>> > > > in
>> > > > > > the
>> > > > > > > >> >> > migration
>> > > > > > > >> >> > > > > plan
>> > > > > > > >> >> > > > > > or
>> > > > > > > >> >> > > > > > > > > > > > intergation section in the
>> FLIP-211.
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > Going to add this soon.
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > > Besides, I have a question that
>> the
>> > > KDC
>> > > > > > will
>> > > > > > > >> >> collapse
>> > > > > > > >> >> > > > when
>> > > > > > > >> >> > > > > > the
>> > > > > > > >> >> > > > > > > > > > cluster
>> > > > > > > >> >> > > > > > > > > > > > reached 200 nodes you described
>> > > > > > > >> >> > > > > > > > > > > > in the google doc. Do you have any
>> > > > > attachment
>> > > > > > > or
>> > > > > > > >> >> > > reference
>> > > > > > > >> >> > > > to
>> > > > > > > >> >> > > > > > > prove
>> > > > > > > >> >> > > > > > > > > it?
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > "KDC *may* collapse under some
>> > > > > circumstances"
>> > > > > > > is
>> > > > > > > >> the
>> > > > > > > >> >> > > proper
>> > > > > > > >> >> > > > > > > > wording.
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > We have several customers who are
>> > > > executing
>> > > > > > > >> >> workloads
>> > > > > > > >> >> > on
>> > > > > > > >> >> > > > > > > > Spark/Flink.
>> > > > > > > >> >> > > > > > > > > > Most
>> > > > > > > >> >> > > > > > > > > > > > of the time I'm facing their
>> > > > > > > >> >> > > > > > > > > > > > daily issues which is heavily
>> > > environment
>> > > > > and
>> > > > > > > >> >> use-case
>> > > > > > > >> >> > > > > > dependent.
>> > > > > > > >> >> > > > > > > > > I've
>> > > > > > > >> >> > > > > > > > > > > > seen various cases:
>> > > > > > > >> >> > > > > > > > > > > > * where the mentioned ~1k nodes
>> were
>> > > > > working
>> > > > > > > fine
>> > > > > > > >> >> > > > > > > > > > > > * where KDC thought the number of
>> > > > requests
>> > > > > > are
>> > > > > > > >> >> coming
>> > > > > > > >> >> > > from
>> > > > > > > >> >> > > > > DDOS
>> > > > > > > >> >> > > > > > > > > attack
>> > > > > > > >> >> > > > > > > > > > so
>> > > > > > > >> >> > > > > > > > > > > > discontinued authentication
>> > > > > > > >> >> > > > > > > > > > > > * where KDC was simply not
>> responding
>> > > > > because
>> > > > > > > of
>> > > > > > > >> the
>> > > > > > > >> >> > load
>> > > > > > > >> >> > > > > > > > > > > > * where KDC was intermittently had
>> > some
>> > > > > > outage
>> > > > > > > >> (this
>> > > > > > > >> >> > was
>> > > > > > > >> >> > > > the
>> > > > > > > >> >> > > > > > most
>> > > > > > > >> >> > > > > > > > > nasty
>> > > > > > > >> >> > > > > > > > > > > > thing)
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > Since you're managing relatively
>> big
>> > > > > cluster
>> > > > > > > then
>> > > > > > > >> >> you
>> > > > > > > >> >> > > know
>> > > > > > > >> >> > > > > that
>> > > > > > > >> >> > > > > > > KDC
>> > > > > > > >> >> > > > > > > > > is
>> > > > > > > >> >> > > > > > > > > > not
>> > > > > > > >> >> > > > > > > > > > > > only used by Spark/Flink workloads
>> > > > > > > >> >> > > > > > > > > > > > but the whole company IT
>> > infrastructure
>> > > > is
>> > > > > > > >> bombing
>> > > > > > > >> >> it
>> > > > > > > >> >> > so
>> > > > > > > >> >> > > it
>> > > > > > > >> >> > > > > > > really
>> > > > > > > >> >> > > > > > > > > > depends
>> > > > > > > >> >> > > > > > > > > > > > on other factors too whether KDC
>> is
>> > > > > reaching
>> > > > > > > >> >> > > > > > > > > > > > it's limit or not. Not sure what
>> kind
>> > > of
>> > > > > > > evidence
>> > > > > > > >> >> are
>> > > > > > > >> >> > you
>> > > > > > > >> >> > > > > > looking
>> > > > > > > >> >> > > > > > > > for
>> > > > > > > >> >> > > > > > > > > > but
>> > > > > > > >> >> > > > > > > > > > > > I'm not authorized to share any
>> > > > information
>> > > > > > > about
>> > > > > > > >> >> > > > > > > > > > > > our clients data.
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > > >> >> > > > > > > > > > > > One thing is for sure: the more external system types a workload uses
>> > > > > > > > >> >> > > > > > > > > > > > that authenticate through the KDC (e.g. HDFS, HBase, Hive, Kafka),
>> > > > > > > > >> >> > > > > > > > > > > > the more likely it is to reach this threshold once the cluster is
>> > > > > > > > >> >> > > > > > > > > > > > big enough.
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > > >> >> > > > > > > > > > > > All in all, this feature is here to help all users never reach this
>> > > > > > > > >> >> > > > > > > > > > > > limitation.
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > BR,
>> > > > > > > >> >> > > > > > > > > > > > G
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > > >> >> > > > > > > > > > > > On Thu, Jan 13, 2022 at 1:00 PM 张俊帆 <[email protected]> wrote:
>> > > > > > > >> >> > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > > Hi G
>> > > > > > > >> >> > > > > > > > > > > > >
>> > > > > > > > >> >> > > > > > > > > > > > > Thanks for your quick reply. I think keeping the
>> > > > > > > > >> >> > > > > > > > > > > > > *security.kerberos.fetch.delegation-token* config
>> > > > > > > > >> >> > > > > > > > > > > > > and simply disabling token fetching is a good idea.
>> > > > > > > > >> >> > > > > > > > > > > > > By the way, maybe this should be added to the migration
>> > > > > > > > >> >> > > > > > > > > > > > > plan or integration section of FLIP-211.
>> > > > > > > >> >> > > > > > > > > > > > >
>> > > > > > > > >> >> > > > > > > > > > > > > Besides, I have a question about the claim in the Google doc
>> > > > > > > > >> >> > > > > > > > > > > > > that the KDC will collapse when the cluster reaches 200 nodes.
>> > > > > > > > >> >> > > > > > > > > > > > > Do you have any attachment or reference to prove it?
>> > > > > > > > >> >> > > > > > > > > > > > > In our internal per-cluster, the number of nodes exceeds 1000
>> > > > > > > > >> >> > > > > > > > > > > > > and the KDC looks fine. Did I miss or misunderstand
>> > > > > > > > >> >> > > > > > > > > > > > > something? Please correct me.
>> > > > > > > >> >> > > > > > > > > > > > >
>> > > > > > > >> >> > > > > > > > > > > > > Best
>> > > > > > > >> >> > > > > > > > > > > > > JunFan.
>> > > > > > > > >> >> > > > > > > > > > > > > On Jan 13, 2022, 5:26 PM +0800, [email protected], wrote:
>> > > > > > > >> >> > > > > > > > > > > > > >
>> https://docs.google.com/document/d/1JzMbQ1pCJsLVz8yHrCxroYMRP2GwGwvacLrGyaIx5Yc/edit?fbclid=IwAR0vfeJvAbEUSzHQAAJfnWTaX46L6o7LyXhMfBUCcPrNi-uXNgoOaI8PMDQ
>> > > > > > > >> >> > > > > > > > > > > > >
