Hi G,

Thanks for the detailed explanation. I understand your points, and in any case this proposal is a great improvement.
Looking forward to your implementation; I will keep following it. Thanks again.

Best,
JunFan

On Jan 13, 2022, 9:20 PM +0800, Gabor Somogyi <gabor.g.somo...@gmail.com>, wrote:
> Just to confirm: keeping "security.kerberos.fetch.delegation-token" is added
> to the doc.
>
> BR,
> G
>
>
> On Thu, Jan 13, 2022 at 1:34 PM Gabor Somogyi <gabor.g.somo...@gmail.com>
> wrote:
>
> > Hi JunFan,
> >
> > > By the way, maybe this should be added in the migration plan or
> > > integration section in the FLIP-211.
> >
> > Going to add this soon.
> >
> > > Besides, I have a question about the claim in the Google doc that the
> > > KDC will collapse when the cluster reaches 200 nodes.
> > > Do you have any attachment or reference to prove it?
> >
> > "KDC *may* collapse under some circumstances" is the proper wording.
> >
> > We have several customers who are executing workloads on Spark/Flink. Most
> > of the time I'm facing their
> > daily issues, which are heavily environment- and use-case-dependent. I've
> > seen various cases:
> > * where the mentioned ~1k nodes were working fine
> > * where KDC thought the requests were coming from a DDoS attack, so it
> > discontinued authentication
> > * where KDC was simply not responding because of the load
> > * where KDC intermittently had outages (this was the nastiest
> > one)
> >
> > Since you're managing a relatively big cluster, you know that KDC is not
> > only used by Spark/Flink workloads
> > but the whole company IT infrastructure is bombarding it, so it really depends
> > on other factors too whether KDC is reaching
> > its limit or not. Not sure what kind of evidence you are looking for, but
> > I'm not authorized to share any information about
> > our clients' data.
> >
> > One thing is for sure. The more external system types are used in
> > workloads (e.g. HDFS, HBase, Hive, Kafka) which
> > authenticate through KDC, the more likely it is to reach this
> > threshold when the cluster is big enough.
> >
> > All in all, this feature is here to help all users never reach this
> > limitation.
> >
> > BR,
> > G
> >
> >
> > On Thu, Jan 13, 2022 at 1:00 PM 张俊帆 <zuston.sha...@gmail.com> wrote:
> >
> > > Hi G
> > >
> > > Thanks for your quick reply. I think keeping the config
> > > *security.kerberos.fetch.delegation-token*
> > > and simply disabling the token fetching is a good idea. By the way,
> > > maybe this should be added
> > > in the migration plan or integration section in the FLIP-211.
> > >
> > > Besides, I have a question about the claim in the Google doc that the
> > > KDC will collapse when the cluster reaches 200 nodes.
> > > Do you have any attachment or reference to prove it?
> > > Because in our internal per-cluster,
> > > the number of nodes exceeds 1000 and the KDC looks fine. Did I miss or
> > > misunderstand something? Please correct me.
> > >
> > > Best
> > > JunFan.
> > > On Jan 13, 2022, 5:26 PM +0800, dev@flink.apache.org, wrote:
> > > >
> > > > https://docs.google.com/document/d/1JzMbQ1pCJsLVz8yHrCxroYMRP2GwGwvacLrGyaIx5Yc/edit?fbclid=IwAR0vfeJvAbEUSzHQAAJfnWTaX46L6o7LyXhMfBUCcPrNi-uXNgoOaI8PMDQ
> > > >