Just to confirm: keeping "security.kerberos.fetch.delegation-token" has been added to the doc.

BR,
G

On Thu, Jan 13, 2022 at 1:34 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:

> Hi JunFan,
>
> > By the way, maybe this should be added in the migration plan or
> > integration section in the FLIP-211.
>
> Going to add this soon.
>
> > Besides, I have a question that the KDC will collapse when the cluster
> > reached 200 nodes you described in the google doc. Do you have any
> > attachment or reference to prove it?
>
> "KDC *may* collapse under some circumstances" is the proper wording.
>
> We have several customers who are executing workloads on Spark/Flink. Most
> of the time I'm dealing with their daily issues, which are heavily
> environment and use-case dependent. I've seen various cases:
> * where the mentioned ~1k nodes were working fine
> * where the KDC thought the number of requests was coming from a DDoS
>   attack, so it discontinued authentication
> * where the KDC was simply not responding because of the load
> * where the KDC intermittently had outages (this was the nastiest one)
>
> Since you're managing a relatively big cluster, you know that the KDC is
> not only used by Spark/Flink workloads: the whole company IT infrastructure
> is bombarding it, so whether the KDC reaches its limit really depends on
> other factors too. I'm not sure what kind of evidence you are looking for,
> but I'm not authorized to share any information about our clients' data.
>
> One thing is for sure: the more external system types used in workloads
> (e.g. HDFS, HBase, Hive, Kafka) that authenticate through the KDC, the more
> likely it is to reach this threshold when the cluster is big enough.
>
> All in all, this feature is here to help all users never reach this
> limitation.
>
> BR,
> G
>
> On Thu, Jan 13, 2022 at 1:00 PM 张俊帆 <zuston.sha...@gmail.com> wrote:
>
>> Hi G
>>
>> Thanks for your quick reply.
>> I think keeping the *security.kerberos.fetch.delegation-token* config
>> and simply disabling the token fetching is a good idea. By the way,
>> maybe this should be added in the migration plan or integration section
>> in the FLIP-211.
>>
>> Besides, I have a question about the claim in the Google doc that the
>> KDC will collapse when the cluster reaches 200 nodes. Do you have any
>> attachment or reference to prove it? In our internal per-cluster, the
>> node count exceeds 1000 and the KDC looks good. Did I miss or
>> misunderstand something? Please correct me.
>>
>> Best
>> JunFan.
>> On Jan 13, 2022, 5:26 PM +0800, dev@flink.apache.org, wrote:
>> >
>> > https://docs.google.com/document/d/1JzMbQ1pCJsLVz8yHrCxroYMRP2GwGwvacLrGyaIx5Yc/edit?fbclid=IwAR0vfeJvAbEUSzHQAAJfnWTaX46L6o7LyXhMfBUCcPrNi-uXNgoOaI8PMDQ
>> >