Thanks for suggesting options to fulfil the need. Users are looking for a solution in which individual users can be identified across the whole cluster and access to resources/actions can be restricted per user. A good example of such an action is cancelling other users' running jobs.
* SSL does provide mutual authentication, but once authentication has passed no user-based restrictions can be applied.
* The less problematic part is that generating/maintaining short-lived certificates would be hard (that's the reason KDC-like servers exist). Having long-lived certificates would widen the attack surface, but since the first concern already applies this is just a cosmetic issue.

All in all, using TLS certificates is unfortunately not sufficient in these environments.

BR,
G

On Thu, Jun 3, 2021 at 12:49 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Thanks for the information Gabor. If it is about securing the
> communication between the REST client and the REST server, then Flink
> already supports enabling mutual SSL authentication [1]. Would this be
> enough to secure the communication and to pass an audit?
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/security/security-ssl/#external--rest-connectivity
>
> Cheers,
> Till
>
> On Thu, Jun 3, 2021 at 10:33 AM Gabor Somogyi <gabor.g.somo...@gmail.com>
> wrote:
>
>> Hi Till,
>>
>> Since I've been working in the security area for 10+ years, let me share
>> my thoughts.
>> I would like to emphasise that there are experts better than me, but I
>> have some basics.
>> The discussion is open and I'm not trying to decide things alone...
>>
>> > I mean if an attacker can get access to one of the machines, then it
>> > should also be possible to obtain the right Kerberos token.
>> Not necessarily. For example, if one gets access to a specific user's
>> credentials, then it's not possible to compromise other users' jobs, data,
>> etc...
>> Security is like an onion: the more layers have been added, the more time
>> an attacker needs to proceed.
>> At the end of the day, if one is in, then one can most probably find a way,
>> but this time is normally enough for sysadmins or security experts to
>> close down the system and minimize the damage.
>>
>> The other thing is that all tokens have a timeout, and if the token is
>> invalid then the attacker can't proceed further.
>>
>> > Is Kerberos also the standard authentication protocol for Kubernetes
>> > deployments?
>> Kerberos is an industry standard which is cloud/deployment agnostic, and it
>> can be used in any deployment, including k8s.
>> The main intention is to use Kerberos in k8s deployments too, since we're
>> going in this direction as well.
>> Please see how Spark does this:
>>
>> https://spark.apache.org/docs/latest/security.html#secure-interaction-with-kubernetes
>>
>> Last but not least, the most important reason to add at least one strong
>> authentication mechanism is that we have users who have hard requirements
>> on this. They're doing security audits, and if they fail then it's a deal
>> breaker.
>> That is why we added Kerberos in the first place. Unfortunately we can't
>> name them on this public list; however, the customers who specifically
>> asked for this were mainly in the banking and telco sectors.
>>
>> BR,
>> G
>>
>>
>> On Thu, Jun 3, 2021 at 9:20 AM Till Rohrmann <trohrm...@apache.org>
>> wrote:
>>
>> > Thanks for updating the document Márton. Why is it that banks will
>> > consider it more secure if Flink comes with Kerberos authentication
>> > (assuming a properly secured setup)? I mean if an attacker can get access
>> > to one of the machines, then it should also be possible to obtain the
>> > right Kerberos token.
>> >
>> > I am not an authentication expert, and that's why I wanted to ask what
>> > authentication protocols there are other than Kerberos. Why did we select
>> > Kerberos and not any other authentication protocol? Maybe you can list
>> > the pros and cons for the different protocols. Is Kerberos also the
>> > standard authentication protocol for Kubernetes deployments? If not, what
>> > would be the answer when deploying on K8s?
>> >
>> > Cheers,
>> > Till
>> >
>> > On Wed, Jun 2, 2021 at 12:07 PM Gabor Somogyi <gabor.g.somo...@gmail.com>
>> > wrote:
>> >
>> >> Hi team,
>> >>
>> >> Happy to be here and hope I can provide quality additions in the future.
>> >>
>> >> Thank you all for the helpful suggestions!
>> >> Considering them, the FLIP has been modified and the work continues on
>> >> the already existing Jira.
>> >>
>> >> BR,
>> >> G
>> >>
>> >>
>> >> On Wed, Jun 2, 2021 at 11:23 AM Márton Balassi <balassi.mar...@gmail.com>
>> >> wrote:
>> >>
>> >>> Thanks, Chesnay - I totally missed that. Answered on the ticket too, let
>> >>> us continue there then.
>> >>>
>> >>> Till, I agree that we should keep this codepath as slim as possible. It
>> >>> is an important design decision that we aim to keep the list of
>> >>> authentication protocols to a minimum. We believe that this should not
>> >>> be a primary concern of Flink and that a trusted proxy service (for
>> >>> example Apache Knox) should be used to enable a multitude of end-user
>> >>> authentication mechanisms. The bare minimum of authentication mechanisms
>> >>> to support consequently consists of a single strong authentication
>> >>> protocol, for which Kerberos is the enterprise solution, and HTTP Basic,
>> >>> primarily for development and light-weight scenarios.
>> >>>
>> >>> Added the above wording to G's doc.
>> >>>
>> >>> https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit
>> >>>
>> >>>
>> >>> On Tue, Jun 1, 2021 at 11:47 AM Chesnay Schepler <ches...@apache.org>
>> >>> wrote:
>> >>>
>> >>>> There's a related effort:
>> >>>> https://issues.apache.org/jira/browse/FLINK-21108
>> >>>>
>> >>>> On 6/1/2021 10:14 AM, Till Rohrmann wrote:
>> >>>> > Hi Gabor, welcome to the Flink community!
>> >>>> >
>> >>>> > Thanks for sharing this proposal with the community Márton.
>> >>>> > In general, I agree that authentication is missing and that this is
>> >>>> > required for using Flink within an enterprise. The thing I am
>> >>>> > wondering is whether this feature strictly needs to be implemented
>> >>>> > inside of Flink or whether a proxy setup could do the job. Have you
>> >>>> > considered this option? If yes, then it would be good to list it
>> >>>> > under the point of rejected alternatives.
>> >>>> >
>> >>>> > I do see the benefit of implementing this feature inside of Flink if
>> >>>> > many users need it. If not, then it might be easier for the project
>> >>>> > to not increase the surface area, since it makes the overall
>> >>>> > maintenance harder.
>> >>>> >
>> >>>> > Cheers,
>> >>>> > Till
>> >>>> >
>> >>>> > On Mon, May 31, 2021 at 4:57 PM Márton Balassi <mbala...@apache.org>
>> >>>> > wrote:
>> >>>> >
>> >>>> >> Hi team,
>> >>>> >>
>> >>>> >> Firstly, I would like to introduce Gabor, or G [1] for short, to the
>> >>>> >> community. He is a Spark committer who has recently transitioned to
>> >>>> >> the Flink Engineering team at Cloudera and is looking forward to
>> >>>> >> contributing to Apache Flink. Previously G primarily focused on
>> >>>> >> Spark Streaming and security.
>> >>>> >>
>> >>>> >> Based on requests from our customers, G has implemented Kerberos and
>> >>>> >> HTTP Basic Authentication for the Flink Dashboard and HistoryServer.
>> >>>> >> Previously they lacked an authentication story.
>> >>>> >>
>> >>>> >> We are looking to contribute this functionality back to the
>> >>>> >> community; we believe that given Flink's maturity there should be a
>> >>>> >> common code solution for this general pattern.
>> >>>> >>
>> >>>> >> We are looking forward to your feedback on G's design.
>> >>>> >> [2]
>> >>>> >>
>> >>>> >> [1] http://gaborsomogyi.com/
>> >>>> >> [2]
>> >>>> >> https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit