Hi team,

Thank you for your input. Based on this discussion I agree with G that selecting and standardizing on a specific strong authentication mechanism is more challenging than the whole rest of the scope of this authentication story. :-)

I suggest that G and I go back to the drawing board and come up with an API that can support multiple authentication mechanisms, and we would only merge said API to Flink. Specific implementations of it can be maintained outside of the project. This way we tackle the main challenge in a truly minimal way.
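
To make the idea a bit more concrete, here is a minimal sketch of what such a pluggable API might look like. It is only an illustration under assumptions: the names `RestAuthenticator`, `StaticTokenAuthenticator` and `AuthenticatorRegistry` are hypothetical and not part of Flink, and the real interface would come out of the FLIP discussion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical SPI: Flink would ship only this interface; each provider
// supplies its own implementation (Kerberos, OIDC, ...) as an additional jar.
interface RestAuthenticator {
    // Scheme this authenticator handles, matched against the first token
    // of the Authorization header (for example "Negotiate" or "Bearer").
    String scheme();

    // Returns the authenticated principal, or empty if the credentials are rejected.
    Optional<String> authenticate(String credentials);
}

// Toy implementation for illustration only; this is not a strong mechanism.
final class StaticTokenAuthenticator implements RestAuthenticator {
    private final Map<String, String> tokenToUser;

    StaticTokenAuthenticator(Map<String, String> tokenToUser) {
        this.tokenToUser = Map.copyOf(tokenToUser);
    }

    @Override
    public String scheme() {
        return "Bearer";
    }

    @Override
    public Optional<String> authenticate(String credentials) {
        return Optional.ofNullable(tokenToUser.get(credentials));
    }
}

// Dispatches a request to whichever registered authenticator claims the scheme.
final class AuthenticatorRegistry {
    private final List<RestAuthenticator> authenticators = new ArrayList<>();

    void register(RestAuthenticator authenticator) {
        authenticators.add(authenticator);
    }

    Optional<String> authenticate(String authorizationHeader) {
        if (authorizationHeader == null) {
            return Optional.empty();
        }
        // Split "<scheme> <credentials>" into its two parts.
        String[] parts = authorizationHeader.trim().split("\\s+", 2);
        if (parts.length != 2) {
            return Optional.empty();
        }
        for (RestAuthenticator a : authenticators) {
            if (a.scheme().equalsIgnoreCase(parts[0])) {
                return a.authenticate(parts[1]);
            }
        }
        return Optional.empty(); // no plugin registered for this scheme: reject
    }
}
```

In such a design the implementations could be discovered from additional jars, for example with `java.util.ServiceLoader`; that part is left out of the sketch.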

Best,
Marton

On Mon, Jun 21, 2021 at 4:18 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:

> Hi All,
>
> We see that adding any kind of specific authentication raises more questions than answers.
> What if a generic API were added without any real authentication logic?
> That way every provider can add its own protocol implementation as an additional jar.
>
> BR,
> G
>
> On Thu, Jun 17, 2021 at 7:53 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>
>> Hi all,
>>
>> Sorry to be joining the conversation late. I'm also on the side of Konstantin, generally, in that this seems to not be a core goal of Flink as a project and adds a maintenance burden.
>>
>> Would another con of Kerberos be that it is likely a fading project in terms of network security? (serious question, please correct me if there is reason to believe it is gaining adoption)
>>
>> The point about Kerberos being independent of infrastructure is a good one but is something that is also solved by modern sidecar proxies + service meshes that can run across Kubernetes and bare-metal. These solutions also handle certificate provisioning, rotation, etc. in addition to higher-level authorization policies. Some examples of projects with this "universal infrastructure support" are Kuma[1] (CNCF Sandbox, I'm a maintainer) and Istio[2] (Google).
>>
>> Wondering out loud: has anyone tried to run Flink on top of cilium[3], which also provides zero-trust networking at the kernel level without needing to instrument applications? This currently only runs on Kubernetes on Linux, so that's a major limitation, but solves many of the request forging concerns at all levels.
>>
>> Thanks,
>> Austin
>>
>> [1]: https://kuma.io/docs/1.1.6/quickstart/universal/
>> [2]: https://istio.io/latest/docs/setup/install/virtual-machine/
>> [3]: https://cilium.io/
>>
>> On Thu, Jun 17, 2021 at 1:50 PM Till Rohrmann <trohrm...@apache.org> wrote:
>>
>>> I left some comments in the Google document. It would be great if someone from the community with security experience could also take a look at it. Maybe Eron you have an opinion on the topic.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Thu, Jun 17, 2021 at 6:57 PM Till Rohrmann <trohrm...@apache.org> wrote:
>>>
>>>> Hi Gabor,
>>>>
>>>> I haven't found time to look into the updated FLIP yet. I'll try to do it asap.
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On Wed, Jun 16, 2021 at 9:35 PM Konstantin Knauf <kna...@apache.org> wrote:
>>>>
>>>>> Hi Gabor,
>>>>>
>>>>>> However representing Kerberos as completely new feature is not true because it's already in since Flink makes authentication at least with HDFS and HBase through Kerberos.
>>>>>
>>>>> True, that is one way to look at it, but there are differences, too: Control Plane vs Data Plane, Core vs Connectors.
>>>>>
>>>>>> Adding OIDC or OAuth2 has the exact same concerns what you've guys just raised. Why exactly these? If you think this would be beneficial we can discuss it in detail
>>>>>
>>>>> That's exactly my point. Once we start adding authx support, we will sooner or later discuss other options besides Kerberos, too. A user who would like to use OAuth cannot easily use Kerberos, right?
>>>>> That is one of the reasons I am skeptical about adding initial authx support.
>>>>>
>>>>>> Related authorization you've mentioned it can be complicated over time. Can you show us an example?
>>>>>> We've knowledge with couple of open source components but authorization was never a horror complex story. I personally have the most experience with Spark which I think is quite simple and stable. Users can be viewers/admins and jobs started by others can't be modified. If you can share an example over-complication we can discuss on facts.
>>>>>
>>>>> Authorization is a new aspect that needs to be considered for every addition to the REST API. In the future users might ask for additional roles (e.g. an editor), user-defined roles and you've already mentioned job-level permissions yourself. And keep in mind that there might also be larger additions in the future like the flink-sql-gateway. Contributions like this become more expensive the more aspects we need to consider.
>>>>>
>>>>> In general, I believe, it is important that the community focuses its efforts where we can generate the most value to the user and - personally - I don't think there is much to gain by extending Flink's scope in that direction. Of course, this is not black and white and there are other valid opinions.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Konstantin
>>>>>
>>>>> On Wed, Jun 16, 2021 at 7:38 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>>>>>
>>>>>> Hi Konstantin,
>>>>>>
>>>>>> Thanks for the response. Related new feature introduction in case of Basic auth I tend to agree, anything else can be chosen.
>>>>>>
>>>>>> However representing Kerberos as completely new feature is not true because it's already in since Flink makes authentication at least with HDFS and HBase through Kerberos.
>>>>>> The main problem with the actual Kerberos implementation is that it contains several bugs and is only partially implemented.
>>>>>> Following your suggestion can we agree that we skip the Basic auth implementation and finish an already started Kerberos story by adding History Server and Job Dashboard authentication?
>>>>>>
>>>>>> Adding OIDC or OAuth2 has the exact same concerns what you've guys just raised. Why exactly these? If you think this would be beneficial we can discuss it in detail but as a side story it would be good to finish a halfway done Kerberos story.
>>>>>>
>>>>>> Related authorization you've mentioned it can be complicated over time. Can you show us an example? We've knowledge with couple of open source components but authorization was never a horror complex story. I personally have the most experience with Spark which I think is quite simple and stable. Users can be viewers/admins and jobs started by others can't be modified. If you can share an example over-complication we can discuss on facts.
>>>>>>
>>>>>> Thank you in advance!
>>>>>>
>>>>>> BR,
>>>>>> G
>>>>>>
>>>>>> On Wed, Jun 16, 2021 at 5:42 PM Konstantin Knauf <kna...@apache.org> wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> sorry for joining late and thanks for the insightful discussion.
>>>>>>>
>>>>>>> In general, I'd personally prefer not to increase the surface area of Apache Flink unless there is a good reason. It seems we all agree that authx is not part of the core value proposition of Apache Flink, so if we can delegate this problem to a more specialized tool, I am in favor of that. Apache Flink is already huge and a lot of work goes into maintenance, so I personally have become more sensitive to this aspect over time.
>>>>>>>
>>>>>>> If we add support for Basic Auth and Kerberos now, users will sooner or later ask for OIDC, LDAP, SAML,... I acknowledge that Kerberos is widely used in the corporate, on-premises context, but isn't the focus moving more towards more web-friendly standards like OIDC/OAuth 2.0? If we only want to support a single protocol, there is an argument to be made that it should be OIDC and Dex [1,2] as a bridge to everything else. Have OIDC or OAuth2 been considered instead of Kerberos? How do you see the market moving? But as I said before, in my opinion we can generate more value by investing into other areas of Apache Flink.
>>>>>>>
>>>>>>> Authorization also has the potential to become more fine-grained and complex over time: you already mentioned restricting the actions that a specific user can do in a cluster.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Konstantin
>>>>>>>
>>>>>>> [1] https://github.com/dexidp/dex
>>>>>>> [2] https://github.com/dexidp/dex/issues/1903
>>>>>>>
>>>>>>> On Wed, Jun 16, 2021 at 11:44 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Till,
>>>>>>>>
>>>>>>>> Did you have the chance to take a look at the doc? Not yet seen any update.
>>>>>>>>
>>>>>>>> BR,
>>>>>>>> G
>>>>>>>>
>>>>>>>> On Wed, Jun 9, 2021 at 1:43 PM Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the update Gabor. I'll take a look and respond in the document.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Till
>>>>>>>>>
>>>>>>>>> On Wed, Jun 9, 2021 at 12:59 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Till,
>>>>>>>>>>
>>>>>>>>>> Your proxy suggestion has been considered in-depth and updated the FLIP accordingly.
>>>>>>>>>> We've considered 2 proxy implementations (Nginx and Squid) but according to our analysis and testing they are not suitable for the mentioned use-cases.
>>>>>>>>>> Please take a look at the rejected alternatives for detailed explanation.
>>>>>>>>>>
>>>>>>>>>> Thanks for your time in advance!
>>>>>>>>>>
>>>>>>>>>> BR,
>>>>>>>>>> G
>>>>>>>>>>
>>>>>>>>>> On Fri, Jun 4, 2021 at 3:31 PM Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> As I've said I am not a security expert and that's why I have to ask for clarification, Gabor. You are saying that if we configure a truststore for the REST endpoint with a single trusted certificate which has been generated by the operator of the Flink cluster, then the attacker can generate a new certificate, sign it and then talk to the Flink cluster if he has access to the node on which the REST endpoint runs? My understanding was that you need the corresponding private key which in my proposed setup would be under the control of the operator as well (e.g. stored in a keystore on the same machine but guarded by some secret).
>>>>>>>>>>> That way (if I am not mistaken), only the entity which has access to the keystore is able to talk to the Flink cluster.
>>>>>>>>>>>
>>>>>>>>>>> Maybe we are also getting our wires crossed here and are talking about different things.
>>>>>>>>>>>
>>>>>>>>>>> Thanks for listing the pros and cons of Kerberos. Concerning what other authentication mechanisms are used in the industry, I am not 100% sure.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Till
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jun 4, 2021 at 11:09 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>> I did not mean for the user to sign its own certificates but for the operator of the cluster. Once the user request hits the proxy, it should no longer be under his control. I think I do not fully understand yet why this would not work.
>>>>>>>>>>>> I said it's not solving the authentication problem over any proxy. Even if the operator is signing the certificate one can have access to an internal node.
>>>>>>>>>>>> In such a case anybody can craft certificates which are accepted by the server. When it's accepted a bad guy can cancel jobs causing huge impacts.
>>>>>>>>>>>>
>>>>>>>>>>>>> Also, I am missing a bit the comparison of Kerberos to other authentication mechanisms and why they were rejected in favour of Kerberos.
>>>>>>>>>>>> PROS:
>>>>>>>>>>>> * Since it's not depending on cloud provider and/or k8s or bare-metal etc.
>>>>>>>>>>>>   deployment it's the biggest plus
>>>>>>>>>>>> * Centralized with tools and no need to write tons of tools around
>>>>>>>>>>>> * There are clients/tools on almost all OS-es and several languages
>>>>>>>>>>>> * Super huge users are using it for years in production w/o huge issues
>>>>>>>>>>>> * Provides cross-realm trust possibility amongst other features
>>>>>>>>>>>> * Several open source components using it which could increase compatibility
>>>>>>>>>>>>
>>>>>>>>>>>> CONS:
>>>>>>>>>>>> * Not everybody is using Kerberos
>>>>>>>>>>>> * It would increase the code footprint but this is true for many features (as a side note I'm here to maintain it)
>>>>>>>>>>>>
>>>>>>>>>>>> Feel free to add your points because it only represents a single viewpoint.
>>>>>>>>>>>> Also if you have any better option for strong authentication please share it and we can consider the pros/cons here.
>>>>>>>>>>>>
>>>>>>>>>>>> BR,
>>>>>>>>>>>> G
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jun 4, 2021 at 10:32 AM Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I did not mean for the user to sign its own certificates but for the operator of the cluster. Once the user request hits the proxy, it should no longer be under his control. I think I do not fully understand yet why this would not work.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What I would like to avoid is to add more complexity into Flink if there is an easy solution which fulfills the requirements. That's why I would like to exercise thoroughly through the different alternatives.
>>>>>>>>>>>>> Also, I am missing a bit the comparison of Kerberos to other authentication mechanisms and why they were rejected in favour of Kerberos.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Till
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jun 4, 2021 at 10:26 AM Gyula Fóra <gyf...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think there might be possible alternatives but it seems Kerberos on the rest endpoint ticks all the right boxes and provides a super clean and simple solution for strong authentication.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I wouldn’t even consider sidecar proxies etc if we can solve it in such a simple way as proposed by G.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>> Gyula
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, 4 Jun 2021 at 10:03, Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am not saying that we shouldn't add a strong authentication mechanism if there are good reasons for it. I primarily would like to understand the context a bit better in order to give qualified feedback and come to a good decision. In order to do this, I have the feeling that we haven't fully considered all available options which are on the table, tbh.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does the problem of certificate expiry also apply for self-signed certificates? If yes, then this should then also be a problem for the internal encryption of Flink's communication.
>>>>>>>>>>>>>>> If not, then one could use self-signed certificates with a longer validity to solve the mentioned issue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think you can set up Flink in such a way that you don't have to handle all the different certificates. For example, you could deploy Flink with a "sidecar proxy" which is responsible for the authentication using an arbitrary method (e.g. Kerberos) and then bind the REST endpoint to a local network interface. That way, the REST endpoint would only be available through the sidecar proxy. Additionally, one could enable SSL for this communication. Would this be a solution for the problem?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Jun 3, 2021 at 10:46 PM Márton Balassi <balassi.mar...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That is an interesting idea, Till.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The main issue with it is that TLS certificates have an expiration time, usually they get approved for a couple years. Forcing our users to restart jobs to reprovision TLS certificates would be weird when we could just implement a single proper strong authentication mechanism instead in a couple hundred lines of code.
>>>>>>>>>>>>>>>> :-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In many cases it is also impractical to go the TLS mutual route, because the Flink Dashboard can end up on any node in the k8s/Yarn cluster which means that we need a certificate per node (due to the mutual auth), but if we also want to protect the private key of these from users accidentally or intentionally leaking them then we need this per user. As in we end up managing user*machine number certificates and having to renew them periodically, which albeit automatable is unfortunately not yet automated in all large organizations.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I fully agree that TLS certificate mutual authentication has its nice properties, especially at very large (multiple thousand node) clusters - but it has its own challenges too. Thanks for bringing it up.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Happy to have this added to the rejected alternative list so that we have the full picture documented.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Jun 3, 2021 at 5:52 PM Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I guess the idea would then be to let the proxy do the authentication job and only forward the request via an SSL mutually encrypted connection to the Flink cluster. Would this be possible?
>>>>>>>>>>>>>>>>> The beauty of this setup is in my opinion that this setup should work with all kinds of authentication mechanisms.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jun 3, 2021 at 3:12 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for giving options to fulfil the need.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Users are looking for a solution where users can be identified on the whole cluster and restrict access to resources/actions.
>>>>>>>>>>>>>>>>>> A good example for such an action is cancelling other users' running jobs.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * SSL does provide mutual authentication but once authentication has passed no user-based restrictions can be made.
>>>>>>>>>>>>>>>>>> * The less problematic part is that generating/maintaining short time valid certificates would be hard (that's the reason KDC like servers exist).
>>>>>>>>>>>>>>>>>>   Having long time valid certificates would widen the attack surface but since the first concern is there this is just a cosmetic issue.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> All in all using TLS certificates is not sufficient in these environments unfortunately.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>>>>>> G
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jun 3, 2021 at 12:49 PM Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for the information Gabor. If it is about securing the communication between the REST client and the REST server, then Flink already supports enabling mutual SSL authentication [1]. Would this be enough to secure the communication and to pass an audit?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/security/security-ssl/#external--rest-connectivity
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Jun 3, 2021 at 10:33 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Till,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Since I'm working in security area 10+ years let me share my thought.
>>>>>>>>>>>>>>>>>>>> I would like to emphasise there are experts better than me but I have some basics.
>>>>>>>>>>>>>>>>>>>> The discussion is open and not trying to tell alone things...
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I mean if an attacker can get access to one of the machines, then it should also be possible to obtain the right Kerberos token.
>>>>>>>>>>>>>>>>>>>> Not necessarily.
>>>>>>>>>>>>>>>>>>>> For example if one gets access to a specific user's credentials then it's not possible to compromise other users' jobs, data, etc...
>>>>>>>>>>>>>>>>>>>> Security is like an onion, the more layers have been added the more time an attacker needs to proceed.
>>>>>>>>>>>>>>>>>>>> At the end of the day if one is in, then most probably can find the way but this time is normally enough for sysadmins or security experts to close down the system and minimize the damage.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The other thing is that all tokens have a timeout and if the token is invalid then the attacker can't proceed further.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Is Kerberos also the standard authentication protocol for Kubernetes deployments?
>>>>>>>>>>>>>>>>>>>> Kerberos is an industry standard which is cloud/deployment agnostic and it can be used in any deployments including k8s.
>>>>>>>>>>>>>>>>>>>> The main intention is to use Kerberos in k8s deployments too since we're going this direction as well.
>>>>>>>>>>>>>>>>>>>> Please see how Spark does this:
>>>>>>>>>>>>>>>>>>>> https://spark.apache.org/docs/latest/security.html#secure-interaction-with-kubernetes
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Last but not least the most important reason to add at least one strong authentication is that we have users who have hard requirements on this. They're doing security audits and if they fail then it's deal breaking.
>>>>>>>>>>>>>>>>>>>> That is why we have added Kerberos in the first place. Unfortunately we can't name them in this public list, however the customers who specifically asked for this were mainly in the banking and telco sector.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>>>>>>>> G
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Jun 3, 2021 at 9:20 AM Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks for updating the document Márton. Why is it that banks will consider it more secure if Flink comes with Kerberos authentication (assuming a properly secured setup)? I mean if an attacker can get access to one of the machines, then it should also be possible to obtain the right Kerberos token.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I am not an authentication expert and that's why I wanted to ask what are other authentication protocols other than Kerberos? Why did we select Kerberos and not any other authentication protocol? Maybe you can list the pros and cons for the different protocols. Is Kerberos also the standard authentication protocol for Kubernetes deployments? If not, what would be the answer when deploying on K8s?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 2, 2021 at 12:07 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi team,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Happy to be here and hope I can provide quality additions in the future.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thank you all for the helpful suggestions!
>>>>>>>>>>>>>>>>>>>>>> Considering them the FLIP has been modified and the work continues on the already existing Jira.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>>>>>>>>>> G
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 2, 2021 at 11:23 AM Márton Balassi <balassi.mar...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks, Chesnay - I totally missed that. Answered on the ticket too, let us continue there then.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Till, I agree that we should keep this codepath as slim as possible. It is an important design decision that we aim to keep the list of authentication protocols to a minimum. We believe that this should not be a primary concern of Flink and a trusted proxy service (for example Apache Knox) should be used to enable a multitude of enduser authentication mechanisms. The bare minimum of authentication mechanisms to support consequently consists of a single strong authentication protocol for which Kerberos is the enterprise solution and HTTP Basic primarily for development and light-weight scenarios.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Added the above wording to G's doc.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 1, 2021 at 11:47 AM Chesnay Schepler <ches...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> There's a related effort:
>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-21108
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 6/1/2021 10:14 AM, Till Rohrmann wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Hi Gabor, welcome to the Flink community!
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for sharing this proposal with the community Márton. In general, I agree that authentication is missing and that this is required for using Flink within an enterprise. The thing I am wondering is whether this feature strictly needs to be implemented inside of Flink or whether a proxy setup could do the job? Have you considered this option? If yes, then it would be good to list it under the point of rejected alternatives.
>>>> >
>>>> > I do see the benefit of implementing this feature inside of Flink if
>>>> > many users need it. If not, then it might be easier for the project
>>>> > to not increase the surface area since it makes the overall
>>>> > maintenance harder.
>>>> >
>>>> > Cheers,
>>>> > Till
>>>> >
>>>> > On Mon, May 31, 2021 at 4:57 PM Márton Balassi <mbala...@apache.org>
>>>> > wrote:
>>>> >
>>>> >> Hi team,
>>>> >>
>>>> >> Firstly I would like to introduce Gabor, or G [1] for short, to the
>>>> >> community. He is a Spark committer who has recently transitioned to
>>>> >> the Flink Engineering team at Cloudera and is looking forward to
>>>> >> contributing to Apache Flink. Previously G primarily focused on
>>>> >> Spark Streaming and security.
>>>> >>
>>>> >> Based on requests from our customers G has implemented Kerberos and
>>>> >> HTTP Basic Authentication for the Flink Dashboard and HistoryServer.
>>>> >> These previously lacked an authentication story.
>>>> >>
>>>> >> We are looking to contribute this functionality back to the
>>>> >> community. We believe that given Flink's maturity there should be a
>>>> >> common code solution for this general pattern.
>>>> >>
>>>> >> We are looking forward to your feedback on G's design. [2]
>>>> >>
>>>> >> [1] http://gaborsomogyi.com/
>>>> >> [2] https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk