Hi Jack,

Here's my comments:

1. I don't think we should remove the oauth2 endpoint directly like
this. I would first deprecate the endpoint and plan the removal in the
spec v2.
4. I agree, and it has to be pluggable.

I updated the REST Spec v2 proposal including first steps on v1:
https://docs.google.com/document/d/1JUtFpdEoa6IAKt1EzJi_re0PUbh56XnfUtRe5WAfl0s/edit?usp=sharing

As already shared on the mailing list, I'm working on a PR to have
interfaces with JAXRS/Swagger annotations to generate OpenAPI
JSON/YAML with the swagger-gradle-plugin.

Thanks,
Regards
JB

On Wed, May 29, 2024 at 8:03 PM Jack Ye <yezhao...@gmail.com> wrote:
>
> Just to reiterate my points discussed in the community sync here: the more I 
> think about it the more I agree the OAuth endpoint should be removed from the 
> REST spec. Even though the endpoint is optional, and even if we do not care 
> about the security concerns, it still provides users an impression that the 
> endpoint "should" be implemented, or "is the preferred authentication 
> mechanism". And as we have found out, the server capability proposal does not 
> cover this case since this is the first endpoint to hit before the GetConfig 
> endpoint.
>
> As Ryan said, if we want to do that we need an alternative plan. I don't have 
> anything concrete, but here is my line of thought:
>
> 1. remove OAuth2 endpoint from the "REST OpenAPI spec"
>
> 2. create a client-side interface (in each language) that different 
> authentication mechanisms can be plugged in to talk to the REST catalog
>
> 3. refactor and make OAuth2 an implementation of that interface. I can also 
> help with doing the same for AWS Sigv4, and the community can further support 
> some additional ones like Kerberos, SAML, Google SSO, etc. based on the 
> individual use cases.
>
> 4. turn 2 + 3 into a "REST catalog authentication spec" that documents about 
> all the supported authentication mechanisms and their defaults. For OAuth2, 
> the default is to have the auth server at the same endpoint as the resource 
> server for backwards compatibility, but that is a configurable property, and 
> we could recommend not to do that based on security concerns.
>
> Best,
> Jack Ye
>
> On Wed, May 29, 2024 at 10:28 AM Steven Wu <stevenz...@gmail.com> wrote:
>>
>> Wondering if the auth endpoints can be separated out to a separate OpenAPI 
>> spec file. Then we still have some reference for interactions with auth 
>> server and make it clear it is not required as part of the REST catalog 
>> server. In most enterprise environments, auth server is likely a separate 
>> server.
>>
>> On Tue, May 28, 2024 at 1:25 PM Alex Dutra <alex.du...@dremio.com.invalid> 
>> wrote:
>>>
>>> Hi,
>>>
>>>>
>>>> On point 4, isn't that possible today, Can't that be achieved with the 
>>>> current token exchange approach, and the internal implementation of the 
>>>> endpoint?
>>>
>>>
>>> Unfortunately, no. Token exchange is not widely adopted yet: for example, 
>>> Keycloak has only partial support for it, and Authelia, or Authentik, have 
>>> no support for it at all.
>>>
>>> This, and a few other technical issues with the current internals of the 
>>> REST client, makes it nearly impossible to achieve a good integration of 
>>> Iceberg REST with the majority of popular OSS authorization servers.
>>>
>>> I am planning to start another email thread to discuss these 
>>> practicalities, but let's first reach consensus on the broader security 
>>> issues voiced here, before we tackle the details.
>>>
>>> Thanks,
>>>
>>> Alex Dutra
>>>
>>> On Tue, May 28, 2024 at 8:41 PM Amogh Jahagirdar <am...@tabular.io> wrote:
>>>>
>>>> I disagree with removing "/v1/oauth/tokens" and I think I also disagree 
>>>> with the premise that implementing that endpoint is required, but I can 
>>>> understand how that's not clear in the spec. I think we can address the 
>>>> required vs non-required discussion with the capabilities PR.
>>>>
>>>> It seems like another part of what's driving this discussion is some 
>>>> concern around how do we enforce REST catalog implementations which do 
>>>> implement this endpoint to make sure that the implementation is secure 
>>>> (for example to avoid the MITM example that was brought up). This is 
>>>> ultimately a runtime detail. To me it seems like if we make it clear that 
>>>> such an endpoint should be implemented respecting OAuth2 standards, and we 
>>>> know that OAuth2 compliance requires avoiding that MITM situation, then 
>>>> runtime implementations should just follow the spec there
>>>>
>>>> >3. Enable flexibility for Iceberg REST servers to opt for other
>>>> authorization mechanisms than OAuth 2.0.
>>>> >4. Enable REST servers to opt for integrating with any standard OAuth2 /
>>>> OIDC provider (e.g. Okta, Keycloak, Authelia).
>>>>
>>>> I agree with both of these points; again I don't think the intention is 
>>>> Oauth2 is the only way, but I think the capabilities PR will make that 
>>>> even more clear.
>>>> On point 4, isn't that possible today, Can't that be achieved with the 
>>>> current token exchange approach, and the internal implementation of the 
>>>> endpoint? Sorry if I missed that explanation.
>>>>
>>>> Thanks,
>>>>
>>>> Amogh Jahagirdar
>>>>
>>>> On Tue, May 28, 2024 at 11:13 AM Yufei Gu <flyrain...@gmail.com> wrote:
>>>>>
>>>>> Not an expert on authentication, but reading from the context, I agree 
>>>>> that it’s not a good practice to use a resource server as a token server. 
>>>>> The resource server would need to securely handle and store credentials 
>>>>> or tokens, increasing the risk of credential theft or leakage. Making the 
>>>>> token endpoint optional will mitigate the issue a bit. But if we want to 
>>>>> disable it completely, it's better to do it now to prevent any issues and 
>>>>> migration costs in the future. Can we have a consensus on it?
>>>>>
>>>>>
>>>>> I would prefer to deprecate it to prevent any intentional and 
>>>>> unintentional misuse. We will also need to change the clients since it 
>>>>> connects to the endpoint by default.
>>>>>
>>>>>
>>>>> Yufei
>>>>>
>>>>>
>>>>> On Tue, May 28, 2024 at 8:27 AM Jack Ye <yezhao...@gmail.com> wrote:
>>>>>>
>>>>>> Sounds like we should try to finalize a consensus around 
>>>>>> https://github.com/apache/iceberg/pull/9940, so that we make it very 
>>>>>> clear what APIs/features are optional.
>>>>>>
>>>>>> -Jack
>>>>>>
>>>>>> On Tue, May 28, 2024 at 7:25 AM Fokko Driesprong <fo...@apache.org> 
>>>>>> wrote:
>>>>>>>
>>>>>>> Hey Robert,
>>>>>>>
>>>>>>> Sorry for the late reply as I was out last week. I'm not an OAuth guru 
>>>>>>> either, but some context from my end.
>>>>>>>
>>>>>>>> * Credentials (for example username/password) must _never_ be sent to
>>>>>>>> the resource server, only to the authorization server.
>>>>>>>
>>>>>>>
>>>>>>> In an earlier discussion, it was agreed that the resource server can 
>>>>>>> also function as the authorization server. But the roles can also be 
>>>>>>> separate.
>>>>>>>
>>>>>>>> 1.2. As long as OAuth2 is the only mechanism supported by the Iceberg
>>>>>>>> client, make the existing client parameter “oauth2-server-uri”
>>>>>>>> mandatory. The Iceberg REST catalog must fail to initialize if the
>>>>>>>> “oauth2-server-uri” parameter is not defined.
>>>>>>>
>>>>>>>
>>>>>>> It can also be that there is no authentication in the case of an 
>>>>>>> internal REST catalog. For example, the iceberg-rest-image that we use 
>>>>>>> for integration tests in PyIceberg.
>>>>>>>
>>>>>>>> We think that Apache Iceberg REST Catalog spec should not mandate that 
>>>>>>>> a
>>>>>>>> catalog implementation responds to requests to produce Auth Tokens
>>>>>>>> (since the REST spec v1 defines a /v1/tokens endpoint, current
>>>>>>>> implementations have to take deliberate actions when responding to 
>>>>>>>> those
>>>>>>>> requests, whether with successful token responses or with “access
>>>>>>>> denied” or “unsupported” responses).
>>>>>>>
>>>>>>> The `/v1/tokens` endpoint is optional.
>>>>>>>
>>>>>>>> * Credentials (for example username/password) must _never_ be sent to
>>>>>>>> the resource server, only to the authorization server.
>>>>>>>
>>>>>>>
>>>>>>> I fully agree!
>>>>>>>
>>>>>>>> Even if an Iceberg REST server does not implement the 
>>>>>>>> ‘/v1/oauth/tokens’
>>>>>>>> endpoint, it can still receive requests to ‘/v1/oauth/tokens’ 
>>>>>>>> containing
>>>>>>>> clear text credentials, if clients are misconfigured (humans do make
>>>>>>>> mistakes) - it’s a non-zero risk - bad actors can implement/intercept
>>>>>>>> that  ‘/v1/oauth/tokens’ endpoint and just wait for misconfigured
>>>>>>>> clients to send credentials.
>>>>>>>
>>>>>>>
>>>>>>> I think the wording is chosen badly. It should not send any 
>>>>>>> credentials, but the code (as in this example by GCS).
>>>>>>>
>>>>>>>> I think Jack makes a good point with AWS SigV4 Authentication. I 
>>>>>>>> suppose, in REST Catalog implementations that support that auth 
>>>>>>>> method, the /v1/oauth/token Catalog REST endpoint is redundant.
>>>>>>>
>>>>>>>
>>>>>>> There are other cloud providers next to AWS.
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Fokko
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Op do 23 mei 2024 om 15:49 schreef Dmitri Bourlatchkov 
>>>>>>> <dmitri.bourlatch...@dremio.com.invalid>:
>>>>>>>>
>>>>>>>> I think Jack makes a good point with AWS SigV4 Authentication. I 
>>>>>>>> suppose, in REST Catalog implementations that support that auth 
>>>>>>>> method, the /v1/oauth/token Catalog REST endpoint is redundant.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Dmitri.
>>>>>>>>
>>>>>>>> On Thu, May 23, 2024 at 9:20 AM Jack Ye <yezhao...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> I do not know enough details about OAuth to make comments about this 
>>>>>>>>> issue, but just regarding the statement "OAuth2 is the only mechanism 
>>>>>>>>> supported by the Iceberg client", AWS Sigv4 auth is also supported, 
>>>>>>>>> at least in the Java client implementation. It would be nice if we 
>>>>>>>>> formalize that in the spec, at least define it as a generic 
>>>>>>>>> authorization header.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Jack Ye
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, May 23, 2024 at 2:51 AM Robert Stupp <sn...@snazy.de> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Iceberg REST implementations, either accessible on the public 
>>>>>>>>>> internet
>>>>>>>>>> or inside an organization, are usually being secured using 
>>>>>>>>>> appropriate
>>>>>>>>>> authorization mechanisms. The Nessie team is looking at implementing 
>>>>>>>>>> the
>>>>>>>>>> Iceberg REST specification and have some questions around the 
>>>>>>>>>> security
>>>>>>>>>> endpoint(s) defined in the spec.
>>>>>>>>>>
>>>>>>>>>> TL;DR we have questions (potentially concerns) about having the
>>>>>>>>>> ‘/v1/oauth/tokens’ endpoint, for the reasons explained below. We 
>>>>>>>>>> think
>>>>>>>>>> that ‘/v1/oauth/tokens’ poses potential security and OAuth2 
>>>>>>>>>> compliance
>>>>>>>>>> issues, and imposes how authorization should be implemented.
>>>>>>>>>> * As an open table format, it would be good for Iceberg to focus on 
>>>>>>>>>> the
>>>>>>>>>> table format / catalog and not how authorization is implemented. The
>>>>>>>>>> existence of an OAuth endpoint pushes implementations to adopt
>>>>>>>>>> authorization using only OAuth, whereas the implementers might choose
>>>>>>>>>> several other ways to implement authorization (e.g. SAML). In our
>>>>>>>>>> opinion the spec should leave it open to the implementation to decide
>>>>>>>>>> how authorization will be implemented.
>>>>>>>>>> * The existence of that endpoint also pushes operators of Iceberg 
>>>>>>>>>> REST
>>>>>>>>>> endpoints into the authorization service business.
>>>>>>>>>> * Clients might expose their clear-text credentials to the wrong
>>>>>>>>>> service, if the (correct) OAuth endpoint is not configured (humans do
>>>>>>>>>> make mistakes).
>>>>>>>>>> * (Naive) Iceberg REST servers may proxy requests received for
>>>>>>>>>> ‘/v1/oauth/tokens’ - and effectively become a “man-in-the-middle”, 
>>>>>>>>>> which
>>>>>>>>>> is not fully compliant with the OAuth 2.0 specification.
>>>>>>>>>>
>>>>>>>>>> Our goals with this discussion are:
>>>>>>>>>> 1. Secure the Iceberg REST specification by preventing accidental
>>>>>>>>>> misuse/misimplementation.
>>>>>>>>>> 2. Prevent that Iceberg REST to get into dictating the “authorization
>>>>>>>>>> server specifics”.
>>>>>>>>>> 3. Enable flexibility for Iceberg REST servers to opt for other
>>>>>>>>>> authorization mechanisms than OAuth 2.0.
>>>>>>>>>> 4. Enable REST servers to opt for integrating with any standard 
>>>>>>>>>> OAuth2 /
>>>>>>>>>> OIDC provider (e.g. Okta, Keycloak, Authelia).
>>>>>>>>>>
>>>>>>>>>> OAuth 2.0 [1] is one of the common standards accepted in the 
>>>>>>>>>> industry.
>>>>>>>>>> It defines a secure mechanism to access resources (here: Iceberg REST
>>>>>>>>>> endpoints). The most important aspect for OAuth 2.0 resources is that
>>>>>>>>>> (Iceberg REST) servers do not (have to) support password 
>>>>>>>>>> authentication,
>>>>>>>>>> especially considering the security weaknesses inherent in passwords.
>>>>>>>>>> Compromised passwords result in compromised data protected by that 
>>>>>>>>>> password.
>>>>>>>>>>
>>>>>>>>>> Therefore OAuth 2.0 defines a set of strict rules. Some of these are:
>>>>>>>>>> * Credentials (for example username/password) must _never_ be sent to
>>>>>>>>>> the resource server, only to the authorization server.
>>>>>>>>>> * OAuth 2.0 refresh tokens must _never_ be sent to the resource 
>>>>>>>>>> server,
>>>>>>>>>> only to the authorization server. (“Unlike access tokens, refresh 
>>>>>>>>>> tokens
>>>>>>>>>> are intended for use only with authorization servers and are never 
>>>>>>>>>> sent
>>>>>>>>>> to resource servers.”, cite from section 1.5 of the OAuth RFC 6749.)
>>>>>>>>>> * While the OAuth RFC states "The authorization server may be the 
>>>>>>>>>> same
>>>>>>>>>> server as the resource server or a separate entity", this should not 
>>>>>>>>>> be
>>>>>>>>>> mandated. i.e the spec should be open to supporting implementations 
>>>>>>>>>> that
>>>>>>>>>> have the authorization server and resource server co-located as well 
>>>>>>>>>> as
>>>>>>>>>> separate.
>>>>>>>>>>
>>>>>>>>>> The Iceberg PR 4771 [2] added the OpenAPI path ‘/v1/oauth/tokens’,
>>>>>>>>>> intentionally marked to “To exchange client credentials (client ID 
>>>>>>>>>> and
>>>>>>>>>> secret) for an access token. This uses the client credentials flow.”
>>>>>>>>>> [3]. Technically: client ID and secret are submitted using a HTTP 
>>>>>>>>>> POST
>>>>>>>>>> request to that Iceberg REST endpoint.
>>>>>>>>>>
>>>>>>>>>> Having ‘/v1/oauth/tokens’ in the Iceberg REST specification can 
>>>>>>>>>> easily
>>>>>>>>>> be seen as a hard requirement. In order to implement this in 
>>>>>>>>>> compliance
>>>>>>>>>> with the OAuth 2.0 spec, that ‘/v1/oauth/tokens’ MUST be the
>>>>>>>>>> authorization server. If users do not (want to) implement an
>>>>>>>>>> authorization server, the only way to implement this 
>>>>>>>>>> ‘/v1/oauth/tokens’
>>>>>>>>>> endpoint would be to proxy ‘/v1/oauth/tokens’ to the actual
>>>>>>>>>> authorization server, which means, that this proxy technically 
>>>>>>>>>> becomes a
>>>>>>>>>> “man in the middle” - knowing both all credentials and all involved 
>>>>>>>>>> tokens.
>>>>>>>>>>
>>>>>>>>>> Even if an Iceberg REST server does not implement the 
>>>>>>>>>> ‘/v1/oauth/tokens’
>>>>>>>>>> endpoint, it can still receive requests to ‘/v1/oauth/tokens’ 
>>>>>>>>>> containing
>>>>>>>>>> clear text credentials, if clients are misconfigured (humans do make
>>>>>>>>>> mistakes) - it’s a non-zero risk - bad actors can implement/intercept
>>>>>>>>>> that  ‘/v1/oauth/tokens’ endpoint and just wait for misconfigured
>>>>>>>>>> clients to send credentials.
>>>>>>>>>>
>>>>>>>>>> Further usages of the REST Catalog API path ‘/v1/oauth/tokens’ are 
>>>>>>>>>> “To
>>>>>>>>>> exchange a client token and an identity token for a more specific 
>>>>>>>>>> access
>>>>>>>>>> token. This uses the token exchange flow.” and “To exchange an access
>>>>>>>>>> token for one with the same claims and a refreshed expiration period
>>>>>>>>>> This uses the token exchange flow.” Both usages should and can be
>>>>>>>>>> implemented differently.
>>>>>>>>>>
>>>>>>>>>> Apache Iceberg, as a table format project, should recommend 
>>>>>>>>>> protecting
>>>>>>>>>> sensitive information. But Iceberg should not mandate _how_ that
>>>>>>>>>> protection is implemented - but the Iceberg REST specification does
>>>>>>>>>> effectively mandate OAuth 2.0, because other Iceberg REST endpoints 
>>>>>>>>>> do
>>>>>>>>>> refer/require OAuth 2.0 specifics. Users that want to use other
>>>>>>>>>> mechanisms, because they are forced to do so by their organization,
>>>>>>>>>> would be locked out of Iceberg REST.
>>>>>>>>>>
>>>>>>>>>> Apache Iceberg should not mandate OAuth 2.0 as the only option - for 
>>>>>>>>>> the
>>>>>>>>>> sake of openness for the project and flexibility for the server
>>>>>>>>>> implementations.
>>>>>>>>>>
>>>>>>>>>> We think that Apache Iceberg REST Catalog spec should not mandate 
>>>>>>>>>> that a
>>>>>>>>>> catalog implementation responds to requests to produce Auth Tokens
>>>>>>>>>> (since the REST spec v1 defines a /v1/tokens endpoint, current
>>>>>>>>>> implementations have to take deliberate actions when responding to 
>>>>>>>>>> those
>>>>>>>>>> requests, whether with successful token responses or with “access
>>>>>>>>>> denied” or “unsupported” responses).
>>>>>>>>>>
>>>>>>>>>> We propose the following actions:
>>>>>>>>>> 1. Immediate mitigation:
>>>>>>>>>> 1.1. Remove the ‘/v1/oauth/tokens’ endpoint entirely from the 
>>>>>>>>>> Iceberg’s
>>>>>>>>>> OpenAPI spec w/o replacement.
>>>>>>>>>> 1.2. As long as OAuth2 is the only mechanism supported by the Iceberg
>>>>>>>>>> client, make the existing client parameter “oauth2-server-uri”
>>>>>>>>>> mandatory. The Iceberg REST catalog must fail to initialize if the
>>>>>>>>>> “oauth2-server-uri” parameter is not defined.
>>>>>>>>>> 1.3. Remove all fallbacks to the ‘/v1/oauth/tokens’ endpoint.
>>>>>>>>>> 1.4. Forbid or discourage the communication of tokens from any 
>>>>>>>>>> Iceberg
>>>>>>>>>> REST Catalog endpoint, both via the "token" property or with any of 
>>>>>>>>>> the
>>>>>>>>>> "urn:ietf:params:oauth:token-type:*" properties.
>>>>>>>>>> 2. As a follow up: We’d propose a couple of implementation fixes and
>>>>>>>>>> changes and test improvements.
>>>>>>>>>> 3. As a follow up: Define a discovery mechanism for both the Iceberg
>>>>>>>>>> REST base URI and OAuth 2.0 endpoints/discovery, which allows users 
>>>>>>>>>> to
>>>>>>>>>> use a single URI to securely access Iceberg REST endpoints.
>>>>>>>>>> 4. As a follow up: Not new, but we also want to improve the Iceberg 
>>>>>>>>>> REST
>>>>>>>>>> specification via the “new” REST proposal.
>>>>>>>>>>
>>>>>>>>>> We do not think that adding recommendations to inline-documentation 
>>>>>>>>>> is
>>>>>>>>>> enough to fully mitigate the above concerns.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> References:
>>>>>>>>>>
>>>>>>>>>> [1] RFC 6749 - The OAuth 2.0 Authorization Framework,
>>>>>>>>>> https://datatracker.ietf.org/doc/html/rfc6749
>>>>>>>>>> [2] Iceberg pull request 4771 - Core: Add OAuth2 to REST catalog 
>>>>>>>>>> spec -
>>>>>>>>>> https://github.com/apache/iceberg/pull/4771
>>>>>>>>>> [3] Iceberg pull request 4843 - Spec: Add more context about OAuth2 
>>>>>>>>>> to
>>>>>>>>>> the REST spec - https://github.com/apache/iceberg/pull/4843
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Robert Stupp
>>>>>>>>>> @snazy
>>>>>>>>>>

Reply via email to