Thanks Robert for bringing this up!

I see we mentioned two problems in this thread:
1) polaris-catalog-service.yaml contains reference to all catalog APIs
including the ones for Iceberg and our Polaris specific APIs,
     which could be confusing to the users or new developers.
2) Polaris specific APIs also depend on the definition of
iceberg-rest-catalog-open-api.yaml. Since those are our own APIs,
     shall we completely separate the definition with Iceberg spec?

For 1, although we can say polaris-catalog-service.yaml contains
references to all catalog APIs, I can also see the point about *confusion*.
I think it makes sense for us to have different xxx-service.yaml, so that
the purpose for different services can be clearly separated. For example,
polaris-iceberg-catalog-service.yaml to contain reference to all Iceberg
APIs with prefix catalog/api/v1, and polaris-catalog-service.yaml
contains reference to all Polaris specific APIs with prefix
/catalog/api/polaris/v1. Jonas have a doc
<https://docs.google.com/document/d/1FAUnsfozCTca8CMwq04SHES1l2NwoiiXPlsS68s7Zh0/edit?tab=t.0#heading=h.ihz4p9n6lxnk>
to
talk about the restructuring

For 2, Since Polaris is Iceberg compatible, especially the concept of
namespace is reused across the whole Polaris service, I think it makes
sense for us to reuse some Iceberg definitions, including namespace,
prefix, and basic error definition. For API specific requests and
response definition, I don't think it is a good idea to reuse them.
If one day Iceberg changed the definition for things like namespace, and if
Polaris wants to upgrade and provide compatibility
with this new version, then I think we need to update this concept across
the whole service. Unless Polaris decide to introduce two
different concepts of namespace and diverge with Iceberg, which i don't
think it is the case at this moment.


Best Regards,
Yun






On Tue, Mar 18, 2025 at 7:11 AM Robert Stupp <sn...@snazy.de> wrote:

> Hi all,
>
> We have two copies of the Iceberg REST specification in the Polaris code
> base:
>
> * spec/iceberg-rest-catalog-open-api.yaml [1] and
> * spec/polaris-catalog-service.yaml [2] - via [5] [6] [7] [8]
>
> 'iceberg-rest-catalog-open-api.yaml' is declared to be a 1:1 copy of
> Iceberg's v.1.7.1 'open-api/rest-catalog-open-api.yaml', but in fact it
> has already diverged beyond just code formatting.
>
> 'polaris-catalog-service.yaml' is based on a non-documented version of
> Iceberg's 'open-api/rest-catalog-open-api.yaml' and is quite different,
> but has hard dependencies on 'iceberg-rest-catalog-open-api.yaml'.
>
>
> It is quite unclear, especially for (new) users, how those relate to
> each other.
>
> While I think that Polaris definitely needs it's own set of APIs for
> it's genuine functionality, there should be a _clear_ separation from
> Iceberg's endpoints and Iceberg's types - both in the OpenAPI specs and
> in the endpoint path prefixes.
>
> There is literally no guarantee that changes to Iceberg's OpenAPI spec
> will work in/via 'polaris-catalog-service.yaml'. Even the "most
> innocent" and non-breaking change to Iceberg's OpenAPI spec may
> fundamentally break Polaris's OpenAPI spec. This is a latent risk - and
> I think it is a quite serious risk.
>
>
> There are also a couple of issues that have been copied to Polaris's
> polaris-catalog-service.yaml:
>
> 1. The '/v1/oauth/tokens' endpoint is already deprecated for removal
> 2. The path-encoding of namespaces and table-identifiers is already
> known to be incompatible with the current version 6.0 of the Servlet
> Specification, especially "3.5.2. URI Path Canonicalization" point 10
> ("Rejecting Suspicious Sequences") [4]
> 3. The 'endpoints' array in Iceberg's 'CatalogConfig' type isn't
> portable to all use cases. (Despite that it's unnecessarily overly
> verbose IMHO.)
>
>
> Apache Polaris does not own Apache Iceberg's OpenAPI spec. Iceberg is
> completely independent on how it shapes that spec.
>
> Nobody knows how v1 will evolve nor how v2 of Iceberg's OpenAPI spec
> will look like. It is a big mistake and serious risk to assume that
> there will never be a change in Iceberg's OpenAPI spec that will not
> seriously affect or even break Polaris or introduce a lot of "backwards
> compatibility constructs".
>
> Another POV is that Polaris's OpenAPI spec does not only focus on
> Iceberg but maybe also other table formats. Mixing other table formats
> with the Iceberg specs is at least confusing.
>
> There's no guarantee that new endpoints/types added Iceberg's OpenAPI
> spec will not conflict with Polaris's endpoints/types.
>
> While it's faster and easier to just rely on the _current_ version of
> the Iceberg OpenAPI spec, it will cause a lot of unnecessary work in the
> near-ish future.
>
>
> I propose to let Polaris have it's own and **completely** independent
> OpenAPI spec and replace 'polaris-catalog-service.yaml' to ensure that
> Polaris's OpenAPI spec can never be broken by any Iceberg OpenAPI change.
>
>
> Robert
>
>
> [1]
>
> https://github.com/apache/polaris/blob/d33454874f69f952da2dfb301d0330d6e8d2296e/spec/iceberg-rest-catalog-open-api.yaml
> [2]
>
> https://github.com/apache/polaris/blob/d33454874f69f952da2dfb301d0330d6e8d2296e/spec/polaris-catalog-service.yaml
> [3]
>
> https://github.com/apache/iceberg/blob/apache-iceberg-1.7.1/open-api/rest-catalog-open-api.yaml
> [4]
>
> https://jakarta.ee/specifications/servlet/6.0/jakarta-servlet-spec-6.0#uri-path-canonicalization
> [5] https://github.com/apache/polaris/pull/906
> [6] https://github.com/apache/polaris/pull/936
> [7] https://github.com/apache/polaris/pull/808
> [8] https://github.com/apache/polaris/pull/1150
>
> --
> Robert Stupp
> @snazy
>
>

Reply via email to