Hi all,

We have two copies of the Iceberg REST specification in the Polaris code base:

* spec/iceberg-rest-catalog-open-api.yaml [1] and
* spec/polaris-catalog-service.yaml [2] - via [5] [6] [7] [8]

'iceberg-rest-catalog-open-api.yaml' is declared to be a 1:1 copy of Iceberg's 1.7.1 'open-api/rest-catalog-open-api.yaml' [3], but in fact it has already diverged beyond just code formatting.

'polaris-catalog-service.yaml' is based on an undocumented version of Iceberg's 'open-api/rest-catalog-open-api.yaml'. It differs considerably from that file, yet has hard dependencies on 'iceberg-rest-catalog-open-api.yaml'.


It is quite unclear, especially for (new) users, how those relate to each other.

While I think that Polaris definitely needs its own set of APIs for its genuine functionality, there should be a _clear_ separation from Iceberg's endpoints and Iceberg's types - both in the OpenAPI specs and in the endpoint path prefixes.

There is literally no guarantee that changes to Iceberg's OpenAPI spec will work in/via 'polaris-catalog-service.yaml'. Even the "most innocent" and non-breaking change to Iceberg's OpenAPI spec may fundamentally break Polaris's OpenAPI spec. This is a latent risk - and I think it is a quite serious risk.


There are also a few issues that have been copied from Iceberg's spec into Polaris's polaris-catalog-service.yaml:

1. The '/v1/oauth/tokens' endpoint is already deprecated for removal
2. The path-encoding of namespaces and table-identifiers is already known to be incompatible with the current version 6.0 of the Servlet Specification, especially "3.5.2. URI Path Canonicalization" point 10 ("Rejecting Suspicious Sequences") [4]
3. The 'endpoints' array in Iceberg's 'CatalogConfig' type isn't portable to all use cases. (Besides, it's unnecessarily verbose IMHO.)
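
To illustrate item 2, here is a minimal sketch of how a multi-level namespace ends up in a request path. It assumes the Iceberg REST convention of joining namespace levels with the unit separator (0x1F) and percent-encoding the result. The helper name, the 'my_prefix' prefix and the example namespace are made up for illustration. Which exact canonicalization rule from [4] a given servlet container applies to such a path is not shown here.

```python
from urllib.parse import quote

# Assumption: namespace levels are joined with the ASCII unit separator (0x1F).
UNIT_SEPARATOR = "\x1f"


def encode_namespace(levels):
    """Join namespace levels and percent-encode them as one path segment.

    Hypothetical helper, not taken from the Polaris or Iceberg code base.
    """
    return quote(UNIT_SEPARATOR.join(levels), safe="")


# Hypothetical prefix and namespace, for illustration only.
path = "/v1/my_prefix/namespaces/" + encode_namespace(["accounting", "tax"])
print(path)  # the control character shows up as %1F inside the path segment
```

The point is only that the path segment carries a percent-encoded control character, which is the kind of sequence a path canonicalization step may treat as suspicious.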


Apache Polaris does not own Apache Iceberg's OpenAPI spec. Iceberg is completely independent in how it shapes that spec.

Nobody knows how v1 will evolve nor what v2 of Iceberg's OpenAPI spec will look like. It is a big mistake and a serious risk to assume that there will never be a change in Iceberg's OpenAPI spec that seriously affects or even breaks Polaris, or that forces a lot of "backwards compatibility constructs" onto it.

Another POV is that Polaris's OpenAPI spec may not focus only on Iceberg but also cover other table formats. Mixing other table formats with the Iceberg specs is confusing at best.

There's no guarantee that new endpoints/types added to Iceberg's OpenAPI spec will not conflict with Polaris's endpoints/types.
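
As a rough sketch of what such a conflict looks like, the snippet below compares two already-parsed OpenAPI documents (plain dicts) and reports paths and schema names that both define with different content. The function name and the tiny sample specs are hypothetical and heavily simplified. They are not the real Iceberg or Polaris definitions.

```python
def find_conflicts(upstream, local):
    """Return (paths, types) that both specs define with differing content."""

    def differing(section_a, section_b):
        shared = section_a.keys() & section_b.keys()
        return sorted(name for name in shared if section_a[name] != section_b[name])

    paths = differing(upstream.get("paths", {}), local.get("paths", {}))
    types = differing(
        upstream.get("components", {}).get("schemas", {}),
        local.get("components", {}).get("schemas", {}),
    )
    return paths, types


# Hypothetical, simplified stand-ins for the two specs.
upstream = {
    "paths": {"/v1/config": {"get": {"operationId": "getConfig"}}},
    "components": {
        "schemas": {
            "CatalogConfig": {"type": "object", "required": ["defaults", "overrides"]},
        }
    },
}
local = {
    "paths": {"/v1/config": {"get": {"operationId": "getConfig"}}},
    "components": {
        "schemas": {
            "CatalogConfig": {"type": "object", "required": ["defaults"]},
        }
    },
}

paths, types = find_conflicts(upstream, local)
print(paths, types)  # [] ['CatalogConfig']
```

A check like this only detects name clashes after the fact. It does not remove the underlying coupling between the two specs.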

While it's faster and easier to just rely on the _current_ version of the Iceberg OpenAPI spec, it will cause a lot of unnecessary work in the near-ish future.


I propose to let Polaris have its own and **completely** independent OpenAPI spec and replace 'polaris-catalog-service.yaml' to ensure that Polaris's OpenAPI spec can never be broken by any Iceberg OpenAPI change.


Robert


[1] https://github.com/apache/polaris/blob/d33454874f69f952da2dfb301d0330d6e8d2296e/spec/iceberg-rest-catalog-open-api.yaml
[2] https://github.com/apache/polaris/blob/d33454874f69f952da2dfb301d0330d6e8d2296e/spec/polaris-catalog-service.yaml
[3] https://github.com/apache/iceberg/blob/apache-iceberg-1.7.1/open-api/rest-catalog-open-api.yaml
[4] https://jakarta.ee/specifications/servlet/6.0/jakarta-servlet-spec-6.0#uri-path-canonicalization
[5] https://github.com/apache/polaris/pull/906
[6] https://github.com/apache/polaris/pull/936
[7] https://github.com/apache/polaris/pull/808
[8] https://github.com/apache/polaris/pull/1150

--
Robert Stupp
@snazy
