Hey Jack,

I'm not sure I agree with the framing of this argument.  The REST Spec
defines a protocol, not an implementation.

An implementation of the spec can either be compliant or not.  So a REST
implementation that adheres to all the requirements (atomic location swap,
JSON representation, etc.) would be compliant.  There's no requirement
around who performs these operations, and with REST that is delegated to
the server.  The metadata location being optional doesn't mean that there
isn't a metadata location, just that it may not be exposed directly in the
response.
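
To make the point concrete, here is a minimal sketch of a server that still
writes a JSON metadata file and atomically swaps the pointer to it, while
leaving metadata-location out of the load response.  Everything here is
hypothetical and in-memory (SketchCatalogServer, CommitConflict, the mem://
paths); it is not Iceberg code.  The commit call is also simplified to a
compare against the base location rather than the requirements/updates
model the REST spec uses.

```python
import json
import threading
import uuid


class CommitConflict(Exception):
    """Raised when the table pointer moved since the caller last read it."""


class SketchCatalogServer:
    """Hypothetical in-memory stand-in for a REST catalog server."""

    def __init__(self, expose_location=False):
        self._lock = threading.Lock()
        self._files = {}     # stands in for object storage: location -> JSON
        self._pointers = {}  # table name -> current metadata location
        self._expose_location = expose_location

    def _write_metadata(self, table, metadata):
        # Each version gets a new, never-overwritten "file".
        location = f"mem://{table}/metadata/{uuid.uuid4()}.metadata.json"
        self._files[location] = json.dumps(metadata)
        return location

    def create_table(self, table, metadata):
        location = self._write_metadata(table, metadata)
        with self._lock:
            if table in self._pointers:
                raise CommitConflict(f"{table} already exists")
            self._pointers[table] = location
        return location

    def commit(self, table, base_location, new_metadata):
        # Write the new metadata file first, then swap the pointer only if
        # it still points at the version the caller based its changes on.
        new_location = self._write_metadata(table, new_metadata)
        with self._lock:
            if self._pointers.get(table) != base_location:
                raise CommitConflict(f"{table} changed since {base_location}")
            self._pointers[table] = new_location
        return new_location

    def load_table(self, table):
        with self._lock:
            location = self._pointers[table]
        response = {"metadata": json.loads(self._files[location])}
        if self._expose_location:
            # Optional in the response; the file exists either way.
            response["metadata-location"] = location
        return response
```

The swap and the file both exist on the server side regardless of whether
the client ever sees the location, which is the distinction I'm drawing.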

Therefore, an implementation that just stores the table metadata in a
Glue Table object would not currently be compliant.

We've periodically discussed removing the storage requirement, and I think
there's a path forward to do that.  I would also agree with standardizing
on REST, but I wouldn't say the justification for making this push is that
REST is not compliant, so we can just ignore the table spec requirements.

There are a few more things to consider: not everything can use REST
currently, and making a hard cut away from file-based metadata could
bifurcate access to Iceberg data.  There are also aspects of the spec
that reference the metadata paths (like the metadata log, though it's
optional) that would likely need to be addressed.

-Dan



On Thu, Feb 29, 2024 at 11:13 AM Jack Ye <yezhao...@gmail.com> wrote:

> Hi everyone,
>
> Just want to pull this specific topic out of the materialized view
> discussion thread. I noticed this during the MV discussion, and I think it
> is important to clarify this not just for the MV topic, but also for the
> ongoing discussion to consolidate all the different catalogs.
>
> *How the table/view spec defines Iceberg table/view*
>
> If we look into the table/view spec, the optimistic concurrency section
> <https://iceberg.apache.org/spec/#optimistic-concurrency> requires the
> existence of a metadata file, and the atomic swap of the metadata file
> ensures serializable isolation. This implies 2 things:
> 1. there is a metadata file in storage that holds the information
> described in the rest of the spec.
> 2. there is an object in a catalog that holds the pointer of the metadata
> file. What object and what catalog is implementation dependent, but these
> generalized concepts are always intact.
>
> The JSON serialization parts of the spec plus the reader requirements also
> imply that the metadata file is in JSON format.
>
> So when we talk about an Iceberg table/view that is compliant with the
> spec, it is the combination of all these 5 requirements:
> 1. there is an object in the catalog representing this table/view
> 2. there is a pointer to a JSON metadata file in the object
> 3. the JSON metadata file exists in storage and contains the table/view
> metadata content
> 4. the metadata content is compliant with the standard described in the
> spec
> 5. serializable isolation is achieved by atomic swap of the object pointer
>
> *How non-REST catalogs are compliant with the table/view spec*
>
> An implementation of the Iceberg table/view is essentially specifying:
> 1. what is the exact implementation of the catalog, e.g. JDBC, Hive
> metastore (HMS), Glue, etc.
> 2. what is the object that represents a table, e.g. a row in the
> "iceberg_tables" table in JDBC, a Table object in HMS/Glue, etc.
> 3. how is the JSON metadata file pointer stored, e.g. a column in the
> table's row in JDBC, metadata_location key in the Table's parameter map in
> HMS/Glue, etc.
> 4. how the atomic swap is implemented, e.g. SQL atomic update in JDBC,
> conditional parameter update in HMS, conditional version update in Glue,
> etc.
>
> *How the REST spec is NOT compliant with the table/view spec*
>
> The REST spec technically does not match the following table/view spec
> requirements:
> 2. there is a pointer to a JSON metadata file in the object
> 3. the JSON metadata file exists in storage and contains the table/view
> metadata content
> 5. serializable isolation is achieved by atomic swap of the object pointer
>
> The key parts in REST spec that are not compliant are:
> 1. metadata-location field is optional in LoadTableResponse
> <https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L2721-L2728>
> 2. pointer swap is not enforced in the UpdateTable
> <https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L658>
> operation
>
> Therefore, it opens the door for a REST service to be completely
> independent of a JSON metadata file, store the Iceberg table/view metadata
> not as a file, and achieve much better performance characteristics than
> other catalogs. This technically gives REST catalog adopters a unique
> advantage that is not there for non-REST catalogs like HMS and Glue.
>
> *How can we fix this?*
>
> I suggest the following:
>
> Firstly, I think it is good that we try to remove the requirements of JSON
> metadata file pointer and atomic pointer swap. We know these requirements
> have perf limitations based on production usage, especially when the
> metadata file is large. If that is the direction, we should make it
> official by changing the table/view spec to say that those requirements are
> catalog level implementation details that are no longer required.
>
> Secondly, once we do that, we should declare REST spec as the official
> catalog spec to interact with Iceberg tables. Otherwise at least I will be
> very tempted to just break the atomic pointer swap pattern and store the
> entire metadata using the Glue Table object to achieve much better
> performance and also Glue native feature integrations, and I think other
> players will be equally motivated to do something similar. That will lead
> to even more chaos in the Iceberg catalog space.
>
> With REST spec as the official catalog spec, we can actually support
> non-REST catalogs by using the HTTP execution chain handler. Dan has
> already done a prototype here
> <https://github.com/apache/iceberg/commit/619127ff69f89e43a1edef2ea94c3dd439396a8d#diff-869264a83ba9ca657e7defefaa16ad196b0de9fce6c87f97533db77f29e44762>
> that is based on this discussion
> <https://github.com/apache/iceberg/pull/8091#issuecomment-1647189146> in
> the past about using AWS Lambda as an alternative HTTP client for REST
> catalog. The same approach can be used to talk to HMS/Glue/JDBC/... while
> users will only interact with the RESTCatalog as the entry point.
>
> I think this can provide a good path forward overall for the catalog
> consolidation story, interested to know what others think.
>
> Best,
> Jack Ye
>
>
