I'd like to make a slightly different point regarding metadata files.

Currently the table spec does require that metadata be stored in a "file";
however, there is no way to discover that file outside of a Catalog. In
other words, a client that is operating purely at the file level has no way
of determining which metadata file is "current" or whether some
metadata file that is visible represents a valid (committed) state of the
table.
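
To illustrate the point, here is a minimal sketch, assuming a toy
in-memory catalog. PointerCatalog, the table name "db.t" and the file names
are all hypothetical and are not Iceberg's API or naming scheme. Two writers
each put a candidate metadata file into storage, but only one pointer swap
succeeds, so a storage listing alone cannot tell which file is current.

```python
import threading


class PointerCatalog:
    """Hypothetical in-memory catalog: table name -> current metadata file."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = {}

    def current(self, table):
        return self._current.get(table)

    def swap(self, table, expected, new_location):
        """Atomically move the pointer only if it still equals `expected`."""
        with self._lock:
            if self._current.get(table) != expected:
                return False  # another writer committed first
            self._current[table] = new_location
            return True


catalog = PointerCatalog()
storage = []  # stands in for a listing of the table's metadata directory

# Initial commit: write the file, then point the catalog at it.
storage.append("metadata/v1.metadata.json")
catalog.swap("db.t", None, "metadata/v1.metadata.json")

# Two writers start from v1 and each writes a candidate file to storage.
base = catalog.current("db.t")
storage.append("metadata/v2-a.metadata.json")
storage.append("metadata/v2-b.metadata.json")

a_committed = catalog.swap("db.t", base, "metadata/v2-a.metadata.json")
b_committed = catalog.swap("db.t", base, "metadata/v2-b.metadata.json")

# All three files are visible in storage; only the catalog knows which one
# is current and that v2-b was never committed.
print(sorted(storage))
print(catalog.current("db.t"), a_committed, b_committed)
```

A file-level client sees v1, v2-a and v2-b side by side; the fact that v2-b
is an orphan lives only in the catalog.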

The REST Catalog API returns the actual table metadata JSON, not the
location of a metadata file.

I think / agree it would be worth detaching Iceberg table format concepts
from the storage details. Perhaps we could consider the concept of
exporting a Catalog to a set of files, which would then specify how files
are laid out (cross-referenced). At runtime the Catalog may choose to
export every change, or do it periodically, or on user request, etc.

Cheers,
Dmitri.

On Thu, Feb 29, 2024 at 6:39 PM Jack Ye <yezhao...@gmail.com> wrote:

> > For example, I cannot validate the atomic behaviors Glue claims, but I
> wouldn't assert that it is non-compliant because of that.
>
> I think these are not comparable claims because the API scope is
> completely different, but I don't think it's worth arguing in depth. Let's
> try to see if we can have some consensus.
>
> Based on what you said above, do you agree with the following 3 points?
>
> 1. Today, a table/view in any catalog including a REST spec-compatible
> catalog is an Iceberg table/view if and only if it points to a JSON
> metadata file in storage. This concept is a part of the Iceberg table/view
> spec. There is a debate to be had about whether we want to remove this
> requirement. The argument for removing it (as Yufei said) is to use other
> storage for better performance. The argument against (as Amogh said) is to
> keep Iceberg open source friendly through the JSON format.
>
> 2. Today, a table/view in any catalog including a REST spec-compatible
> catalog is an Iceberg table/view if and only if, behind the scenes, it
> performs the atomic metadata file swap for every commit. This concept is a
> part of the Iceberg table/view spec. We should consider removing this
> requirement in the Iceberg table/view spec.
>
> 3. A table/view in an Iceberg REST spec-compatible catalog may or may not
> be an Iceberg table/view. The REST spec does not enforce this, and this
> stance will remain true going forward. For example, it could use the
> Iceberg table/view metadata structure but not store the metadata in a
> JSON file, or not use the metadata file swap commit procedure, or both, and
> in those cases it is not an Iceberg table/view. More extremely, it might be
> a totally different kind of table that is only surfaced through the REST
> models.
>
> -Jack
>
> On Thu, Feb 29, 2024 at 2:13 PM Daniel Weeks <daniel.c.we...@gmail.com>
> wrote:
>
>> > In that case are tables in a REST-compliant catalog still an Iceberg
>> table? I don't think so, because it is a table that only partially follows
>> the Iceberg table spec.
>>
>> If the catalog is REST compliant and complies with the Iceberg spec, they
>> are still Iceberg tables.  I can see there is an argument that if the
>> catalog is REST compliant but does not follow the commit requirements (or
>> aspects of the Iceberg spec), that you cannot call those Iceberg tables.
>> But the assertion that Iceberg tables in a REST catalog are de facto
>> non-compliant is incorrect.
>>
>> > I like the idea about validation for format compliance. But don't think
>> you can technically validate this. You can validate the static table to see
>> if it has all the Iceberg metadata components, but you can not validate the
>> internal behavior of the service during a commit to see if it really
>> atomically swapped a metadata file.
>>
>> Just because you cannot see/validate the implementation doesn't mean that
>> it is non-compliant.  For example, I cannot validate the atomic behaviors
>> Glue claims, but I wouldn't assert that it is non-compliant because of that.
>>
>> I do think there is a discussion to be had about if/when we might adjust
>> the storage/swap requirements, but to reinforce Amogh's point, removing
>> those requirements would impact the openness and accessibility of Iceberg,
>> which I feel would hamper adoption.
>>
>> -Dan
>>
>>
>>
>> On Thu, Feb 29, 2024 at 1:53 PM Yufei Gu <flyrain...@gmail.com> wrote:
>>
>>>> We've periodically discussed removing the storage requirement and I
>>>> think there's a path forward to do that and would agree that standardizing
>>>> on REST, but I wouldn't say the justification for making this push is that
>>>> REST is not compliant so we can just ignore the table spec requirements.
>>>> There are a few more things to consider, which is that not everything
>>>> can use REST currently and making a hard cut away from file based metadata
>>>> could bifurcate access to Iceberg data.  There are also aspects to the spec
>>>> that reference the metadata paths (like metadata log, though it's
>>>> optional), but would likely need to be addressed.
>>>
>>>
>>> This is a bit off-topic. It makes sense to me to remove the storage
>>> requirement moving forward. The metadata.json file isn't necessary in the
>>> REST catalog. For example, the REST catalog may not have permission to
>>> write to the table owner's storage. It can still save the metadata as a
>>> file, of course, but that doesn't quite make sense. Putting it in a
>>> key-value store or an RDBMS could be a better option.
>>>
>>> Given that we are going to remove the storage requirement, should we
>>> avoid the file path in the current design for things like the view spec? A
>>> solution like table identifier + version UUID may serve the purpose.
>>>
>>> Yufei
>>>
>>>
>>> On Thu, Feb 29, 2024 at 1:29 PM Jack Ye <yezhao...@gmail.com> wrote:
>>>
>>>> > There's no exemption that says if you're using REST you don't need to
>>>> follow the spec.  Why do you think that's the case?
>>>>
>>>> In that case are tables in a REST-compliant catalog still an Iceberg
>>>> table? I don't think so, because it is a table that only partially follows
>>>> the Iceberg table spec.
>>>>
>>>> I like the idea about validation for format compliance. But don't think
>>>> you can technically validate this. You can validate the static table to see
>>>> if it has all the Iceberg metadata components, but you can not validate the
>>>> internal behavior of the service during a commit to see if it really
>>>> atomically swapped a metadata file.
>>>>
>>>> So I think at minimum we should update the table/view spec to remove
>>>> the metadata file swap requirement. The Iceberg table/view spec should be a
>>>> pure format spec that specifies how the file is laid out in storage.
>>>>
>>>> -Jack
>>>>
>>>> On Thu, Feb 29, 2024 at 1:22 PM Amogh Jahagirdar <am...@tabular.io>
>>>> wrote:
>>>>
>>>>> I want to echo Dan's point that just because there is a separate spec
>>>>> for a REST Catalog does not mean that implementations can deviate from the
>>>>> spec's definition of the commit protocol or metadata layout, and still be
>>>>> considered "spec compliant".
>>>>>
>>>>> > Secondly, once we do that, we should declare REST spec as the
>>>>> official catalog spec to interact with Iceberg tables. Otherwise at
>>>>> least I will be very tempted to just break the atomic pointer swap
>>>>> pattern and store the entire metadata using the Glue Table object to
>>>>> achieve much better performance and also Glue native feature
>>>>> integrations, and I think other players will be equally motivated to do
>>>>> something similar. That will lead to even more chaos in the Iceberg
>>>>> catalog space.
>>>>>
>>>>> On this, a second point I want to make is around the openness of this
>>>>> ecosystem. We all already know that openness (the file formats, the
>>>>> metadata layout, the spec itself) is a fundamental tenet of the project.
>>>>> If we take the provided example of removing the metadata JSON file and
>>>>> moving it to some other storage, I think that goes against this principle
>>>>> since a JSON file is quite open by definition. Going back to the first
>>>>> point, I think a catalog which has such a behavior would *not* be
>>>>> considered spec compliant. Another reason this is important: what's
>>>>> healthiest for all users of Iceberg is to have a healthy list of
>>>>> options for catalog choices. Storing the metadata JSON in non-open ways
>>>>> can make users' lives harder when trying out new catalogs, since the
>>>>> metadata would be stored in each catalog's own way and the users would
>>>>> have a harder time accessing their own data.
>>>>>
>>>>> A last point I'd like to make is I think there's a good discussion to
>>>>> be had on how we validate that a REST Catalog implementation is spec
>>>>> compliant. I think that's really beneficial for the ecosystem as a whole.
>>>>> Before that, though, I think we'd first want to conclude on this topic
>>>>> itself.
>>>>>
>>>>> On Thu, Feb 29, 2024 at 12:29 PM Daniel Weeks <
>>>>> daniel.c.we...@gmail.com> wrote:
>>>>>
>>>>>> > REST spec-compliant catalog does not need to follow the Iceberg
>>>>>> spec to commit or store metadata
>>>>>>
>>>>>> If the REST implementation doesn't follow the Iceberg spec for commit
>>>>>> requirements, it's not compliant with the spec.  There's no exemption
>>>>>> that says if you're using REST you don't need to follow the spec.  Why
>>>>>> do you think that's the case?
>>>>>>
>>>>>> I don't believe there's a reason to say that the REST spec needs to
>>>>>> enforce the commit requirements either; that's a requirement of the
>>>>>> Iceberg spec and still needs to be complied with.
>>>>>>
>>>>>> -Dan
>>>>>>
>>>>>> On Thu, Feb 29, 2024 at 12:19 PM Jack Ye <yezhao...@gmail.com> wrote:
>>>>>>
>>>>>>> > The implementation of the spec can either be compliant or not.
>>>>>>>
>>>>>>> This is exactly the problem we are talking about right? Just to give
>>>>>>> an example, we cannot technically say that tables/views in the Tabular
>>>>>>> catalog are Iceberg tables/views, because a REST spec-compliant catalog
>>>>>>> does not need to follow the Iceberg spec to commit or store metadata.
>>>>>>> Even if you say it is, there is no way to really prove that, because
>>>>>>> the REST spec does not enforce it.
>>>>>>>
>>>>>>> JB, what do you mean by participating on the Catalog RFC? Is there
>>>>>>> already an ongoing RFC?
>>>>>>>
>>>>>>> -Jack
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 29, 2024 at 12:08 PM Jean-Baptiste Onofré <
>>>>>>> j...@nanthrax.net> wrote:
>>>>>>>
>>>>>>>> Hi Dan,
>>>>>>>>
>>>>>>>> I agree with your statement that the REST Spec is not an
>>>>>>>> implementation, but I strongly disagree with your statement "impl of
>>>>>>>> the spec can either be compliant or not".
>>>>>>>>
>>>>>>>> The REST Catalog spec impl should be consistent with the REST Spec.
>>>>>>>> That's why a reference implementation in Iceberg would be a must,
>>>>>>>> with a TCK.
>>>>>>>>
>>>>>>>> The REST Spec should bridge/give access to Table/View metadata. I
>>>>>>>> think it would make sense to have a resource to GET the Table/View
>>>>>>>> metadata, also supporting PUT to update.
>>>>>>>> JSON Schema and eventually JSON-RPC could help in some areas here
>>>>>>>> (compliant with OpenAPI).
>>>>>>>>
>>>>>>>> In another thread, I propose to work on a Catalog RFC, exactly to
>>>>>>>> target this. I think it would make sense to have the REST/Catalog RFC
>>>>>>>> as the main catalog API, so it has to be both consistent (giving
>>>>>>>> access to table/view metadata) and extensible (via OpenAPI Extensions
>>>>>>>> for instance).
>>>>>>>>
>>>>>>>> So, I agree with Jack: the minimum would be to have JSON metadata
>>>>>>>> exposed by the REST Spec.
>>>>>>>>
>>>>>>>> @Jack, short term I'm in favor of your proposal, long term, I
>>>>>>>> propose to participate on the Catalog RFC (REST Spec). WDYT ?
>>>>>>>>
>>>>>>>> Thanks !
>>>>>>>> Regards
>>>>>>>> JB
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Feb 29, 2024 at 20:47, Daniel Weeks <
>>>>>>>> daniel.c.we...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hey Jack,
>>>>>>>>>
>>>>>>>>> I'm not sure I agree with the framing of this argument.  The REST
>>>>>>>>> Spec defines a protocol, not an implementation.
>>>>>>>>>
>>>>>>>>> The implementation of the spec can either be compliant or not.  So
>>>>>>>>> a REST Implementation that adheres to all the requirements (atomic
>>>>>>>>> location swap, json representation, etc.), would be compliant.
>>>>>>>>> There's no requirement around who performs these operations and with
>>>>>>>>> REST, that is delegated to the server.  The optional metadata
>>>>>>>>> location doesn't mean that there isn't a metadata location, just
>>>>>>>>> that it may not be exposed directly in the response.
>>>>>>>>>
>>>>>>>>> Therefore, an implementation where you just store the table
>>>>>>>>> metadata in a Glue Table object, would not be compliant, currently.
>>>>>>>>>
>>>>>>>>> We've periodically discussed removing the storage requirement and
>>>>>>>>> I think there's a path forward to do that and would agree that
>>>>>>>>> standardizing on REST, but I wouldn't say the justification for
>>>>>>>>> making this push is that REST is not compliant so we can just ignore
>>>>>>>>> the table spec requirements.
>>>>>>>>>
>>>>>>>>> There are a few more things to consider, which is that not
>>>>>>>>> everything can use REST currently and making a hard cut away from file
>>>>>>>>> based metadata could bifurcate access to Iceberg data.  There are also
>>>>>>>>> aspects to the spec that reference the metadata paths (like metadata
>>>>>>>>> log, though it's optional), but would likely need to be addressed.
>>>>>>>>>
>>>>>>>>> -Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Feb 29, 2024 at 11:13 AM Jack Ye <yezhao...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> Just want to pull this specific topic out of the materialized
>>>>>>>>>> view discussion thread. I noticed this during the MV discussion, and
>>>>>>>>>> I think it is important to clarify this not just for the MV topic,
>>>>>>>>>> but also for the ongoing discussion to consolidate all the different
>>>>>>>>>> catalogs.
>>>>>>>>>>
>>>>>>>>>> *How the table/view spec defines Iceberg table/view*
>>>>>>>>>>
>>>>>>>>>> If we look into the table/view spec, the optimistic concurrency
>>>>>>>>>> section <https://iceberg.apache.org/spec/#optimistic-concurrency>
>>>>>>>>>> requires the existence of a metadata file, and the atomic swap of the
>>>>>>>>>> metadata file ensures serializable isolation. This implies 2 things:
>>>>>>>>>> 1. there is a metadata file in storage that holds the information
>>>>>>>>>> described in the rest of the spec.
>>>>>>>>>> 2. there is an object in a catalog that holds the pointer to the
>>>>>>>>>> metadata file. What object and what catalog is implementation
>>>>>>>>>> dependent, but these generalized concepts are always intact.
>>>>>>>>>>
>>>>>>>>>> The JSON serialization parts of the spec plus the reader
>>>>>>>>>> requirements also imply that the metadata file is in JSON format.
>>>>>>>>>>
>>>>>>>>>> So when we talk about an Iceberg table/view that is compliant
>>>>>>>>>> with the spec, it is the combination of all these 5 requirements:
>>>>>>>>>> 1. there is an object in the catalog representing this table/view
>>>>>>>>>> 2. there is a pointer to a JSON metadata file in the object
>>>>>>>>>> 3. the JSON metadata file exists in storage and contains the
>>>>>>>>>> table/view metadata content
>>>>>>>>>> 4. the metadata content is compliant with the standard described
>>>>>>>>>> in the spec
>>>>>>>>>> 5. serializable isolation is achieved by atomic swap of the
>>>>>>>>>> object pointer
>>>>>>>>>>
>>>>>>>>>> *How non-REST catalogs are compliant with the table/view spec*
>>>>>>>>>>
>>>>>>>>>> An implementation of the Iceberg table/view is essentially
>>>>>>>>>> specifying:
>>>>>>>>>> 1. what is the exact implementation of the catalog, e.g. JDBC,
>>>>>>>>>> Hive metastore (HMS), Glue, etc.
>>>>>>>>>> 2. what is the object that represents a table, e.g. a row in the
>>>>>>>>>> "iceberg_tables" table in JDBC, a Table object in HMS/Glue, etc.
>>>>>>>>>> 3. how is the JSON metadata file pointer stored, e.g. a column in
>>>>>>>>>> the table's row in JDBC, metadata_location key in the Table's
>>>>>>>>>> parameter map in HMS/Glue, etc.
>>>>>>>>>> 4. how the atomic swap is implemented, e.g. SQL atomic update in
>>>>>>>>>> JDBC, conditional parameter update in HMS, conditional version
>>>>>>>>>> update in Glue, etc.
>>>>>>>>>>
>>>>>>>>>> *How the REST spec is NOT compliant with the table/view spec*
>>>>>>>>>>
>>>>>>>>>> The REST spec technically does not match the following table/view
>>>>>>>>>> spec requirements:
>>>>>>>>>> 2. there is a pointer to a JSON metadata file in the object
>>>>>>>>>> 3. the JSON metadata file exists in storage and contains the
>>>>>>>>>> table/view metadata content
>>>>>>>>>> 5. serializable isolation is achieved by atomic swap of the
>>>>>>>>>> object pointer
>>>>>>>>>>
>>>>>>>>>> The key parts in REST spec that are not compliant are:
>>>>>>>>>> 1. metadata-location field is optional in LoadTableResponse
>>>>>>>>>> <https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L2721-L2728>
>>>>>>>>>> 2. pointer swap is not enforced in the UpdateTable
>>>>>>>>>> <https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L658>
>>>>>>>>>> operation
>>>>>>>>>>
>>>>>>>>>> Therefore, it opens the door for a REST service to be completely
>>>>>>>>>> independent of a JSON metadata file, store the Iceberg table/view
>>>>>>>>>> metadata not as a file, and achieve much better performance
>>>>>>>>>> characteristics than other catalogs. This technically gives a unique
>>>>>>>>>> advantage for REST catalog adopters that is not there for non-REST
>>>>>>>>>> catalogs like HMS and Glue.
>>>>>>>>>>
>>>>>>>>>> *How can we fix this?*
>>>>>>>>>>
>>>>>>>>>> I suggest the following:
>>>>>>>>>>
>>>>>>>>>> Firstly, I think it is good that we try to remove the
>>>>>>>>>> requirements of JSON metadata file pointer and atomic pointer swap.
>>>>>>>>>> We know these requirements have perf limitations based on production
>>>>>>>>>> usage, especially when the metadata file is large. If that is the
>>>>>>>>>> direction, we should make it official by changing the table/view
>>>>>>>>>> spec to say that those requirements are catalog level implementation
>>>>>>>>>> details that are no longer required.
>>>>>>>>>>
>>>>>>>>>> Secondly, once we do that, we should declare REST spec as the
>>>>>>>>>> official catalog spec to interact with Iceberg tables. Otherwise at
>>>>>>>>>> least I will be very tempted to just break the atomic pointer swap
>>>>>>>>>> pattern and store the entire metadata using the Glue Table object to
>>>>>>>>>> achieve much better performance and also Glue native feature
>>>>>>>>>> integrations, and I think other players will be equally motivated to
>>>>>>>>>> do something similar. That will lead to even more chaos in the
>>>>>>>>>> Iceberg catalog space.
>>>>>>>>>>
>>>>>>>>>> With REST spec as the official catalog spec, we can actually
>>>>>>>>>> support non-REST catalogs by using the HTTP execution chain handler.
>>>>>>>>>> Dan has already done a prototype here
>>>>>>>>>> <https://github.com/apache/iceberg/commit/619127ff69f89e43a1edef2ea94c3dd439396a8d#diff-869264a83ba9ca657e7defefaa16ad196b0de9fce6c87f97533db77f29e44762>
>>>>>>>>>> that is based on this discussion
>>>>>>>>>> <https://github.com/apache/iceberg/pull/8091#issuecomment-1647189146>
>>>>>>>>>> in the past about using AWS Lambda as an alternative HTTP client for
>>>>>>>>>> REST catalog. The same approach can be used to talk to
>>>>>>>>>> HMS/Glue/JDBC/... while users will only interact with the
>>>>>>>>>> RESTCatalog as the entry point.
>>>>>>>>>>
>>>>>>>>>> I think this can provide a good path forward overall for the
>>>>>>>>>> catalog consolidation story, interested to know what others think.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jack Ye
>>>>>>>>>>
>>>>>>>>>>
