I'd like to make a slightly different point regarding metadata files. Currently the table spec does require that metadata be stored in a "file"; however, there is no way to discover that file outside of a Catalog. In other words, a client that is operating purely at the file level has no way of determining which metadata file is the "current" one, or whether some metadata file that is visible represents a valid (committed) state of the table.
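To make that concrete, here is a minimal sketch in Python. The file names and the `catalog_pointer` value are made up for illustration, and `highest_version` is a hypothetical helper, not part of any Iceberg library. It only shows that "pick the newest-looking file" is a guess, because a metadata file may be visible in storage without ever having been committed:

```python
import re

# Hypothetical listing of a table's metadata directory (names are made up).
visible_files = [
    "00000-a1.metadata.json",
    "00001-b2.metadata.json",
    "00002-c3.metadata.json",  # visible in storage, but was it ever committed?
]


def highest_version(files):
    """Return the file with the largest numeric prefix. This is a guess only."""
    def version(name):
        match = re.match(r"(\d+)-", name)
        return int(match.group(1)) if match else -1
    return max(files, key=version)


# Assumed value held by some catalog; a file-level client cannot see it.
catalog_pointer = "00001-b2.metadata.json"

guess = highest_version(visible_files)
print("file-level guess:", guess)
print("matches committed state:", guess == catalog_pointer)
```

In this made-up scenario the file-level guess and the catalog's pointer disagree, and nothing in the file listing alone reveals that.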
The REST Catalog API returns the actual table metadata JSON, not its file location. I think / agree it would be worth detaching Iceberg table format concepts from the storage details. Perhaps we could consider the concept of exporting a Catalog to a set of files, which would then specify how the files are laid out (cross-referenced). At runtime the Catalog may choose to export every change, or do it periodically, or on user request, etc.

Cheers,
Dmitri.

On Thu, Feb 29, 2024 at 6:39 PM Jack Ye <yezhao...@gmail.com> wrote:

> > For example, I cannot validate the atomic behaviors Glue claims, but I wouldn't assert that it is non-compliant because of that.
>
> I think these are not comparable claims because the API scope is completely different, but I don't think it's worth arguing in depth. Let's try to see if we can have some consensus.
>
> Based on what you said above, do you agree with the following 3 points?
>
> 1. Today, a table/view in any catalog including a REST spec-compatible catalog is an Iceberg table/view if and only if it points to a JSON metadata file in storage. This concept is a part of the Iceberg table/view spec. There is a debate to be had about whether we want to remove this requirement or not. The argument for it (as Yufei said) is to use other storage for better performance. The argument against it (as Amogh said) is to keep Iceberg open source friendly through the JSON format.
>
> 2. Today, a table/view in any catalog including a REST spec-compatible catalog is an Iceberg table/view if and only if, behind the scenes, it performs the atomic metadata file swap for every commit. This concept is a part of the Iceberg table/view spec. We should consider removing this requirement in the Iceberg table/view spec.
>
> 3. A table/view in an Iceberg REST spec-compatible catalog may or may not be an Iceberg table/view. The REST spec does not enforce this, and this stance will remain true going forward.
> For example, it could use the Iceberg table/view metadata structure but not store the metadata in a JSON file, or not use the metadata file swap commit procedure, or both, and in those cases it is not an Iceberg table/view. More extremely, it might be a totally different kind of table that is only surfaced through the REST models.
>
> -Jack
>
> On Thu, Feb 29, 2024 at 2:13 PM Daniel Weeks <daniel.c.we...@gmail.com> wrote:
>
>> > In that case are tables in a REST-compliant catalog still an Iceberg table? I don't think so, because it is a table that only partially follows the Iceberg table spec.
>>
>> If the catalog is REST compliant and complies with the Iceberg spec, they are still Iceberg tables. I can see there is an argument that if the catalog is REST compliant but does not follow the commit requirements (or aspects of the Iceberg spec), that you cannot call those Iceberg tables. But the assertion that Iceberg tables in a REST catalog are de facto non-compliant is incorrect.
>>
>> > I like the idea about validation for format compliance. But don't think you can technically validate this. You can validate the static table to see if it has all the Iceberg metadata components, but you can not validate the internal behavior of the service during a commit to see if it really atomically swapped a metadata file.
>>
>> Just because you cannot see/validate the implementation doesn't mean that it is non-compliant. For example, I cannot validate the atomic behaviors Glue claims, but I wouldn't assert that it is non-compliant because of that.
>>
>> I do think there is a discussion to be had about if/when we might adjust the storage/swap requirements, but to reinforce Amogh's point, removing those requirements would impact the openness and accessibility of Iceberg, which I feel would hamper adoption.
>>
>> -Dan
>>
>> On Thu, Feb 29, 2024 at 1:53 PM Yufei Gu <flyrain...@gmail.com> wrote:
>>
>>>> We've periodically discussed removing the storage requirement and I think there's a path forward to do that and would agree that standardizing on REST, but I wouldn't say the justification for making this push is that REST is not compliant so we can just ignore the table spec requirements.
>>>>
>>>> There are a few more things to consider, which is that not everything can use REST currently and making a hard cut away from file based metadata could bifurcate access to Iceberg data. There are also aspects to the spec that reference the metadata paths (like metadata log, though it's optional), but would likely need to be addressed.
>>>
>>> This is a bit off-topic. It makes sense to me to remove the storage requirement moving forward. The metadata.json file isn't necessary in the REST catalog. For example, the REST catalog may not have permission to write to the table owner's storage. It can still save it as a file of course, but that doesn't quite make sense. Putting it in a key-value store or an RDBMS could be a better option.
>>>
>>> Given that we are going to remove the storage requirement, should we avoid the file path in the current design for things like the view spec? A solution like table identifier + version uuid may serve the purpose.
>>>
>>> Yufei
>>>
>>> On Thu, Feb 29, 2024 at 1:29 PM Jack Ye <yezhao...@gmail.com> wrote:
>>>
>>>> > There's no exemption that says if you're using REST you don't need to follow the spec. Why do you think that's the case?
>>>>
>>>> In that case are tables in a REST-compliant catalog still an Iceberg table? I don't think so, because it is a table that only partially follows the Iceberg table spec.
>>>>
>>>> I like the idea about validation for format compliance. But don't think you can technically validate this.
>>>> You can validate the static table to see if it has all the Iceberg metadata components, but you can not validate the internal behavior of the service during a commit to see if it really atomically swapped a metadata file.
>>>>
>>>> So I think at minimum we should update the table/view spec to remove the metadata file swap requirement. The Iceberg table/view spec should be a pure format spec that specifies how the file is laid out in storage.
>>>>
>>>> -Jack
>>>>
>>>> On Thu, Feb 29, 2024 at 1:22 PM Amogh Jahagirdar <am...@tabular.io> wrote:
>>>>
>>>>> I want to echo Dan's point that just because there is a separate spec for a REST Catalog does not mean that implementations can deviate from the spec's definition of the commit protocol or metadata layout, and still be considered "spec compliant".
>>>>>
>>>>> > Secondly, once we do that, we should declare REST spec as the official catalog spec to interact with Iceberg tables. Otherwise at least I will be very tempted to just break the atomic pointer swap pattern and store the entire metadata using the Glue Table object to achieve much better performance and also Glue native feature integrations, and I think other players will be equally motivated to do something similar. That will lead to even more chaos in the Iceberg catalog space.
>>>>>
>>>>> On this, a second point I want to make is around the openness of this ecosystem. We all already know that openness (the file formats, the metadata layout, the spec itself) is a fundamental tenet of the project. If we take the provided example of removing the metadata JSON file and moving it to some other storage, I think that goes against this principle since a JSON file is quite open by definition. Going back to the first point, I think a catalog which has such a behavior would *not* be considered spec compliant.
>>>>> Another reason this is important: if we think about what's healthiest for all users of Iceberg, it is to have a healthy list of options for catalog choices. Storing the metadata JSON in non-open ways can make users' lives harder when trying out new catalogs, since the metadata would now be stored in each catalog's own way, and the users will have a harder time accessing their own data.
>>>>>
>>>>> A last point I'd like to make is I think there's a good discussion to be had on how we validate that a REST Catalog implementation is spec compliant. I think that's really beneficial for the ecosystem as a whole. Before that, though, I think we'd first want to conclude on this topic itself.
>>>>>
>>>>> On Thu, Feb 29, 2024 at 12:29 PM Daniel Weeks <daniel.c.we...@gmail.com> wrote:
>>>>>
>>>>>> > REST spec-compliant catalog does not need to follow the Iceberg spec to commit or store metadata
>>>>>>
>>>>>> If the REST implementation doesn't follow the Iceberg spec for commit requirements, it's not compliant with the spec. There's no exemption that says if you're using REST you don't need to follow the spec. Why do you think that's the case?
>>>>>>
>>>>>> I don't believe there's a reason to say that the REST spec needs to enforce the commit requirements either, that's a requirement of the Iceberg spec and still needs to be complied with.
>>>>>>
>>>>>> -Dan
>>>>>>
>>>>>> On Thu, Feb 29, 2024 at 12:19 PM Jack Ye <yezhao...@gmail.com> wrote:
>>>>>>
>>>>>>> > The implementation of the spec can either be compliant or not.
>>>>>>>
>>>>>>> This is exactly the problem we are talking about, right? Just to give an example, we cannot technically say that tables/views in the Tabular catalog are Iceberg tables/views, because a REST spec-compliant catalog does not need to follow the Iceberg spec to commit or store metadata.
>>>>>>> Even if you say it is, there is no way to really prove that, because the REST spec does not enforce it.
>>>>>>>
>>>>>>> JB, what do you mean by participating on the Catalog RFC? Is there already an ongoing RFC?
>>>>>>>
>>>>>>> -Jack
>>>>>>>
>>>>>>> On Thu, Feb 29, 2024 at 12:08 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>>>
>>>>>>>> Hi Dan,
>>>>>>>>
>>>>>>>> I agree with your statement that the REST Spec is not an implementation, but I strongly disagree with your statement "impl of the spec can either be compliant or not".
>>>>>>>>
>>>>>>>> The REST Catalog spec impl should be consistent with the REST Spec. That's why a reference implementation in Iceberg would be a must, with a TCK.
>>>>>>>>
>>>>>>>> The REST Spec should bridge/give access to Table/View metadata. I think it would make sense to have a resource to GET the Table/View metadata, also supporting PUT to update. JSON Schema and eventually JSON RPC could help in some areas here (compliant with OpenAPI).
>>>>>>>>
>>>>>>>> In another thread, I proposed to work on a Catalog RFC, exactly to target this. I think it would make sense to have the REST/Catalog RFC as the main catalog API, so it has to be both consistent (giving access to table/view metadata) and extensible (via OpenAPI Extensions for instance).
>>>>>>>>
>>>>>>>> So, I agree with Jack: the minimum would be to have JSON metadata exposed by the REST Spec.
>>>>>>>>
>>>>>>>> @Jack, short term I'm in favor of your proposal, long term, I propose to participate on the Catalog RFC (REST Spec). WDYT ?
>>>>>>>>
>>>>>>>> Thanks !
>>>>>>>> Regards
>>>>>>>> JB
>>>>>>>>
>>>>>>>> On Thu, Feb 29, 2024 at 20:47, Daniel Weeks <daniel.c.we...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hey Jack,
>>>>>>>>>
>>>>>>>>> I'm not sure I agree with the framing of this argument. The REST Spec defines a protocol, not an implementation.
>>>>>>>>>
>>>>>>>>> The implementation of the spec can either be compliant or not. So a REST Implementation that adheres to all the requirements (atomic location swap, json representation, etc.), would be compliant. There's no requirement around who performs these operations and with REST, that is delegated to the server. The optional metadata location doesn't mean that there isn't a metadata location, just that it may not be exposed directly in the response.
>>>>>>>>>
>>>>>>>>> Therefore, an implementation where you just store the table metadata in a Glue Table object, would not be compliant, currently.
>>>>>>>>>
>>>>>>>>> We've periodically discussed removing the storage requirement and I think there's a path forward to do that and would agree that standardizing on REST, but I wouldn't say the justification for making this push is that REST is not compliant so we can just ignore the table spec requirements.
>>>>>>>>>
>>>>>>>>> There are a few more things to consider, which is that not everything can use REST currently and making a hard cut away from file based metadata could bifurcate access to Iceberg data. There are also aspects to the spec that reference the metadata paths (like metadata log, though it's optional), but would likely need to be addressed.
>>>>>>>>>
>>>>>>>>> -Dan
>>>>>>>>>
>>>>>>>>> On Thu, Feb 29, 2024 at 11:13 AM Jack Ye <yezhao...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> Just want to pull this specific topic out of the materialized view discussion thread. I noticed this during the MV discussion, and I think it is important to clarify this not just for the MV topic, but also for the ongoing discussion to consolidate all the different catalogs.
>>>>>>>>>>
>>>>>>>>>> *How the table/view spec defines Iceberg table/view*
>>>>>>>>>>
>>>>>>>>>> If we look into the table/view spec, the optimistic concurrency section <https://iceberg.apache.org/spec/#optimistic-concurrency> requires the existence of a metadata file, and the atomic swap of the metadata file ensures serializable isolation. This implies 2 things:
>>>>>>>>>> 1. there is a metadata file in storage that holds the information described in the rest of the spec.
>>>>>>>>>> 2. there is an object in a catalog that holds the pointer to the metadata file. What object and what catalog is implementation dependent, but these generalized concepts are always intact.
>>>>>>>>>>
>>>>>>>>>> The JSON serialization parts of the spec plus the reader requirements also imply that the metadata file is in JSON format.
>>>>>>>>>>
>>>>>>>>>> So when we talk about an Iceberg table/view that is compliant with the spec, it is the combination of all these 5 requirements:
>>>>>>>>>> 1. there is an object in the catalog representing this table/view
>>>>>>>>>> 2. there is a pointer to a JSON metadata file in the object
>>>>>>>>>> 3. the JSON metadata file exists in storage and contains the table/view metadata content
>>>>>>>>>> 4. the metadata content is compliant with the standard described in the spec
>>>>>>>>>> 5. serializable isolation is achieved by atomic swap of the object pointer
>>>>>>>>>>
>>>>>>>>>> *How non-REST catalogs are compliant with the table/view spec*
>>>>>>>>>>
>>>>>>>>>> An implementation of the Iceberg table/view is essentially specifying:
>>>>>>>>>> 1. what is the exact implementation of the catalog, e.g. JDBC, Hive metastore (HMS), Glue, etc.
>>>>>>>>>> 2. what is the object that represents a table, e.g. a row in the "iceberg_tables" table in JDBC, a Table object in HMS/Glue, etc.
>>>>>>>>>> 3. how is the JSON metadata file pointer stored, e.g. a column in the table's row in JDBC, metadata_location key in the Table's parameter map in HMS/Glue, etc.
>>>>>>>>>> 4. how the atomic swap is implemented, e.g. SQL atomic update in JDBC, conditional parameter update in HMS, conditional version update in Glue, etc.
>>>>>>>>>>
>>>>>>>>>> *How the REST spec is NOT compliant with the table/view spec*
>>>>>>>>>>
>>>>>>>>>> The REST spec technically does not match the following table/view spec requirements:
>>>>>>>>>> 2. there is a pointer to a JSON metadata file in the object
>>>>>>>>>> 3. the JSON metadata file exists in storage and contains the table/view metadata content
>>>>>>>>>> 5. serializable isolation is achieved by atomic swap of the object pointer
>>>>>>>>>>
>>>>>>>>>> The key parts in REST spec that are not compliant are:
>>>>>>>>>> 1. metadata-location field is optional in LoadTableResponse <https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L2721-L2728>
>>>>>>>>>> 2. pointer swap is not enforced in the UpdateTable <https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L658> operation
>>>>>>>>>>
>>>>>>>>>> Therefore, it opens the door for a REST service to be completely independent of a JSON metadata file, store the Iceberg table/view metadata not as a file, and achieve much better performance characteristics than other catalogs. This technically gives a unique advantage for REST catalog adopters that is not there for non-REST catalogs like HMS and Glue.
>>>>>>>>>>
>>>>>>>>>> *How can we fix this?*
>>>>>>>>>>
>>>>>>>>>> I suggest the following:
>>>>>>>>>>
>>>>>>>>>> Firstly, I think it is good that we try to remove the requirements of JSON metadata file pointer and atomic pointer swap. We know these requirements have perf limitations based on production usage, especially when the metadata file is large. If that is the direction, we should make it official by changing the table/view spec to say that those requirements are catalog level implementation details that are no longer required.
>>>>>>>>>>
>>>>>>>>>> Secondly, once we do that, we should declare REST spec as the official catalog spec to interact with Iceberg tables. Otherwise at least I will be very tempted to just break the atomic pointer swap pattern and store the entire metadata using the Glue Table object to achieve much better performance and also Glue native feature integrations, and I think other players will be equally motivated to do something similar. That will lead to even more chaos in the Iceberg catalog space.
>>>>>>>>>>
>>>>>>>>>> With REST spec as the official catalog spec, we can actually support non-REST catalogs by using the HTTP execution chain handler. Dan has already done a prototype here <https://github.com/apache/iceberg/commit/619127ff69f89e43a1edef2ea94c3dd439396a8d#diff-869264a83ba9ca657e7defefaa16ad196b0de9fce6c87f97533db77f29e44762> that is based on this discussion <https://github.com/apache/iceberg/pull/8091#issuecomment-1647189146> in the past about using AWS Lambda as an alternative HTTP client for REST catalog. The same approach can be used to talk to HMS/Glue/JDBC/... while users will only interact with the RESTCatalog as the entry point.
>>>>>>>>>>
>>>>>>>>>> I think this can provide a good path forward overall for the catalog consolidation story, interested to know what others think.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jack Ye
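
As a footnote to the thread above: the "atomic swap of the object pointer" that the discussion keeps returning to (for example the "SQL atomic update in JDBC" case) is a compare-and-swap on the stored metadata location. Below is a minimal sketch, assuming a simplified two-column table in an in-memory SQLite database. The column layout, the file names, and the `swap_pointer` helper are illustrative assumptions, not the schema or code of any real Iceberg catalog.

```python
import sqlite3

# Simplified, hypothetical stand-in for a catalog's pointer table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE iceberg_tables ("
    "table_name TEXT PRIMARY KEY, metadata_location TEXT)"
)
conn.execute(
    "INSERT INTO iceberg_tables VALUES (?, ?)",
    ("db.t", "s3://bucket/t/metadata/00000-a.metadata.json"),
)
conn.commit()


def swap_pointer(conn, table_name, expected, new):
    """Compare-and-swap: update only if the pointer still equals `expected`."""
    cur = conn.execute(
        "UPDATE iceberg_tables SET metadata_location = ? "
        "WHERE table_name = ? AND metadata_location = ?",
        (new, table_name, expected),
    )
    conn.commit()
    return cur.rowcount == 1  # exactly one row changed means the commit won


base = "s3://bucket/t/metadata/00000-a.metadata.json"

# Two writers start from the same base metadata file.
first = swap_pointer(conn, "db.t", base, "s3://bucket/t/metadata/00001-b.metadata.json")
second = swap_pointer(conn, "db.t", base, "s3://bucket/t/metadata/00001-c.metadata.json")

current = conn.execute(
    "SELECT metadata_location FROM iceberg_tables WHERE table_name = ?", ("db.t",)
).fetchone()[0]
print(first, second, current)
```

In this sketch the second writer's file (`00001-c`) may well exist in storage, but it never becomes the table's committed state; the losing writer would have to re-read the current pointer and retry.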