Thank you, Dan, for your detailed reply. Based on your explanation, do you think 
it would be worthwhile to support non-linear or complete metadata replacements 
in the REST implementation? I am happy to contribute but might need some 
guidance from the community on the best approach.

For additional context, we explored the workaround of dropping the table and 
re-registering it, but we are concerned about reads that land between the two 
steps. There was also an attempt to add a force option to the register-table 
API (https://github.com/apache/iceberg/pull/5327), which would allow a 
metadata swap on an existing table. However, the feedback there was that using 
TableOperations.commit(base, new) is the preferred way to achieve atomicity.
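
To make the read-in-between concern concrete, here is a minimal Java sketch. 
It assumes a hypothetical in-memory SwapSketch class (not an Iceberg class) 
whose dropTable and registerTable only mirror the shape of 
Catalog.dropTable(identifier, purge) and 
Catalog.registerTable(identifier, metadataFileLocation). The table name and 
metadata file names are placeholders.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a catalog; not part of Iceberg. */
class SwapSketch {
    // Table name -> location of the current metadata file.
    private final Map<String, String> tables = new ConcurrentHashMap<>();

    // Stand-in for dropTable with purge=false: only the catalog entry is removed.
    boolean dropTable(String name) {
        return tables.remove(name) != null;
    }

    // Stand-in for registerTable: fails if the name is already taken.
    void registerTable(String name, String metadataLocation) {
        if (tables.putIfAbsent(name, metadataLocation) != null) {
            throw new IllegalStateException("Table already exists: " + name);
        }
    }

    // What a reader sees when it looks up the table right now.
    Optional<String> load(String name) {
        return Optional.ofNullable(tables.get(name));
    }

    // Two-step swap; returns what a reader would observe between the steps.
    Optional<String> swap(String name, String newLocation) {
        dropTable(name);
        Optional<String> seenBetween = load(name); // the table is missing here
        registerTable(name, newLocation);
        return seenBetween;
    }

    public static void main(String[] args) {
        SwapSketch catalog = new SwapSketch();
        catalog.registerTable("newdb.table", "metadata/v4.metadata.json");
        Optional<String> gap = catalog.swap("newdb.table", "metadata/v3.metadata.json");
        System.out.println("visible during swap: " + gap.isPresent());
        System.out.println("after swap: " + catalog.load("newdb.table").orElse("<missing>"));
    }
}
```

In this sketch the lookup between the two steps finds no table, which is the 
window a single atomic commit would avoid.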

Thanks,
Steve Zhang



> On Feb 10, 2025, at 1:49 PM, Daniel Weeks <dwe...@apache.org> wrote:
> 
> Hey Steve,
> 
> I think the issue here is that you're using the commit api in table 
> operations to perform a non-incremental/linear change to the metadata.  The 
> REST implementation is a little more strict in that it builds a set of 
> updates based on the mutations made to the metadata and the commit process 
> applies those changes.  In this scenario, no changes have been made and the 
> call is attempting a complete replacement.
> 
> The other implementations are just blindly swapping the location, so while 
> that operation does achieve the effect you're looking for, it's not the right 
> semantics for the commit.
> 
> You might want to consider using the "register table" operation instead, 
> which takes the table identifier and location to perform this type of swap.
> 
> -Dan
> 
> On Fri, Feb 7, 2025 at 10:17 AM Steve Zhang <hongyue_zh...@apple.com.invalid> 
> wrote:
>> Hey Iceberg Experts:
>> 
>>   I am seeking assistance and insights regarding an issue we’ve encountered 
>> with RESTTableOperations and its inability to support on-demand table 
>> metadata swaps. We are currently moving from Hive to the REST-based catalog 
>> and have noticed a potential gap in the TableOperations.commit() API. 
>> Typically, we use the commit API to revert a table to a previously known 
>> state, as demonstrated below:
>> 
>> // Read the desired metadata file and commit it as the table's new state.
>> String desiredMetadataPath = 
>> "/var/newdb/table/metadata/00003-579b23d1-4ca5-4acf-85ec-081e1699cb83.metadata.json";
>> ops.commit(ops.current(), TableMetadataParser.read(ops.io(), desiredMetadataPath));
>> 
>>   However, this approach is no longer working with the REST-based catalog. I 
>> suspect that the issue may be related to how the update type is modeled in 
>> RESTTableOperations.  I have shared a unit test that reproduces the problem 
>> on https://github.com/apache/iceberg/issues/12134, where it works on JDBC 
>> and in-memory catalogs, but not with RESTCatalog. 
>> 
>> Best Regards,
>> Steve Zhang
>> 
>> 
>> 