Thank you Russell and Ryan. 

  I will start working on a new API to support forced table registration in 
the catalog.

Thanks,
Steve Zhang



> On Feb 10, 2025, at 4:29 PM, rdb...@gmail.com wrote:
> 
> Yeah, it sounds like a "register table force" is the right concept here. I 
> think we want to make sure that table updates remain change-based as the best 
> practice in the REST API. But there are some irregular use cases that justify 
> having some mechanism to completely replace the state (like push-based 
> mirroring). I think it makes sense to revisit mirroring and this use case and 
> come up with a path forward.
> 
> On Mon, Feb 10, 2025 at 3:12 PM Russell Spitzer <russell.spit...@gmail.com> 
> wrote:
>> I still would like a "register table force" option
>> 
>> On Mon, Feb 10, 2025 at 5:06 PM Steve Zhang 
>> <hongyue_zh...@apple.com.invalid> wrote:
>>> Thank you Dan for your detailed reply. Based on your explanation, do you 
>>> think it would be worthwhile to support non-linear or complete metadata 
>>> replacements in the REST implementation? I am happy to contribute but might 
>>> need some guidance from the community on the best approach.
>>> 
>>> For additional context, we explored the workaround of dropping the table 
>>> and re-registering it, but we were concerned about reads that land between 
>>> the two steps. There was also an attempt to add a force option to the 
>>> register-table API (https://github.com/apache/iceberg/pull/5327), which 
>>> would allow a metadata swap on an existing table. However, it was 
>>> suggested that using TableOperations.commit(base, new) is the preferred 
>>> way to achieve atomicity.
>>> 
>>> Thanks,
>>> Steve Zhang
>>> 
>>> 
>>> 
>>>> On Feb 10, 2025, at 1:49 PM, Daniel Weeks <dwe...@apache.org> wrote:
>>>> 
>>>> Hey Steve,
>>>> 
>>>> I think the issue here is that you're using the commit api in table 
>>>> operations to perform a non-incremental/linear change to the metadata.  
>>>> The REST implementation is a little more strict in that it builds a set of 
>>>> updates based on the mutations made to the metadata and the commit process 
>>>> applies those changes.  In this scenario, no changes have been made and 
>>>> the call is attempting a complete replacement.
>>>> 
>>>> The other implementations are just blindly swapping the location, so while 
>>>> that operation does achieve the effect you're looking for, it's not the 
>>>> right semantics for the commit.
>>>> 
>>>> You might want to consider using the "register table" operation instead, 
>>>> which takes the table identifier and location to perform this type of swap.
>>>> 
>>>> -Dan
>>>> 
>>>> On Fri, Feb 7, 2025 at 10:17 AM Steve Zhang 
>>>> <hongyue_zh...@apple.com.invalid> wrote:
>>>>> Hey Iceberg Experts:
>>>>> 
>>>>>   I am seeking assistance and insights regarding an issue we’ve 
>>>>> encountered with RESTTableOperations and its inability to support 
>>>>> on-demand table metadata swaps. We are currently adopting the REST-based 
>>>>> catalog from Hive and have noticed a potential gap in the 
>>>>> TableOperations.commit() API. Typically, we use the commit API to revert 
>>>>> a table to a previously known state, as demonstrated below:
>>>>> 
>>>>> String desiredMetadataPath = 
>>>>> "/var/newdb/table/metadata/00003-579b23d1-4ca5-4acf-85ec-081e1699cb83.metadata.json";
>>>>> ops.commit(ops.current(), TableMetadataParser.read(ops.io(), 
>>>>> desiredMetadataPath));
>>>>> 
>>>>>   However, this approach is no longer working with the REST-based 
>>>>> catalog. I suspect that the issue may be related to how the update type 
>>>>> is modeled in RESTTableOperations.  I have shared a unit test that 
>>>>> reproduces the problem on https://github.com/apache/iceberg/issues/12134, 
>>>>> where it works on JDBC and in-memory catalogs, but not with RESTCatalog. 
>>>>> 
>>>>> Best Regards,
>>>>> Steve Zhang
>>>>> 
>>>>> 
>>>>> 
>>> 
