Yeah, it sounds like a "register table force" is the right concept here. I
think we want to make sure that table updates remain change-based as
the best practice in the REST API. But there are some irregular use cases
that justify having some mechanism to completely replace the state (like
push-based mirroring). I think it makes sense to revisit mirroring and this
use case and come up with a path forward.
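
To make the distinction concrete, here is a minimal toy sketch of the two
commit styles discussed in this thread. Everything in it (ToyMetadata,
swapCommit, changeBasedCommit, the file names) is hypothetical and is not an
Iceberg API. In particular, rejecting a commit that carries no recorded
changes is an assumption made to show the gap, not a description of what the
REST implementation actually does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: these classes are hypothetical and are NOT Iceberg APIs.
public class CommitSemanticsSketch {

    /** Toy metadata: a file location plus the changes recorded since its base. */
    static final class ToyMetadata {
        final String location;
        final List<String> changes;

        private ToyMetadata(String location, List<String> changes) {
            this.location = location;
            this.changes = Collections.unmodifiableList(new ArrayList<>(changes));
        }

        /** Metadata loaded straight from a file carries no recorded changes. */
        static ToyMetadata readFromFile(String location) {
            return new ToyMetadata(location, new ArrayList<>());
        }

        /** Metadata derived from a base records the change that produced it. */
        ToyMetadata withChange(String newLocation, String change) {
            List<String> next = new ArrayList<>(changes);
            next.add(change);
            return new ToyMetadata(newLocation, next);
        }
    }

    /** Pointer-swap style: the catalog simply points at the new metadata file. */
    static String swapCommit(ToyMetadata base, ToyMetadata updated) {
        return updated.location;
    }

    /**
     * Change-based style: only recorded changes are sent. Rejecting an empty
     * change list is an assumption of this toy, chosen to make the gap visible.
     */
    static String changeBasedCommit(ToyMetadata base, ToyMetadata updated) {
        if (updated.changes.isEmpty()) {
            throw new IllegalStateException(
                "no recorded changes; a full replacement cannot be expressed");
        }
        return updated.location;
    }

    public static void main(String[] args) {
        ToyMetadata current = ToyMetadata.readFromFile("00004.metadata.json");
        ToyMetadata rollbackTarget = ToyMetadata.readFromFile("00003.metadata.json");
        ToyMetadata incremental = current.withChange("00005.metadata.json", "set-property");

        System.out.println(swapCommit(current, rollbackTarget));
        System.out.println(changeBasedCommit(current, incremental));
        try {
            changeBasedCommit(current, rollbackTarget);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this toy, metadata read directly from a file has an empty change list, so
a change-based commit has nothing to send, while a pointer swap works because
it never looks at the changes at all.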

On Mon, Feb 10, 2025 at 3:12 PM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> I still would like a "register table force" option
>
> On Mon, Feb 10, 2025 at 5:06 PM Steve Zhang
> <hongyue_zh...@apple.com.invalid> wrote:
>
>> Thank you Dan for your detailed reply. Based on your explanation, do you
>> think it would be worthwhile to support non-linear or complete metadata
>> replacements in the REST implementation? I am happy to contribute but might
>> need some guidance from the community on the best approach.
>>
>> For additional context, we explored the workaround of dropping the
>> table and re-registering it, but were concerned about reads happening
>> between the two steps. There’s also an attempt to add a force option to the
>> register-table API (https://github.com/apache/iceberg/pull/5327), which
>> would allow a metadata swap on an existing table. However, it was
>> suggested that using TableOperations.commit(base, new) is preferred to
>> achieve atomicity.
>>
>> Thanks,
>> Steve Zhang
>>
>>
>>
>> On Feb 10, 2025, at 1:49 PM, Daniel Weeks <dwe...@apache.org> wrote:
>>
>> Hey Steve,
>>
>> I think the issue here is that you're using the commit API in table
>> operations to perform a non-incremental, non-linear change to the metadata.  The
>> REST implementation is a little more strict in that it builds a set of
>> updates based on the mutations made to the metadata and the commit process
>> applies those changes.  In this scenario, no changes have been made and the
>> call is attempting a complete replacement.
>>
>> The other implementations are just blindly swapping the location, so
>> while that operation does achieve the effect you're looking for, it's not
>> the right semantics for the commit.
>>
>> You might want to consider using the "register table" operation instead,
>> which takes the table identifier and location to perform this type of swap.
>>
>> -Dan
>>
>> On Fri, Feb 7, 2025 at 10:17 AM Steve Zhang
>> <hongyue_zh...@apple.com.invalid> wrote:
>>
>>> Hey Iceberg Experts:
>>>
>>>   I am seeking assistance and insights regarding an issue we’ve
>>> encountered with RESTTableOperations and its inability to support on-demand
>>> table metadata swaps. We are currently migrating from Hive to the
>>> REST-based catalog and have noticed a potential gap in the TableOperations.commit()
>>> API. Typically, we use the commit API to revert a table to a previously
>>> known state, as demonstrated below:
>>>
>>> String desiredMetadataPath =
>>> "/var/newdb/table/metadata/00003-579b23d1-4ca5-4acf-85ec-081e1699cb83.metadata.json";
>>> ops.commit(ops.current(), TableMetadataParser.read(ops.io(),
>>> desiredMetadataPath));
>>>
>>>   However, this approach is no longer working with the REST-based
>>> catalog. I suspect that the issue may be related to how the update type is
>>> modeled in RESTTableOperations.  I have shared a unit test that reproduces
>>> the problem on https://github.com/apache/iceberg/issues/12134, where it
>>> works on JDBC and in-memory catalogs, but not with RESTCatalog.
>>>
>>> Best Regards,
>>> Steve Zhang
>>>
>>>
>>>
>>>
>>
