Aren't we forgetting about position delete files? If the table has position
delete files, then those contain absolute file paths as well.
We cannot add them to the table as-is; they have to be rewritten. This, I
think, is the most painful part of replicating an Iceberg table.
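
To make the rewrite concrete, here is a minimal sketch in Python, not a real
tool. It assumes the delete rows have already been read into (file_path, pos)
tuples, the two required columns of a position delete file. The prefixes and
function names are made up for illustration, and reading and writing the actual
Parquet/Avro/ORC delete files is left out.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical table locations, for illustration only.
OLD_PREFIX = "hdfs://old-cluster/warehouse/db/tbl"
NEW_PREFIX = "hdfs://new-cluster/warehouse/db/tbl"


def rewrite_path(path: str, old_prefix: str, new_prefix: str) -> str:
    """Replace old_prefix with new_prefix, failing loudly on foreign paths."""
    # Normalize to a trailing slash so ".../tbl" does not match ".../tbl2/...".
    old = old_prefix.rstrip("/") + "/"
    new = new_prefix.rstrip("/") + "/"
    if not path.startswith(old):
        raise ValueError(f"path is outside the source table location: {path}")
    return new + path[len(old):]


def rewrite_position_deletes(
    rows: Iterable[Tuple[str, int]], old_prefix: str, new_prefix: str
) -> Iterator[Tuple[str, int]]:
    """Yield (file_path, pos) rows with file_path moved to the new prefix."""
    for file_path, pos in rows:
        # Only the path changes; the row position in the data file stays the same.
        yield rewrite_path(file_path, old_prefix, new_prefix), pos


if __name__ == "__main__":
    sample = [(OLD_PREFIX + "/data/f1.parquet", 3)]
    for row in rewrite_position_deletes(sample, OLD_PREFIX, NEW_PREFIX):
        print(row)
```

The rewritten rows then have to be written out as new delete files, and the
manifests have to reference those new files instead of the copied ones.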
- Wing Yew


On Sat, Dec 2, 2023 at 5:23 PM Fokko Driesprong <fo...@apache.org> wrote:

> Hi Dongjun,
>
> Thanks for reaching out on the mailing list. Another option might be to
> copy the data, and then use a Spark procedure, called add_files
> <https://iceberg.apache.org/docs/latest/spark-procedures/#add_files> to
> add the files to the table. Let me know if this works for you.
>
> Kind regards,
> Fokko
>
> On Sat, Dec 2, 2023 at 2:43 AM Ajantha Bhat <ajanthab...@gmail.com> wrote:
>
>> Hi,
>>
>> You are right. Moving an Iceberg table to a different storage location
>> and expecting it to work there is not currently feasible.
>> The issue lies in the metadata files, which store absolute paths.
>>
>> To address this, we need support for relative paths, but it appears that
>> progress on this front has been slow.
>> You can monitor the status of this feature at
>> https://github.com/apache/iceberg/pull/8260.
>>
>> As a temporary workaround, you can use the CTAS method to create a copy
>> of the table at the desired new path.
>>
>> Thanks,
>> Ajantha
>>
>> On Fri, Dec 1, 2023 at 10:01 PM Dongjun Hwang <enter09...@gmail.com>
>> wrote:
>>
>>> Hello! My name is Dongjun Hwang.
>>>
>>> I recently ran distcp on an Iceberg table in Hadoop.
>>>
>>> Querying the data was not possible afterwards because none of the file
>>> paths in the metadata directory were changed.
>>>
>>> Is there a way to distcp an Iceberg table?
>>>
>>> Thank you!!
>>>
>>
