Sorry for the late reply.

> I was wondering if we also want to support the use case of moving tables
in this proposal?

Pucheng, yes, we could use the action to move tables.

Hi Sumedh, here are my answers to your questions:

> Should the copied table be registered in the same catalog as the source
table, or copied into a different catalog for the destination table?

It is fine to register the copied table within the same catalog under a
different table identifier, and with a different table UUID if your tools
depend on that.
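
For illustration, here is a minimal sketch of registering the copied
table's rewritten metadata under a new identifier (the identifier and
metadata location below are placeholders; registerTable is the standard
Catalog API):

    import org.apache.iceberg.catalog.Catalog;
    import org.apache.iceberg.catalog.TableIdentifier;

    // Register the copied table in the same catalog under a new identifier.
    // The table UUID comes from the (rewritten) metadata file itself.
    static void registerCopy(Catalog catalog) {
      TableIdentifier copyId = TableIdentifier.of("db", "tbl_copy");
      String metadataLocation = "s3://dst-bucket/db/tbl_copy/metadata/v3.metadata.json";
      catalog.registerTable(copyId, metadataLocation);
    }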

> Are we shooting for perfect query reproducibility for time travel queries
across the source and copy table? I.e., will the snapshot chain on the source
table be maintained on the copied table?

The action will support it. That said, copying from a point in the middle of
the snapshot history is also acceptable, since users often don't care about
older table history. Ultimately, users need to make that decision themselves.

> Is this a one-time copy action, or is it something we can run on a
schedule, i.e., as new data is written to the source table, will incremental
deltas (appends, updates, deletes) be copied?

It will support incremental copy, so you don't have to copy the whole table
every time; given the large data volumes involved, a full copy each time
isn't practical.
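
As a rough illustration of how the action might be invoked (a hypothetical
sketch based on the CopyTable PR,
https://github.com/apache/iceberg/pull/10024; the method names below are
illustrative, not the final interface):

    import org.apache.iceberg.spark.actions.SparkActions;

    // Hypothetical sketch only -- see PR #10024 for the interface under review.
    // spark (SparkSession), sourceTable (Table), paths, and the version file
    // name are all placeholders.
    SparkActions.get(spark)
        .copyTable(sourceTable)                 // hypothetical entry point
        .rewriteLocationPrefix("s3://src-bucket/db/tbl", "s3://dst-bucket/db/tbl")
        .lastCopiedVersion("v5.metadata.json")  // incremental: skip already-copied history
        .execute();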

These answers are also covered in the goal section of this design doc:
https://docs.google.com/document/d/15oPj7ylgWQG8bhk_5aTjzHl7mlc-9f4OAH-oEpKavSc/edit#heading=h.97m5uqimprde

> Has the community considered an approach where the scheme and cluster are
minted by the catalog, to be used in the respective FileIO implementation
for the blob stores? For example, if we had a bucket foo on us-east, and
bucket bar on us-west, the catalog running on us-east would mint s3://foo,
and the catalog running on us-west would mint s3://bar, and the S3FileIO
would join that with the rest of the relative path to the object. This would
allow us to capture the absolute path relative to s3://<bucket-name> in the
Iceberg metadata.

This is similar to S3 Access Points,
https://aws.amazon.com/s3/features/access-points/. You can use them as an
alternative if all of your table storage locations are in S3.
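
For reference, here is roughly how that maps onto S3FileIO's
bucket-to-access-point configuration (property names as I recall them from
the Iceberg AWS integration; the ARNs below are placeholders):

    import java.util.HashMap;
    import java.util.Map;

    // Catalog properties mapping each bucket to a region-local access point,
    // so the same s3://foo/... paths in metadata resolve to nearby storage.
    Map<String, String> props = new HashMap<>();
    props.put("io-impl", "org.apache.iceberg.aws.s3.S3FileIO");
    props.put("s3.use-arn-region-enabled", "true");
    props.put("s3.access-points.foo", "arn:aws:s3:us-east-1:111122223333:accesspoint/foo-ap");
    props.put("s3.access-points.bar", "arn:aws:s3:us-west-2:111122223333:accesspoint/bar-ap");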


Yufei


On Fri, Jul 12, 2024 at 10:09 AM Sumedh Sakdeo <ssak...@linkedin.com.invalid>
wrote:

> This is a useful addition. I believe it is important to list the
> requirements for such an action in greater detail, especially what is in
> scope and what is not. Some open questions that could be added to the
> requirements / non-requirements section are:
>
>    1. Should the copied table be registered in the same catalog as the
>    source table, or copied into a different catalog for the destination
>    table?
>       1. This has implications for the table identifier, and for how the
>       metadata is copied.
>    2. Are we shooting for perfect query reproducibility for time travel
>    queries across the source and copy table? I.e., will the snapshot chain on
>    the source table be maintained on the copied table?
>       1. The spec talks about rebuilding metadata, but it would be clearer
>       if it said whether the entire snapshot chain is maintained, or whether
>       metadata is rebuilt in a way that only the data in the current snapshot
>       matches between source and destination.
>    3. Is this a one-time copy action, or is it something we can run on
>    a schedule, i.e., as new data is written to the source table, will
>    incremental deltas (appends, updates, deletes) be copied?
>       1. The latter has implications to consider, as various maintenance
>       jobs running on the source and destination can diverge the snapshot
>       chains.
>
>
> At LinkedIn, we ran into the absolute vs. relative path issue when
> designing snapshot replication for Iceberg tables. The way we approached it
> is to store the absolute path of the file in the metadata, without the
> scheme and cluster. We use HadoopFileIO, and the scheme and cluster are
> derived from the Hadoop conf. For example, if the file path is
> hdfs://<cluster>/data/openhouse/db/tb_uuid, what is stored in the Iceberg
> metadata is /data/openhouse/db/tb_uuid, and hdfs://<cluster> comes from the
> Hadoop conf.
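>
> A minimal sketch of that resolution with Hadoop's API (illustrative only;
> it assumes fs.defaultFS carries the scheme and cluster):
>
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.Path;
>
>     // Stored in the Iceberg metadata: an absolute path without scheme/authority.
>     String stored = "/data/openhouse/db/tb_uuid";
>     // hdfs://<cluster> is supplied by the environment's Hadoop conf.
>     Configuration conf = new Configuration();
>     String defaultFs = conf.get("fs.defaultFS"); // e.g. hdfs://<cluster>
>     Path qualified = new Path(defaultFs + stored);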
>
> Has the community considered an approach where the scheme and cluster are
> minted by the catalog, to be used in the respective FileIO implementation
> for the blob stores? For example, if we had a bucket foo on us-east, and
> bucket bar on us-west, the catalog running on us-east would mint s3://foo,
> and the catalog running on us-west would mint s3://bar, and the S3FileIO
> would join that with the rest of the relative path to the object. This would
> allow us to capture the absolute path relative to s3://<bucket-name> in the
> Iceberg metadata.
>
> Thanks,
> -sumedh
>
> From: Pucheng Yang <pucheng.yo...@gmail.com>
> Date: Thursday, July 11, 2024 at 8:15 AM
> To: dev@iceberg.apache.org <dev@iceberg.apache.org>
> Subject: Re: Spark: Copy Table Action
>
> Hi Yufei, I was wondering if we also want to support the use case of
> moving tables in this proposal? For example, users might have various
> reasons to change the table location; however, there is no good way to move
> the original data files to the new location unless we rewrite the data
> files, which seems like a misuse of that functionality.
>
> On Wed, Jul 10, 2024 at 9:37 AM Ajantha Bhat <ajanthab...@gmail.com>
> wrote:
>
>> For RemoveExpiredFiles, I'm admittedly a bit skeptical that it's required,
>>> since orphan file removal should be able to clean up the files in the
>>> copied table. Are we able to elaborate why there's a concern with removing
>>> snapshots on the copied table and subsequently relying on orphan file
>>> removal on the copied table to remove the actual files? Is it around
>>> listing?
>>
>>
>> I have the same concern as Amogh. I already mentioned the same thing in
>> the PR yesterday
>> <https://github.com/apache/iceberg/pull/10643#discussion_r1669739401>.
>> I suggested renaming it to *RemoveTableCopyOrphanFiles*. Thinking more
>> on this today, I think we should atomically (implicitly) handle cleaning up
>> orphan files as part of the copy table action instead of a separate action.
>>
>> Also, very happy to see the progress on this one. This will help users
>> move data from one location to another seamlessly.
>>
>> - Ajantha
>>
>>
>> On Wed, Jul 10, 2024 at 7:35 AM Amogh Jahagirdar <2am...@gmail.com>
>> wrote:
>>
>>> Thanks Yufei!
>>>
>>> +1 on having a copy table action, I think that's pretty valuable. I have
>>> some ideas on interfaces based on previous work I've done for
>>> region/multi-cloud replication of Iceberg tables. The absolute vs. relative
>>> path discussion is interesting; I have some questions on how relative
>>> pathing would look, but I'll wait for Anurag's input.
>>>
>>> On CheckSnapshotIntegrity, I think I'd probably advocate for having a
>>> more general "Repair Metadata" procedure. Currently, it looks like
>>> CheckSnapshotIntegrity just tells a user what files are missing in its
>>> output. I think we could go a step further and attempt to handle cases
>>> where a manifest entry refers to a file which no longer exists. We could
>>> attempt a recovery of that file if the fileIO implementation supports that
>>> via some sort of a SupportsRecovery mixin. There's also another corruption
>>> case where duplicate file entries end up in manifests; we could define an
>>> approach to reconcile those and write out new manifests.
>>> There have actually been two attempts at this, one from Szehon quite a
>>> while back https://github.com/apache/iceberg/pull/2608 and another more
>>> recently from Matt https://github.com/apache/iceberg/pull/10445 .
>>> Perhaps we could review both of these and figure out a path forward for
>>> this?
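>>>
>>> As a sketch, the recovery mixin could look something like this (purely
>>> illustrative; SupportsRecovery is not an existing Iceberg interface):
>>>
>>>     // Hypothetical mixin that a FileIO implementation could provide to
>>>     // restore deleted files, e.g. from a versioned S3 bucket.
>>>     public interface SupportsRecovery {
>>>       // Attempt to restore the file at the given location;
>>>       // return true if the file was recovered.
>>>       boolean recoverFile(String location);
>>>     }
>>>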
>>> For just verifying the integrity of the copied table, we could have a dry
>>> run option for the repair metadata operation, which would output any
>>> missing files or manifests with duplicates without performing any
>>> recovery/fixing up.
>>>
>>> For RemoveExpiredFiles, I'm admittedly a bit skeptical that it's required,
>>> since orphan file removal should be able to clean up the files in the
>>> copied table. Are we able to elaborate why there's a concern with removing
>>> snapshots on the copied table and subsequently relying on orphan file
>>> removal on the copied table to remove the actual files? Is it around
>>> listing?
>>>
>>> Overall this is great to see.
>>>
>>> Thanks,
>>> Amogh Jahagirdar
>>>
>>>
>>>
>>>
>>> On Tue, Jul 9, 2024 at 10:59 AM Anurag Mantripragada
>>> <amantriprag...@apple.com.invalid> wrote:
>>>
>>>> Agreed with Peter. I will bring the relative path changes up in the next
>>>> community sync. I will help drive this.
>>>>
>>>>
>>>> ~ Anurag Mantripragada
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Jul 8, 2024, at 10:50 PM, Péter Váry <peter.vary.apa...@gmail.com>
>>>> wrote:
>>>>
>>>> I think in most cases the copy table action doesn't require a query
>>>> engine to read and generate the new metadata files. This means that it
>>>> would be nice to provide a pure Java implementation in core, which could
>>>> be extended/reused by different engines, like Spark, to execute it in a
>>>> distributed manner when distributed execution is needed.
>>>>
>>>> About the copy vs. relative path debate:
>>>> - I have seen the relative path requirement come up multiple times in
>>>> the past. It seems like a feature requested by multiple users, so I think
>>>> it would be best to discuss it in a different thread. The Copy Table
>>>> Action might be used to move absolute path tables to relative path tables
>>>> when migration is needed.
>>>>
>>>> On Mon, Jul 8, 2024, 21:52 Anurag Mantripragada
>>>> <amantriprag...@apple.com.invalid> wrote:
>>>>
>>>>> Hi Yufei,
>>>>>
>>>>> Thanks for the proposal. While the actions are great, they still need
>>>>> to do a lot of work, which could be reduced if we had the relative path
>>>>> changes. I still support adding these actions, as moving data was out of
>>>>> scope for the relative path design, and we can use these actions as helpers
>>>>> once the spec change is done.
>>>>>
>>>>> Anurag Mantripragada
>>>>>
>>>>> On Jul 8, 2024, at 10:55 AM, Pucheng Yang <pucheng.yo...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Thanks for picking this up, I think this is a very valuable addition.
>>>>>
>>>>> On Mon, Jul 8, 2024 at 10:48 AM Yufei Gu <flyrain...@gmail.com> wrote:
>>>>>
>>>>>> Hi folks,
>>>>>>
>>>>>> I'd like to share a recent progress of adding actions to copy tables
>>>>>> across different places.
>>>>>>
>>>>>> There is a constant need to copy tables across different places for
>>>>>> purposes such as disaster recovery and testing. Due to the absolute file
>>>>>> paths in Iceberg metadata, it doesn't work automatically. There are three
>>>>>> generic solutions:
>>>>>> 1. Rebuild the metadata: This is a proven approach widely used across
>>>>>> various companies.
>>>>>> 2. S3 access point: Effective when both the source and target
>>>>>> locations are in S3, but not applicable to other storage systems.
>>>>>> 3. Relative path: It requires changes to the table specification.
>>>>>>
>>>>>> We focus on the first approach in this thread. While the code was
>>>>>> shared 2 years ago here
>>>>>> <https://github.com/apache/iceberg/pull/4705>, it was never
>>>>>> merged. We picked it up recently. Here are the active PRs related to this
>>>>>> action. Would really appreciate any feedback and review:
>>>>>>
>>>>>>    - PR to add CopyTable action:
>>>>>>    https://github.com/apache/iceberg/pull/10024
>>>>>>    - PR to add CheckSnapshotIntegrity action:
>>>>>>    https://github.com/apache/iceberg/pull/10642
>>>>>>    - PR to add RemoveExpiredFiles action:
>>>>>>    https://github.com/apache/iceberg/pull/10643
>>>>>>
>>>>>> Here is a Google doc with more details to clarify the goals and
>>>>>> approach:
>>>>>> https://docs.google.com/document/d/15oPj7ylgWQG8bhk_5aTjzHl7mlc-9f4OAH-oEpKavSc/edit?usp=sharing
>>>>>>
>>>>>> Yufei
>>>>>>
>>>>>
>>>>>
>>>>
