Hi all, thank you very much for the progress so far. I believe this is very helpful for DR and other use cases that require copying a table from place to place. I can see we now support rewriting table paths in the metadata files; do you have a plan for the next step toward a fuller integration (such as copying a table end to end)? Thanks
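[Editorial note: to make the metadata path rewriting mentioned above concrete, here is a minimal sketch of the prefix replacement at the heart of the "rebuild the metadata" approach discussed in this thread. The names here (`PathRewriter`, `rewrite`) are hypothetical, not the API of the linked PRs, and a real implementation must apply this to every path in metadata.json, manifest lists, and manifests, and rewrite those files themselves.]

```java
// Sketch of the core path-rewriting step behind a copy-table action:
// every file path under the source table location gets its prefix
// swapped for the target location. Illustrative names only.
public class PathRewriter {

    /**
     * Replace the source location prefix with the target prefix.
     * Paths outside the source location are returned unchanged.
     */
    public static String rewrite(String path, String sourcePrefix, String targetPrefix) {
        if (path.startsWith(sourcePrefix)) {
            return targetPrefix + path.substring(sourcePrefix.length());
        }
        return path;
    }
}
```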
On Thu, Aug 15, 2024 at 2:13 PM Yufei Gu <flyrain...@gmail.com> wrote:

> Sorry for the late reply.
>
> > I was wondering if we also want to support the use case of moving tables in this proposal?
>
> Pucheng, yes, we could use the action to move tables.
>
> Hi Sumedh, here are my answers to your questions:
>
> > Should the copied table be registered in the same catalog as the source table, or copied into a different catalog for the destination table?
>
> It is fine to register the table within the same catalog with different table identifiers, as well as different table UUIDs, if your tools count on it.
>
> > Are we shooting for perfect query reproducibility for time-travel queries across the source and copied table? I.e., is the snapshot chain on the source table maintained on the copied table?
>
> The action will support this. That said, it is also acceptable to copy from the middle of the snapshot history, as it is common that users don't care about certain table history. Overall, users need to make that decision themselves.
>
> > Is this a one-time copy action, or is this something we can run on a schedule, i.e., as new data is written to the source table, incremental deltas (appends, updates, deletes) will be copied?
>
> It will support incremental copy so that you don't have to copy the whole table every time; copying the whole table every time isn't practical due to the large volume.
>
> These answers are also covered in the goals section of this design doc:
> https://docs.google.com/document/d/15oPj7ylgWQG8bhk_5aTjzHl7mlc-9f4OAH-oEpKavSc/edit#heading=h.97m5uqimprde
>
> > Has the community considered an approach where the scheme and cluster are minted by the catalog, to be used in the respective FileIO implementation for the blob stores.
> > For example, if we had a bucket foo in us-east and a bucket bar in us-west, the catalog running in us-east would mint s3://foo, the catalog running in us-west would mint s3://bar, and the S3FileIO would join that with the rest of the relative path to the object. This would allow us to capture the path relative to s3://<bucket-name> in the Iceberg metadata?
>
> This is similar to an S3 access point, https://aws.amazon.com/s3/features/access-points/. You can use it as an alternative if all your table storage locations are in S3.
>
> Yufei
>
> On Fri, Jul 12, 2024 at 10:09 AM Sumedh Sakdeo <ssak...@linkedin.com.invalid> wrote:
>
>> This is a useful addition. I believe it is important to list the requirements for such an action in greater detail, especially what is in scope and what is not. Some open questions that could be added to the requirements / non-requirements section are:
>>
>> 1. Should the copied table be registered in the same catalog as the source table, or copied into a different catalog for the destination table?
>>    1. This has implications on the table identifier and how the metadata is copied.
>> 2. Are we shooting for perfect query reproducibility for time-travel queries across the source and copied table? I.e., is the snapshot chain on the source table maintained on the copied table?
>>    1. The spec talks about rebuilding metadata, but it would be clearer if it stated whether the entire snapshot chain is maintained, or whether we are rebuilding metadata such that only the data in the current snapshot matches between source and destination.
>> 3. Is this a one-time copy action, or is this something we can run on a schedule, i.e., as new data is written to the source table, incremental deltas (appends, updates, deletes) will be copied?
>>    1. The latter has implications to consider, as various maintenance jobs running on the source and destination can cause the snapshot chains to diverge.
>> At LinkedIn, we ran into the absolute vs. relative path issue when designing snapshot replication for Iceberg tables. The way we approached it is to use the absolute path of the file in the metadata, without the scheme and cluster. We use HadoopFileIO, and the scheme and cluster are derived from the Hadoop conf. For example, if the file path is hdfs://<cluster>/data/openhouse/db/tb_uuid, what is stored in the Iceberg metadata is /data/openhouse/db/tb_uuid, and hdfs://<cluster> comes from the Hadoop conf.
>>
>> Has the community considered an approach where the scheme and cluster are minted by the catalog, to be used in the respective FileIO implementation for the blob stores? For example, if we had a bucket foo in us-east and a bucket bar in us-west, the catalog running in us-east would mint s3://foo, the catalog running in us-west would mint s3://bar, and the S3FileIO would join that with the rest of the relative path to the object. This would allow us to capture the path relative to s3://<bucket-name> in the Iceberg metadata?
>>
>> Thanks,
>> -sumedh
>>
>> From: Pucheng Yang <pucheng.yo...@gmail.com>
>> Date: Thursday, July 11, 2024 at 8:15 AM
>> To: dev@iceberg.apache.org <dev@iceberg.apache.org>
>> Subject: Re: Spark: Copy Table Action
>>
>> Hi Yufei, I was wondering if we also want to support the use case of moving tables in this proposal? For example, users might have various reasons to change a table's location, but there is no good way to move the original data files to a new location short of rewriting the data files, and that seems like a misuse of that functionality.
>>
>> On Wed, Jul 10, 2024 at 9:37 AM Ajantha Bhat <ajanthab...@gmail.com> wrote:
>>
>>>> For RemoveExpiredFiles, I'm admittedly a bit skeptical if it's required, since orphan file removal should be able to clean up the files in the copied table.
>>>> Are we able to elaborate on why there's a concern with removing snapshots on the copied table and subsequently relying on orphan file removal on the copied table to remove the actual files? Is it around listing?
>>>
>>> I have the same concern as Amogh. I already mentioned the same thing in the PR yesterday
>>> <https://github.com/apache/iceberg/pull/10643#discussion_r1669739401>.
>>> I suggested renaming it to *RemoveTableCopyOrphanFiles*. Thinking more on this today, I think we should atomically (implicitly) handle cleaning up of orphan files as part of the copy table action instead of in a separate action.
>>>
>>> Also, very happy to see the progress on this one. This will help users move data from one location to another seamlessly.
>>>
>>> - Ajantha
>>>
>>> On Wed, Jul 10, 2024 at 7:35 AM Amogh Jahagirdar <2am...@gmail.com> wrote:
>>>
>>>> Thanks Yufei!
>>>>
>>>> +1 on having a copy table action; I think that's pretty valuable. I have some ideas on interfaces based on previous work I've done for region/multi-cloud replication of Iceberg tables. The absolute vs. relative path discussion is interesting; I have some questions on what relative pathing would look like, but I'll wait for Anurag's input.
>>>>
>>>> On CheckSnapshotIntegrity, I'd probably advocate for a more general "Repair Metadata" procedure. Currently, CheckSnapshotIntegrity just tells a user what files are missing in its output. I think we could go a step further and attempt to handle cases where a manifest entry refers to a file which no longer exists. We could attempt a recovery of that file if the FileIO implementation supports it, via some sort of SupportsRecovery mixin. There's also another corruption case where duplicate file entries end up in manifests; we can define an approach for reconciling that and write out new manifests.
>>>> There have actually been two attempts at this: one from Szehon quite a while back, https://github.com/apache/iceberg/pull/2608, and another more recent one from Matt, https://github.com/apache/iceberg/pull/10445. Perhaps we could review both of these and figure out a path forward?
>>>>
>>>> For just verifying the integrity of the copied table, we could have a dry-run option for the repair metadata operation, which would output any missing files or manifests with duplicates without performing any recovery/fixing up.
>>>>
>>>> For RemoveExpiredFiles, I'm admittedly a bit skeptical that it's required, since orphan file removal should be able to clean up the files in the copied table. Are we able to elaborate on why there's a concern with removing snapshots on the copied table and subsequently relying on orphan file removal on the copied table to remove the actual files? Is it around listing?
>>>>
>>>> Overall this is great to see.
>>>>
>>>> Thanks,
>>>> Amogh Jahagirdar
>>>>
>>>> On Tue, Jul 9, 2024 at 10:59 AM Anurag Mantripragada <amantriprag...@apple.com.invalid> wrote:
>>>>
>>>>> Agreed with Peter. I will bring the relative paths changes up in the next community sync. I will help drive this.
>>>>>
>>>>> ~ Anurag Mantripragada
>>>>>
>>>>> On Jul 8, 2024, at 10:50 PM, Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>
>>>>> I think in most cases the copy table action doesn't require a query engine to read and generate the new metadata files. This means it would be nice to provide a pure Java implementation in core, which could be extended/reused by different engines, like Spark, to execute in a distributed manner when distributed execution is needed.
>>>>>
>>>>> About the copy vs. relative path debate:
>>>>> - I have seen the relative path requirement coming up multiple times in the past.
>>>>> It seems like a feature requested by multiple users, so I think it would be best to discuss it in a different thread. The Copy Table Action might be used to move absolute-path tables to relative-path tables when migration is needed.
>>>>>
>>>>> On Mon, Jul 8, 2024, 21:52 Anurag Mantripragada <amantriprag...@apple.com.invalid> wrote:
>>>>>
>>>>>> Hi Yufei,
>>>>>>
>>>>>> Thanks for the proposal. While the actions are great, they still need to do a lot of work that could be reduced if we had the relative path changes. I still support adding these actions, as moving data was out of scope for the relative path design, and we can use these actions as helpers once the spec change is done.
>>>>>>
>>>>>> Anurag Mantripragada
>>>>>>
>>>>>> On Jul 8, 2024, at 10:55 AM, Pucheng Yang <pucheng.yo...@gmail.com> wrote:
>>>>>>
>>>>>> Thanks for picking this up; I think this is a very valuable addition.
>>>>>>
>>>>>> On Mon, Jul 8, 2024 at 10:48 AM Yufei Gu <flyrain...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> I'd like to share recent progress on adding actions to copy tables across different places.
>>>>>>>
>>>>>>> There is a constant need to copy tables across different places for purposes such as disaster recovery and testing. Due to the absolute file paths in Iceberg metadata, this doesn't work automatically. There are three generic solutions:
>>>>>>> 1. Rebuild the metadata: a proven approach widely used across various companies.
>>>>>>> 2. S3 access points: effective when both the source and target locations are in S3, but not applicable to other storage systems.
>>>>>>> 3. Relative paths: requires changes to the table specification.
>>>>>>>
>>>>>>> We focus on the first approach in this thread.
>>>>>>> While the code was shared two years ago here <https://github.com/apache/iceberg/pull/4705>, it was never merged. We picked it up recently. Here are the active PRs related to this action; we would really appreciate any feedback and review:
>>>>>>>
>>>>>>> - PR to add the CopyTable action: https://github.com/apache/iceberg/pull/10024
>>>>>>> - PR to add the CheckSnapshotIntegrity action: https://github.com/apache/iceberg/pull/10642
>>>>>>> - PR to add the RemoveExpiredFiles action: https://github.com/apache/iceberg/pull/10643
>>>>>>>
>>>>>>> Here is a Google doc with more details clarifying the goals and approach:
>>>>>>> https://docs.google.com/document/d/15oPj7ylgWQG8bhk_5aTjzHl7mlc-9f4OAH-oEpKavSc/edit?usp=sharing
>>>>>>>
>>>>>>> Yufei
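[Editorial note: the relative-path and catalog-minted-scheme ideas discussed in the thread above can be sketched with plain URI handling. This is purely illustrative: the class and method names (`LocationMinting`, `storedPath`, `resolve`) are hypothetical and not part of Iceberg's FileIO API or any of the linked PRs.]

```java
import java.net.URI;

// Sketch of the scheme/cluster-minting idea from the thread: metadata stores
// only the path (e.g. "/data/openhouse/db/tb_uuid"), and the catalog supplies
// the base (e.g. "s3://foo" or "hdfs://cluster") that a FileIO joins back on.
public class LocationMinting {

    /** Drop the scheme and authority, keeping only the path to store in metadata. */
    public static String storedPath(String absoluteLocation) {
        return URI.create(absoluteLocation).getPath();
    }

    /** Join a catalog-minted base with the stored path at read time. */
    public static String resolve(String mintedBase, String storedPath) {
        // Avoid a double slash if the base already ends with one.
        String base = mintedBase.endsWith("/")
                ? mintedBase.substring(0, mintedBase.length() - 1)
                : mintedBase;
        return base + storedPath;
    }
}
```

Under this scheme, a catalog in us-east minting `s3://foo` and one in us-west minting `s3://bar` would resolve the same stored path to different buckets, which is the portability property the thread is after.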