conf. For example, if the file path is,
>>> hdfs:///data/openhouse/db/tb_uuid, what is stored in Iceberg
>>> metadata is /data/openhouse/db/tb_uuid, and hdfs:// comes from
>>> Hadoop conf.
>>>
>>> Has the community considered an approach where the scheme and cluster is
>> minted by the catalog, to be used in the respective FileIO implementation
>> for the blob stores. For example, if we had a bucket foo on us-east, and
>> bucket bar on us-west, the catalog running on us-east would mint s3://foo,
> and the catalog running on us-west would mint s3://bar, and the S3FileIO
> would join that with rest of the relative path to the object. This would
> allow us to capture the absolute path relative to s3:// in the
> Iceberg metadata?
>
>
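To make the idea in the quoted question concrete, here is a minimal sketch of joining a catalog-minted prefix with the path stored in table metadata. It is not Iceberg's FileIO API: the function name `resolve` and the `CATALOG_PREFIX` mapping are hypothetical, and the bucket names and paths are the ones used in the messages above.

```python
# Hypothetical mapping: each regional catalog mints its own scheme and bucket.
CATALOG_PREFIX = {
    "us-east": "s3://foo",
    "us-west": "s3://bar",
}


def resolve(prefix: str, stored_path: str) -> str:
    """Join a catalog-minted prefix with the path stored in table metadata.

    Assumption: the metadata stores only the path portion
    (e.g. /data/openhouse/db/tb_uuid) and the prefix supplies the rest.
    """
    # Normalize the stored path to exactly one leading slash.
    path = "/" + stored_path.lstrip("/")
    if prefix.endswith("://"):
        # Scheme only, no authority: "hdfs://" + "/data/..." -> "hdfs:///data/..."
        return prefix + path
    # Scheme plus bucket or cluster: drop any trailing slash before joining.
    return prefix.rstrip("/") + path


# The same stored path resolves to a different bucket in each region.
stored = "/data/openhouse/db/tb_uuid"
east = resolve(CATALOG_PREFIX["us-east"], stored)
west = resolve(CATALOG_PREFIX["us-west"], stored)
```

In this sketch the metadata is identical in both regions and only the prefix differs, which is the property the question is asking about.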
Spark: Copy Table Action
Hi Yufei, I was wondering if we also want to support the use case of moving
tables in this proposal? For example, users might have various reasons to
change the table location, however, there is no good way to move original
data files to the new location unless we are doing data files rewrite, but
>
> For RemoveExpiredFiles, I'm admittedly a bit skeptical if it's required
> since orphan file removal should be able to clean up the files in the
> copied table. Are we able to elaborate why there's a concern with removing
> snapshots on the copied table and subsequently relying on orphan file
> r
Thanks Yufei!
+1 on having a copy table action, I think that's pretty valuable. I have
some ideas on interfaces based on previous work I've done for
region/multi-cloud replication of Iceberg tables. The absolute vs relative
path discussion is interesting, I have some questions on how relative
path
Agreed with Peter. I will bring relative paths changes up in the next community
sync. I will help drive this.
~ Anurag Mantripragada
> On Jul 8, 2024, at 10:50 PM, Péter Váry wrote:
>
> I think in most cases the copy table action doesn't require a query engine to
> read and generate the
I think in most cases the copy table action doesn't require a query engine
to read and generate the new metadata files. This means that it would be
nice to provide a pure Java implementation in the core, and it could be
extended/reused by different engines, like Spark, to execute it in a
distribut
Hi Yufei.
Thanks for the proposal. While the actions are great, they still need to do a
lot of work which can be reduced if we have the relative path changes. I still
support adding these actions as moving data was out of scope for the relative
path design and we can use these actions as helpe
Thanks for picking this up, I think this is a very valuable addition.
On Mon, Jul 8, 2024 at 10:48 AM Yufei Gu wrote:
> Hi folks,
>
> I'd like to share recent progress on adding actions to copy tables
> across different places.
>
> There is a constant need to copy tables across different place