Also, in S3’s case, my understanding is that instead of setting
write.object-storage.path/write.data.path, users must now make sure that the
location prefix is short to get the benefits of appending a hash to the
data paths. For example, a large prefix like
"s3://somebucket/region/timestamp/folder
Hi Russell,
I don’t see any major issues with your approach other than that it may
break some customizability of locations. If I understand correctly, today
write.object-storage.path or write.metadata.path can be outside of the table
base location. With your suggestion, are we saying that
During a sync with Yufei and Anurag I had some thought on this proposal that I
wanted to share with the wider group. As Yufei has previously noted, I'm
worried about the alternative configuration parameters like (folder-storage,
object-storage). Specifically I'm thinking about the issue of movin
Hi everyone,
Thanks for sharing your ideas and suggestions on this thread. I believe we have
consensus on supporting multiple roots for a table and storing relative paths
in metadata. We can start by adding this support in the initial phase. Yufei
and I have updated the design doc[1] with the
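
A minimal sketch of the idea summarized in the message above (several table roots, with relative paths stored in metadata), not the design from the doc. The function name, the root-selector callback, and the bucket names are all hypothetical and are not Iceberg APIs:

```python
from typing import Callable, Sequence


def resolve(
    roots: Sequence[str],
    relative_path: str,
    select_root: Callable[[Sequence[str]], str] = lambda roots: roots[0],
) -> str:
    """Join a relative file path onto whichever root the selector picks.

    Hypothetical helper: the default selector simply takes the first root.
    """
    root = select_root(roots)
    return root.rstrip("/") + "/" + relative_path.lstrip("/")


# Hypothetical roots; the metadata would store only "data/file.parquet".
roots = ["s3://bucket-a/db/tbl", "s3://bucket-b/db/tbl"]
primary = resolve(roots, "data/file.parquet")
failover = resolve(roots, "data/file.parquet", select_root=lambda r: r[1])
```

Because only the relative part is stored, pointing readers at another root is a matter of changing the selector rather than rewriting metadata.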
Yufei, answers inline:
On Mon, Aug 23, 2021 at 4:06 PM Yufei Gu wrote:
> @Ryan, how do these properties work with multiple table locations?
>
> 1. write.metadata.path
> 2. write.folder-storage.path
> 3. write.object-storage.path
>
> The current logic with single tab
Jack, I agree with just about everything you've said.
On Sun, Aug 29, 2021 at 2:13 PM Jack Ye wrote:
> trying to catch up with the conversation here, just typing some of my
> thought process:
>
> Based on my understanding, there are in general 2 use cases:
>
> 1. multiple data copies in differen
Anjali, my thoughts are inline below:
On Mon, Aug 23, 2021 at 1:14 PM Anjali Norwood wrote:
> *"The more I think about this, the more I like the solution to add
> multiple table roots to metadata, rather than removing table roots. Adding
> a way to plug in a root selector makes a lot of sense to
Are we planning to make sure that the tables with relative paths will
always contain every data/metadata file in a single root folder?
This depends on the use case. For table mirroring, there would be a
back-end service making copies of all metadata, but not for other use
cases. For example, the ren
Jack and Ryan, one valid root at a time looks straightforward and good
enough for use cases like DR and certain table migration cases.
Here are questions about multiple valid roots at a time, which are needed by
the federation and multiple-storage-tier use cases.
1. Metadata sync-up questions
trying to catch up with the conversation here, just typing some of my
thought process:
Based on my understanding, there are in general 2 use cases:
1. multiple data copies in different physical storages, which includes:
1.1 disaster recovery: if 1 storage is completely down, all access needs to
@Ryan, how do these properties work with multiple table locations?
1. write.metadata.path
2. write.folder-storage.path
3. write.object-storage.path
The current logic with single table location is to honor these properties
on top of table location. In case of multiple roots(ta
Hi Ryan, All,
*"The more I think about this, the more I like the solution to add multiple
table roots to metadata, rather than removing table roots. Adding a way to
plug in a root selector makes a lot of sense to me and it ensures that the
metadata is complete (table location is set in metadata) a
@Ryan: If I understand correctly, currently there is a possibility to
change the root location of the table, and it will not change/move the old
data/metadata files created before the change, only the new data/metadata
files will be created in the new location.
Are we planning to make sure that th
Steven, here is my understanding. It depends on whether you want to move
the data. In the DR case, we do move the data; we expect the data to be
identical from time to time, but not always. In the case of S3 aliases,
different roots actually point to the same location, there is no data move,
and dat
For the multiple table roots, do we expect or ensure that the data are
identical across the different roots? Or is this best-effort background
synchronization across the different roots?
On Sun, Aug 22, 2021 at 11:53 AM Ryan Blue wrote:
> Peter, I think that this feature would be useful when mov
Peter, I think that this feature would be useful when moving tables between
root locations or when you want to maintain multiple root locations.
Renames are orthogonal because a rename doesn't change the table location.
You may want to move the table after a rename, and this would help in that
case
Hi,
This thread is about disaster recovery and relative paths, but I wanted to
ask an orthogonal but related question.
Do we see disaster recovery as the only (or main) use case for
multi-region?
Is a data residency requirement a use case for anybody? Is it possible to
shard an Iceberg table across
Sadly, I have missed the meeting :(
Quick question:
Was table rename / location change discussed for tables with relative paths?
AFAIK, when a table rename happens we do not move old data / metadata
files, we just change the root location of the new data / metadata files.
If I am correct abou
Perfect, thank you Yufei.
Regards
Anjali
On Thu, Aug 12, 2021 at 9:58 PM Yufei Gu wrote:
> Hi Anjali,
>
> Inline...
> On Thu, Aug 12, 2021 at 5:31 PM Anjali Norwood wrote:
>
>> Thanks for the summary Yufei.
>> Sorry if this was already discussed; I missed the meeting yesterday.
>> Is there
Hi Anjali,
Inline...
On Thu, Aug 12, 2021 at 5:31 PM Anjali Norwood wrote:
> Thanks for the summary Yufei.
> Sorry if this was already discussed; I missed the meeting yesterday.
> Is there anything in the design that would prevent multiple roots from
> being in different AWS regions?
>
No. DR i
Thanks for the summary Yufei.
Sorry if this was already discussed; I missed the meeting yesterday.
Is there anything in the design that would prevent multiple roots from
being in different AWS regions? For disaster recovery in the case of an
entire AWS region down or slow, is metastore still a poi
Here is a summary of yesterday's community sync-up.
Yufei gave a brief update on disaster recovery requirements and the current
progress of the relative path approach.
Ryan: We all agreed that relative path is the way for disaster recovery.
*Multiple roots for the relative path*
Ryan proposed an