Sadly, I have missed the meeting :(

Quick question:
Was table rename / location change discussed for tables with relative paths?

AFAIK, when a table is renamed we do not move the old data / metadata
files; we only change the root location for new data / metadata files.
If I am correct about this, then we might need to handle renames
differently for tables with relative paths.
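
To make the concern concrete, here is a minimal Python sketch. It is only
an illustration under assumptions: the Table class, the
rename_without_moving helper, and the example paths are hypothetical and
are not Iceberg APIs. It assumes file paths are stored relative to a single
table root and that a rename only swaps that root without copying files.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Table:
    # Hypothetical model: one root location plus file paths stored relative to it.
    root: str
    relative_files: List[str] = field(default_factory=list)

    def resolve(self, relative_path: str) -> str:
        # Join the root and the relative path with exactly one slash.
        return self.root.rstrip("/") + "/" + relative_path.lstrip("/")

    def absolute_files(self) -> List[str]:
        return [self.resolve(p) for p in self.relative_files]


def rename_without_moving(table: Table, new_root: str) -> Table:
    # Assumed rename behaviour: only the root changes, no files are moved.
    return Table(root=new_root, relative_files=list(table.relative_files))


old = Table("s3://bucket1/db/old_name", ["data/00000-a.parquet"])
renamed = rename_without_moving(old, "s3://bucket1/db/new_name")

# The file still physically lives under the old root, but the renamed table
# now resolves it under the new root.
print(old.absolute_files())
print(renamed.absolute_files())
```

With absolute paths the old file references would stay valid after such a
rename; with relative paths they silently start pointing under the new
root, which is why the two cases may need different handling.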

Thanks, Peter

On Fri, 13 Aug 2021, 15:12 Anjali Norwood, <anorw...@netflix.com.invalid>
wrote:

> Perfect, thank you Yufei.
>
> Regards
> Anjali
>
> On Thu, Aug 12, 2021 at 9:58 PM Yufei Gu <flyrain...@gmail.com> wrote:
>
>> Hi Anjali,
>>
>> Inline...
>> On Thu, Aug 12, 2021 at 5:31 PM Anjali Norwood
>> <anorw...@netflix.com.invalid> wrote:
>>
>>> Thanks for the summary Yufei.
>>> Sorry if this was already discussed; I missed the meeting yesterday.
>>> Is there anything in the design that would prevent multiple roots from
>>> being in different AWS regions?
>>>
>> No. DR is the major use case of relative paths, if not the only one. So,
>> it will support roots in different regions.
>>
>>> For disaster recovery when an entire AWS region is down or slow,
>>> is metastore still a point of failure or can metastore be stood up in a
>>> different region and could select a different root?
>>>
>> Normally, DR also requires a backup metastore, besides the storage (S3
>> bucket). In that case, the backup metastore will be in a different region
>> along with the table files. For example, the primary table and its
>> metastore are located in region A, while the backup table and its
>> metastore are located in region B. The primary table root points to a path
>> in region A, while the backup table root points to a path in region B.
>>
>>
>>> regards,
>>> Anjali.
>>>
>>> On Thu, Aug 12, 2021 at 11:35 AM Yufei Gu <flyrain...@gmail.com> wrote:
>>>
>>>> Here is a summary of yesterday's community sync-up.
>>>>
>>>>
>>>> Yufei gave a brief update on disaster recovery requirements and the
>>>> current progress of the relative path approach.
>>>>
>>>>
>>>> Ryan: We all agreed that relative paths are the way to go for disaster
>>>> recovery.
>>>>
>>>>
>>>> *Multiple roots for the relative path*
>>>>
>>>> Ryan proposed an idea to enable multiple roots for a table. Basically,
>>>> we can add a list of roots in table metadata, and use a selector to choose
>>>> different roots when we move the table from one place to another. The
>>>> selector reads a property to decide which root to use. The property could
>>>> come from either the catalog or the table metadata, which is yet to be
>>>> decided.
>>>>
>>>>
>>>> Here is an example I’d imagine:
>>>>
>>>>    1. Root1: hdfs://nn:8020/path/to/the/table
>>>>    2. Root2: s3://bucket1/path/to/the/table
>>>>    3. Root3: s3://bucket2/path/to/the/table
>>>>
>>>> *Relative path use case*
>>>>
>>>> We brainstormed use cases for relative paths. Please let us know if
>>>> there are any other use cases.
>>>>
>>>>    1. Disaster Recovery
>>>>    2. Jack: AWS S3 bucket alias
>>>>    3. Ryan: fall-back use case. In case root1 doesn’t work, the table
>>>>    falls back to root2, then root3. As Russell mentioned, it is
>>>>    challenging to do snapshot expiration and other table maintenance
>>>>    actions.
>>>>
>>>>
>>>> *Timeline*
>>>>
>>>> In terms of timeline, relative paths could be a feature in Spec V3,
>>>> since Spec V1 and V2 assume absolute paths in metadata.
>>>>
>>>>
>>>> *Misc*
>>>>
>>>>    1. Miao: How is the relative path compatible with the absolute path?
>>>>
>>>>    2. How do we migrate an existing table? Build a tool for that.
>>>>
>>>> Please let us know if you have any ideas, questions, or concerns.
>>>>
>>>>
>>>>
>>>> Yufei
>>>>
>>>
