Thank you all for the feedback!

To clarify, the *dropTable* implementation in the Iceberg library does do
its job of cleaning up all data + delete files correctly under normal
circumstances; it is mostly the past metadata.json files and the
directories that are not cleaned up. Also, after looking at the code, I
believe my previous statement that the rollback/compaction case is not
handled was incorrect: those cases should be handled, since we continue
to record them as snapshots in metadata.json, which are cleaned up by the
current drop table logic (for rollback in particular, I believe today we
only update the value of the *current-snapshot-id* field and do not
tamper with any snapshot).
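
For reference, a minimal sketch (mine, not from the thread) of a rollback
through the Iceberg Java API; it only moves the current-snapshot-id
pointer, while the newer snapshots stay in the metadata.json snapshot
list, so the current drop-table purge can still reach their files:

    import org.apache.iceberg.Table;

    public class RollbackSketch {
      public static void rollbackKeepingSnapshots(Table table, long snapshotId) {
        table.manageSnapshots()
            .rollbackTo(snapshotId) // updates current-snapshot-id only
            .commit();              // newer snapshots remain in the snapshot list
      }
    }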

Below are the four scenarios that I believe the drop table logic cannot
handle today:
a. past metadata.json files
b. directories (which may violate GDPR requirements if the directory path
contains a partition value)
c. partially written files, or files not detected/deleted during snapshot
expiration, which is the same problem we are solving with the orphan
files removal procedure
d. manual intervention, e.g. if people update the catalog to point to a
past metadata.json file (essentially an unofficial/incorrect way of
rolling back)

Among these, (a) can be solved by item 1, and (b), (c), (d) can be solved
by item 2 if the table owns those locations.

For item 1, I couldn't think of any scenario where we would explicitly
want to keep the past metadata.json files around, so I should be able to
create a PR soon.
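
A rough sketch of the item 1 fix (assumptions mine): after the existing
purge removes everything reachable from the last metadata, also walk the
metadata log that the last metadata.json keeps and delete the past
metadata files it records. Note the metadata log itself can be truncated
(write.metadata.previous-versions-max), so this alone may not catch every
past file:

    import org.apache.iceberg.TableMetadata;
    import org.apache.iceberg.io.FileIO;

    public class PurgePastMetadata {
      public static void deletePreviousMetadataFiles(FileIO io, TableMetadata lastMetadata) {
        for (TableMetadata.MetadataLogEntry entry : lastMetadata.previousFiles()) {
          io.deleteFile(entry.file()); // path of a past metadata.json
        }
      }
    }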
For item 2, I think it's a reasonable assumption that any table created
without overriding the metadata/data location owns its table location, so
we should be able to remove the directory altogether. That said, since
the chances of hitting (c) and (d) above are relatively under control,
and I'm not sure (b) is really a huge concern, I would prefer the current
drop table behavior to remain the default; users could set a new property
if they want the entire directory removed when dropping a table, so the
library behavior doesn't change for everyone. We could then introduce
other table properties to indicate, when users supply a location
override, whether that override should be considered owned by the table.
I should be able to have a PR on this in the coming weeks.
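
Purely illustrative of that opt-in idea; the property name below is
hypothetical, not an existing Iceberg property:

    import org.apache.iceberg.Table;

    public class OwnershipFlag {
      public static void markLocationOwned(Table table) {
        table.updateProperties()
            .set("location.owned-by-table", "true") // hypothetical opt-in flag
            .commit(); // drop with purge=true could then delete the whole directory
      }
    }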

I have seen people configure their data to land in a single directory,
mixed with other data, to address S3 throttling; after all, it was a
recommendation from S3. Since where data files are written is currently
controlled by a table property in the Iceberg library (the
LocationProviders
<https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/LocationProviders.java>
class), I think SQL users are able to do this.
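
A small sketch of that knob (the property names are as I recall them from
TableProperties; worth double-checking against the version in use):
enabling the ObjectStoreLocationProvider makes data files land under
hashed prefixes in a shared path, which is the S3-throttling pattern
described above:

    import org.apache.iceberg.Table;

    public class ObjectStoreLayout {
      public static void enableObjectStoreLayout(Table table, String sharedDataPath) {
        table.updateProperties()
            .set("write.object-storage.enabled", "true")      // pick ObjectStoreLocationProvider
            .set("write.object-storage.path", sharedDataPath) // shared prefix across tables
            .commit();
      }
    }

SQL users should be able to set the same properties through their engine,
e.g. ALTER TABLE ... SET TBLPROPERTIES in Spark SQL.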

Thank you!
Yan


On Tue, Nov 23, 2021 at 12:58 PM Yufei Gu <flyrain...@gmail.com> wrote:

> Piotr made a good point. The major use case for customizing data file
> paths is s3 path randomization due to the throttling issue. It looks
> like an exceptional use case, and I’d also prefer to think of it that
> way: what if the s3 throttling issue is resolved, or mitigated to the
> point that users can ignore it, in the future? However, it is an
> exception we cannot ignore. There are so many users’ tables in s3. We
> cannot let a table own the location in that case. We can still use a
> reachable set to remove as many files as possible. I'm not aware of
> other use cases with multiple locations. I'd love to hear more if
> anyone can chime in.
>
>
> On the flip side, I agree that tables with an implicit location should
> own the location, which makes removing orphan files and purging tables
> easier.
>
> Yufei
>
>
> On Tue, Nov 23, 2021 at 3:17 AM Piotr Findeisen <pi...@starburstdata.com>
> wrote:
>
>> Hi,
>>
>> When you come from a storage perspective, the current design of 'not
>> owning' the location makes sense.
>>
>> However, if you come from a SQL perspective, all this is an
>> impractical limitation. Analysts and other SQL users want to be able
>> to delete their data and must have confidence that all the data is
>> removed. Failing to do so may expose them to GDPR-related liabilities.
>>
>> Therefore we should work towards (2). For starters, we should be able
>> to assume that tables with an implicit location do own their location.
>> Then we should have an option to validate location ownership for
>> tables with an explicit location.
>>
>> I don't know yet how tables with multiple locations fit into this
>> picture, or tables with manifests in one place and data files in some
>> other places. SQL users wouldn't create such tables, though.
>>
>> BR
>> PF
>>
>> On Tue, Nov 23, 2021 at 4:32 AM Jack Ye <yezhao...@gmail.com> wrote:
>>
>>> +1 for item 1; the fact that we do not remove all data referenced by
>>> all metadata files seems like a bug to me that should be fixed. The
>>> table's pointer has already been removed from the catalog with no way
>>> to roll back, so there is no reason to keep those files around. I
>>> don't know if there is any historical context for only removing data
>>> in the latest metadata; maybe someone with context could provide more
>>> details.
>>>
>>> For item 2, I think this aligns with the discussion conclusions in
>>> the linked issues. At least we can have some flag saying the table
>>> location, data location and metadata location do not contain other
>>> tables' data (a.k.a. the table owns those 3 prefixes), and then we
>>> can safely do a recursive deletion. This also seems to align with the
>>> intention of having a LIST API in FileIO, discussed in
>>> https://github.com/apache/iceberg/issues/3212 for remove_orphan_files.
>>>
>>> -Jack
>>>
>>>
>>> On Mon, Nov 22, 2021 at 6:42 PM Yan Yan <yyany...@gmail.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> Does anyone know why, across catalog implementations, when we drop
>>>> tables with *purge=true*, we only drop the last metadata file and
>>>> the files referred to by it, but none of the previous metadata? e.g.
>>>>
>>>> *create iceberg table1*; <--- metadata.json-1
>>>> *insert into table1* ...; <--- metadata.json-2
>>>>
>>>> when I do *drop table1* after these two commands, `metadata.json-1`
>>>> will not be deleted. This also means that if we rollback/compact a
>>>> table and then drop it, data files referred to by some of the
>>>> previous metadata files will also not be deleted.
>>>>
>>>> I know the community has talked about table location ownership for
>>>> file cleanup after dropping a table (e.g.
>>>> https://github.com/apache/iceberg/issues/1764
>>>> https://github.com/trinodb/trino/issues/5616 ), but I'm not sure
>>>> that could completely solve the problem, since we can customize the
>>>> metadata/data location, and I think we should still delete the past
>>>> metadata.json files even if the table doesn't own any location.
>>>>
>>>> I was thinking about the following items:
>>>> 1. make a change to also delete past metadata.json files when the
>>>> table is dropped with *purge=true* (small change; doesn't tackle
>>>> rollback/compaction data files)
>>>> 2. add configuration regarding the table's location ownership, and
>>>> delete the underlying files on drop table if the table owns the
>>>> location (more complicated)
>>>>
>>>> I think 1 should be relatively safe to do even though it's a
>>>> behavior change, but I want to run it through the community first.
>>>>
>>>> Thanks!
>>>> Yan
>>>>
>>>
