Re: Sequence number for ContentFiles

Steven Wu Wed, 26 Apr 2023 13:59:55 -0700

> Will clock skew cause any issues w.r.t. relying on the snapshot commit
> time? I think we allow a mismatch of up to a minute in TableMetadata.


This is probably not a problem. Typically, the maximum allowed misalignment
is much longer than 1 minute.

> We also planned to expose the file sequence number (different from the data
> sequence number). I believe you could look up the snapshot using that info.

That would work too.
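
As a rough illustration of that lookup (a sketch only: the Snapshot class and
the commit_timestamp_for helper below are hypothetical stand-ins, not the
Iceberg API), and assuming the file sequence number matches the sequence
number of the snapshot that added the file, mapping it back to a commit
timestamp could look roughly like this:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical stand-in for snapshot metadata; not the Iceberg API.
    snapshot_id: int
    sequence_number: int
    timestamp_ms: int


def commit_timestamp_for(
    file_sequence_number: int, snapshots: Iterable[Snapshot]
) -> Optional[int]:
    """Return the commit time (ms) of the snapshot that added the file.

    Returns None when no retained snapshot has that sequence number,
    for example because the snapshot has been expired.
    """
    by_sequence = {s.sequence_number: s for s in snapshots}
    snapshot = by_sequence.get(file_sequence_number)
    return snapshot.timestamp_ms if snapshot is not None else None


# Made-up sample metadata for illustration only.
snapshots = [
    Snapshot(snapshot_id=101, sequence_number=1, timestamp_ms=1682500000000),
    Snapshot(snapshot_id=102, sequence_number=2, timestamp_ms=1682500060000),
]
```

A caller would need a fallback for the None case, since the snapshot that
added a file may no longer be in the table metadata.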


On Wed, Apr 26, 2023 at 1:10 PM Jack Ye <yezhao...@gmail.com> wrote:

> +1 for using file sequence number. This work has been discussed for a long
> time but never got picked up. It would be great if someone could drive it to
> completion.
>
> -Jack
>
> On Wed, Apr 26, 2023 at 12:03 PM Anton Okolnychyi
> <aokolnyc...@apple.com.invalid> wrote:
>
>> Will clock skew cause any issues w.r.t. relying on the snapshot commit
>> time? I think we allow a mismatch of up to a minute in TableMetadata.
>>
>> We also planned to expose the file sequence number (different from the data
>> sequence number). I believe you could look up the snapshot using that info.
>>
>> https://iceberg.apache.org/spec/#manifest-entry-fields
>>
>> - Anton
>>
>> On Apr 26, 2023, at 11:52 AM, Steven Wu <stevenz...@gmail.com> wrote:
>>
>> Piggybacking on this thread since we are discussing exposing more metadata
>> in ContentFile or FileScanTask: Flink source watermark alignment can
>> potentially leverage the snapshot timestamp (when data files are
>> committed/appended to the table). Is it reasonable to expose some snapshot
>> metadata in the FileScanTask?
>>
>> This can help the Flink job to ensure two (or more) sources are
>> proceeding at similar paces. Some use cases may require column stats
>> (min-max values) for watermark alignment. Others can leverage snapshot
>> timestamps for alignment.
>>
>> On Wed, Apr 26, 2023 at 11:15 AM Anton Okolnychyi <
>> aokolnyc...@apple.com.invalid> wrote:
>>
>>> My initial thinking is that exposing sequence numbers on ContentFile is
>>> preferable (we would get it for free in scan tasks). That said, I’ll need
>>> to see how complicated the implementation would be. Exposing it on
>>> ContentScanTask is a viable alternative. However, we already have a
>>> precedent for assigning specId in InheritableMetadata.
>>>
>>> - Anton
>>>
>>> On Apr 26, 2023, at 10:41 AM, Ryan Blue <b...@tabular.io> wrote:
>>>
>>> Exposing sequence number makes sense for use cases like this. I also
>>> like the idea of exposing it through FileScanTask. That might be easier
>>> than trying to add it to ContentFile.
>>>
>>> Anton, what do you think about adding it to FileScanTask?
>>>
>>> On Wed, Apr 26, 2023 at 7:50 AM Anton Okolnychyi <
>>> aokolnyc...@apple.com.invalid> wrote:
>>>
>>>> It is actually my bad for not following up on that after #5913 and #6002.
>>>> I’ll take a look at #5760 referenced below by the end of this week.
>>>>
>>>> The plan was to expose sequence numbers on ContentFile. It is needed in
>>>> a number of use cases.
>>>>
>>>> - Anton
>>>>
>>>> On Apr 26, 2023, at 4:55 AM, Gabor Kaszab <gaborkas...@apache.org>
>>>> wrote:
>>>>
>>>> Hey Iceberg Community,
>>>>
>>>> I know there has been a discussion previously about exposing the
>>>> sequence number on a ContentFile level, but if I'm not mistaken that
>>>> conversation didn't end with a consensus. I found some relevant PRs that
>>>> have been open for a while:
>>>> https://github.com/apache/iceberg/pull/5760
>>>> https://github.com/apache/iceberg/pull/4769 (merged into the above PR)
>>>>
>>>> The reason I bring this topic up is that we recently started investigating
>>>> how to add read support for equality deletes to Impala.
>>>> Implementation-wise, we could apparently save a lot of hassle if sequence
>>>> numbers were exposed at the file level through the API, preferably somewhere
>>>> around the planFiles() call. We could then have a virtual 'SEQUENCE_NUMBER'
>>>> column when scanning the data and delete files (with separate scanners) and
>>>> could easily filter the rows in the JOIN node that joins the rows from the
>>>> data files with the ones from the delete files. (I won't go into more depth
>>>> at the moment.)
>>>>
>>>> With this mail I'd like to revive this conversation with the hope of
>>>> eventually coming to a solution that satisfies all participants. I've been
>>>> thinking of implementation choices we have to somehow provide sequence
>>>> numbers for the files:
>>>> - Extend ContentFile with a sequence number: I checked the above PRs
>>>> and IIUC the issue with this approach is that ContentFile is meant to be
>>>> immutable, and by the time the objects are created we don't have sequence
>>>> numbers to populate them with.
>>>> - Extend FileScanTask with the file-level sequence numbers, so that after
>>>> calling planFiles() we could retrieve these numbers via a new API call on
>>>> the FileScanTask.
>>>>
>>>> There might be many other ways to implement this. I'd love to hear what
>>>> people think, and it would be great to find a way that would help us out
>>>> on Impala.
>>>>
>>>> Cheers,
>>>> Gabor
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>>
>>>
>>
