Got it, thank you, Ryan. The PR was just merged, btw.

On Thu, Dec 7, 2023 at 3:22 PM Ryan Blue <b...@tabular.io> wrote:

> It's not so much a concern as the reason I think the rewrite procedure
> does what it does currently. What I mean is that I think the initial
> implementation didn't rewrite files across different partition specs
> because the manifest files themselves would have a different schema for the
> partition tuple. That makes passing the data around a bit harder if you
> want to use data frames.
>
> On Thu, Dec 7, 2023 at 1:12 PM Pucheng Yang <py...@pinterest.com.invalid>
> wrote:
>
>> Ryan, got it. Can you say more about your first concern? If we
>> rewrite manifests of one spec at a time, how would that lead to an
>> inconsistent schema?
>>
>> As of now, we have a PR ready to be merged:
>> https://github.com/apache/iceberg/pull/9242 Please comment if you
>> think it should be put on hold. Thanks.
>>
>> On Thu, Dec 7, 2023 at 1:00 PM Ryan Blue <b...@tabular.io> wrote:
>>
>>> It sounds like a feature that we could add. I think there are two
>>> concerns. First, we don't want to process the files for multiple specs at
>>> the same time because we probably want a consistent schema. Second, there's
>>> probably some confusion over how to select a spec, since we don't want users
>>> to have to work with IDs from the format directly.
>>>
>>> On Wed, Dec 6, 2023 at 10:28 PM Pucheng Yang <py...@pinterest.com.invalid>
>>> wrote:
>>>
>>>> Based on what I understand, it seems there is no particular reason, and
>>>> seems like a feature to be added on.
>>>>
>>>> On Wed, Dec 6, 2023 at 8:06 PM Pucheng Yang <py...@pinterest.com>
>>>> wrote:
>>>>
>>>>> Hi community,
>>>>>
>>>>> May I know why manifest rewrite only touches files that have the
>>>>> latest spec ID? What would you suggest if we want to rewrite manifest
>>>>> files that belong to a non-current spec ID?
>>>>>
>>>>> Manifest selection logic:
>>>>> https://github.com/apache/iceberg/blob/6a9d3c77977baff4295ee2dde0150d73c8c46af1/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java#L295
>>>>>
>>>>> Best,
>>>>> Pucheng
>>>>>
>>>>
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>
>
> --
> Ryan Blue
> Tabular
>
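For readers following the thread, the "one spec at a time" idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Manifest` and `group_by_spec` are hypothetical stand-ins written for this sketch, not the Iceberg API and not the code from the merged PR. The point is that bucketing manifests by partition spec ID gives groups in which every manifest shares the same partition tuple schema, so each group could be processed on its own.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    # Hypothetical stand-in for a manifest entry; not the Iceberg API.
    path: str
    spec_id: int


def group_by_spec(manifests):
    """Bucket manifests by partition spec ID.

    Every manifest in a bucket was written with the same partition spec,
    so all of them share one partition tuple schema.
    """
    groups = defaultdict(list)
    for m in manifests:
        groups[m.spec_id].append(m)
    return dict(groups)


# Made-up example data: two manifests for spec 0, one for spec 1.
manifests = [
    Manifest("m0.avro", 0),
    Manifest("m1.avro", 1),
    Manifest("m2.avro", 0),
]

for spec_id, group in sorted(group_by_spec(manifests).items()):
    # Each group could be rewritten independently, one spec at a time.
    print(spec_id, [m.path for m in group])
```

Grouping first and then handling each bucket separately avoids mixing partition tuples with different schemas in a single data frame, which is the difficulty Ryan describes.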
