Hi Ryan,

Thanks for the feedback. Unfortunately, I was not able to join the
Iceberg community sync meeting yesterday. I promise I will join the
next ones.

I think the proposal is very interesting, as are the discussion and
comments in the document. I agree that some points should be
discussed further. I propose updating the document with your
points/questions.

Thanks!

Regards
JB

On Thu, Sep 21, 2023 at 2:02 AM Ryan Blue <b...@tabular.io> wrote:
>
> Renjie, thanks for the proposal.
>
> We talked about this today in the Iceberg community sync and the general 
> feedback was that we're excited to work on this, but the proposal left a few 
> areas unclear. There are a few decisions about how to manage the delete 
> vectors that need to be added to the design. For example:
> 1. Would there be only one delete vector per data file?
> 2. Would this require merge of existing vectors and new deletes at write time?
> 3. How would the data file for a vector be identified?
> 4. If multiple vectors are allowed, what is the plan for keeping the number 
> of delete vectors small?
> 5. Would we allow writing multiple delete vectors into the same file?
> 6. How would we track which files are affected by a combined file of delete 
> vectors?
> 7. What are the details of the proposed file format?
>
> In short, we just want to better understand how all this would work.
>
> Thanks!
>
> Ryan
>
>
> On Mon, Sep 18, 2023 at 8:22 PM Renjie Liu <liurenjie2...@gmail.com> wrote:
>>
>> Hi, all:
>>
>>
>>
>> I have a proposal to introduce a deletion vector file to reduce the write 
>> amplification of Iceberg tables:
>>
>> https://docs.google.com/document/d/1FtPI0TUzMrPAFfWX_CA9NL6m6O1uNSxlpDsR-7xpPL0/edit?usp=sharing
>>
>>
>>
>> Comments are welcome, and I look forward to hearing your advice.
>
>
>
> --
> Ryan Blue
> Tabular
