Hi, Anton:
I've gone through the doc, and the Puffin Position Delete Files section
shares some similarity with the deletion vector approach. Has the discussion
reached any conclusion?
On Thu, Oct 12, 2023 at 12:11 AM Anton Okolnychyi wrote:
> I tried to summarize notes from our previous discu
That's a fair point and agree on that.
I think having some performance numbers and a comparison would
be helpful (depending on the use cases).
Regards
JB
On Fri, Oct 13, 2023 at 7:27 AM Renjie Liu wrote:
>>
>> I'd say we should equally consider all incoming ideas at the moment and see
>>
>
> I'd say we should equally consider all incoming ideas at the moment and
> see what would work best. We haven't agreed on anything yet, so I'd
> encourage everyone to participate in the discussion.
Can't agree more. I think we share the same goal of improving the performance of
Iceberg, and welcome
I just realized I did not open comments, fixed now. Any feedback or alternative
ideas are more than welcome!
I'd say we should equally consider all incoming ideas at the moment and see
what would work best. We haven't agreed on anything yet, so I'd encourage
everyone to participate in the discu
Hi, Anton:
I've gone through the doc, and we are trying to solve the same problems of
position deletes, but with different approaches. It's quite interesting.
On Thu, Oct 12, 2023 at 12:11 AM Anton Okolnychyi wrote:
> I tried to summarize notes from our previous discussions here:
>
> https://doc
I tried to summarize notes from our previous discussions here:
https://docs.google.com/document/d/1M4L6o-qnGRwGhbhkW8BnravoTwvCrJV8VvzVQDRJO5I/
I am going to iterate on the doc later today.
On 2023/10/11 07:06:07 Renjie Liu wrote:
> Hi, Russell:
>
>
> > The main things I’m still interested in are
Hi, Russell:
> The main things I’m still interested in are alternative approaches. I think
> that some of the work that Anton is working on has shown some different
> bottlenecks in applying delete files that I’m not sure are addressed by
> this proposal.
I'm also interested. Could you share some
The main things I’m still interested in are alternative approaches. I think that some of the work that Anton is working on has shown some different bottlenecks in applying delete files that I’m not sure are addressed by this proposal. For example, this proposal suggests doing a 1 to 1 (or 1 rowgroup t
Hi, Ryan:
Thanks for your reply.
> 1. What is the exact file format for these on disk that you're proposing?
> Even if you're saying that it is what is produced by roaring bitmap, we
> need more information. Is that a portable format? Do you wrap it at all in
> the file to carry extra metadata? For
Thanks, Renjie. I went through and made some comments about what is still
not clear. Here's a summary:
1. What is the exact file format for these on disk that you're proposing?
Even if you're saying that it is what is produced by roaring bitmap, we
need more information. Is that a portable format?
Hi:
I have addressed most of the comments in the document. What is
the next step? Should we vote on whether to reject this spec, or should we
go on with it?
On Sat, Sep 30, 2023 at 11:20 PM Renjie Liu wrote:
> Hi:
> Sorry for the late reply, I have been busy recently. I've updated t
Hi:
Sorry for the late reply, I have been busy recently. I've updated the
design with more details about your questions, and here is a summary:
> 1. Would there be only one delete vector per data file?
Yes. It's possible that we have multiple deletion vectors per very large
data file to further re
Hi Ryan,
Thanks for the feedback. Unfortunately, I was not able to join the
Iceberg community sync meeting yesterday, but I promise I will join the
next ones.
I think the proposal is very interesting and also the
discussion/comments in the document. I agree that some points should
be discussed furthe
Renjie, thanks for the proposal.
We talked about this today in the Iceberg community sync and the general
feedback was that we're excited to work on this, but the proposal left a few
areas unclear. There are a few decisions about how to manage the delete
vectors that need to be added to the design. F
Hi, all:
I have a proposal to introduce a deletion vector file to reduce write
amplification in Iceberg tables:
https://docs.google.com/document/d/1FtPI0TUzMrPAFfWX_CA9NL6m6O1uNSxlpDsR-7xpPL0/edit?usp=sharing
Comments are welcome, and I look forward to hearing your advice.