Hello,

I am interested in contributing to this effort.

Thanks,
Namratha

On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <2am...@gmail.com> wrote:

> Thanks for kicking this thread off Ryan, I'm interested in helping out
> here! I've been working on a proposal in this area and it would be great to
> collaborate with different folks and exchange ideas here, since I think a
> lot of people are interested in solving this problem.
>
> Thanks,
> Amogh Jahagirdar
>
> On Thu, May 29, 2025 at 2:25 PM Ryan Blue <rdb...@gmail.com> wrote:
>
>> Hi everyone,
>>
>> Like Russell’s recent note, I’m starting a thread to connect those of us
>> who are interested in the idea of changing Iceberg’s metadata in v4 so
>> that in most cases committing a change only requires writing one additional
>> metadata file.
>>
>> *Idea: One-file commits*
>>
>> The current Iceberg metadata structure requires writing at least one
>> manifest and a new manifest list to produce a new snapshot. The goal of
>> this work is to allow more flexibility by allowing the manifest list layer
>> to store data and delete files. As a result, only one file write would be
>> needed before committing the new snapshot. In addition, this work will also
>> try to explore:
>>
>>    - Avoiding small manifests that must be read in parallel and later
>>    compacted (metadata maintenance changes)
>>    - Extending metadata skipping to use aggregated column ranges that are
>>    compatible with geospatial data (manifest metadata)
>>    - Using soft deletes to avoid rewriting existing manifests (metadata
>>    DVs)
>>
>> If you’re interested in these problems, please reply!
>>
>> Ryan
>>
>
