We have been hitting all the metadata problems you mentioned, Ryan. I’m 
on board to help however I can to improve this area.


~ Anurag Mantripragada

> On Jun 3, 2025, at 2:22 AM, Huang-Hsiang Cheng <hua...@apple.com.INVALID> 
> wrote:
> 
> I am interested in this idea and looking forward to collaboration.
> 
> Thanks,
> Huang-Hsiang
> 
>> On Jun 2, 2025, at 10:14 AM, namratha mk <nmk...@gmail.com> wrote:
>> 
>> Hello,
>> 
>> I am interested in contributing to this effort. 
>> 
>> Thanks,
>> Namratha
>> 
>> On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <2am...@gmail.com 
>> <mailto:2am...@gmail.com>> wrote:
>>> Thanks for kicking this thread off Ryan, I'm interested in helping out 
>>> here! I've been working on a proposal in this area and it would be great to 
>>> collaborate with different folks and exchange ideas here, since I think a 
>>> lot of people are interested in solving this problem.
>>> 
>>> Thanks,
>>> Amogh Jahagirdar
>>> 
>>> On Thu, May 29, 2025 at 2:25 PM Ryan Blue <rdb...@gmail.com 
>>> <mailto:rdb...@gmail.com>> wrote:
>>>> Hi everyone,
>>>> 
>>>> Like Russell’s recent note, I’m starting a thread to connect those of us 
>>>> who are interested in the idea of changing Iceberg’s metadata in v4 so 
>>>> that, in most cases, committing a change requires writing only one 
>>>> additional metadata file.
>>>> 
>>>> Idea: One-file commits
>>>> 
>>>> The current Iceberg metadata structure requires writing at least one 
>>>> manifest and a new manifest list to produce a new snapshot. The goal of 
>>>> this work is to add flexibility by letting the manifest list layer 
>>>> store data files and delete files. As a result, only one file write would 
>>>> be needed before committing the new snapshot. In addition, this work will 
>>>> also explore:
>>>> 
>>>> - Avoiding small manifests that must be read in parallel and later 
>>>>   compacted (metadata maintenance changes)
>>>> - Extending metadata skipping to use aggregated column ranges that are 
>>>>   compatible with geospatial data (manifest metadata)
>>>> - Using soft deletes to avoid rewriting existing manifests (metadata DVs)
>>>> 
>>>> If you’re interested in these problems, please reply!
>>>> 
>>>> Ryan
>>>> 
> 
