Hi,

Yes, I agree. I don't think we have to couple this to a spec version.

Regards
JB

On Wed, Dec 11, 2024 at 11:17 PM Russell Spitzer
<russell.spit...@gmail.com> wrote:
>
> I want to float this back up. I think this is a really good idea for 
> cross-engine support. I don't think we have to tie this to any specific spec 
> version, since these are just recommendations, so I think we can do this at 
> any time.
>
> On Wed, Nov 27, 2024 at 1:31 PM Szehon Ho <szehon.apa...@gmail.com> wrote:
>>
>> This makes sense to me generally. I've tried a few times to search the 
>> spec for a list of possible snapshot summary properties, and was a bit 
>> surprised not to find them there. So I think this would be a nice addition.
>>
>> I'm curious if there's any historical reason it's not been included in the 
>> spec.
>>
>> Thanks
>> Szehon
>>
>> On Wed, Nov 27, 2024 at 10:55 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>
>>> Thanks for driving this Honah!
>>>
>>> It's important to have a consistent naming scheme so that we don't need to 
>>> worry about edge cases when using multiple engines, and possibly have to 
>>> deal with migrations.
>>>
>>> Also, since users can store arbitrary key/value pairs in the summary 
>>> property, it's good to document the currently used properties to avoid 
>>> collision.
>>>
>>> I like the proposal to document all properties in a "snapshot summary" 
>>> table. This will provide a centralized place to view all possible key/value 
>>> pairs, similar to how FileIO configuration is handled in iceberg-python. 
>>> Other implementations can use this table as a reference.
>>>
>>> > This approach offers flexibility, as new fields can be added through 
>>> > documentation updates without requiring specification changes.
>>> This will save a lot of effort since specification changes require greater 
>>> scrutiny.
>>>
>>> > summary details would not be located near the Snapshot section, which 
>>> > explains the summary field.
>>> We can link the table to the Snapshot section.
>>>
>>>
>>> Would love to hear others' thoughts on this.
>>>
>>> Best,
>>> Kevin Liu
>>>
>>> On Tue, Nov 26, 2024 at 2:50 PM Honah J. <hon...@apache.org> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> I’d like to propose an addition to the table specification to document 
>>>> optional fields in the snapshot summary.
>>>>
>>>> Currently, the snapshot summary includes a required operation field and 
>>>> various optional fields. While these optional fields—such as metrics and 
>>>> partition-level summaries—are supported by Java and Python 
>>>> implementations, they are not officially documented. This creates risks of 
>>>> inconsistency as other implementations and engines adopt and interact with 
>>>> these fields.
>>>>
>>>> I propose adding a new section to the table specification to document 
>>>> these optional fields, ensuring consistent naming conventions and reducing 
>>>> ambiguity across implementations. While this is the primary proposal, it 
>>>> may also be worth discussing whether documenting these fields separately 
>>>> in Docs/Table would provide additional flexibility for future updates.
>>>>
>>>> I’d love to hear your thoughts, suggestions, or concerns about this 
>>>> proposal.
>>>>
>>>> Looking forward to the discussion!
>>>>
>>>> Links
>>>>
>>>> GitHub tracking issue: https://github.com/apache/iceberg/issues/11659
>>>> Proposal: 
>>>> https://docs.google.com/document/d/1Gt1ZOXVXK60IGdlmt4QlyRzaZ1iCVyYUBfMJCsiz14I/edit?usp=sharing
>>>> PR: https://github.com/apache/iceberg/pull/11660
>>>>
>>>>
>>>> Best regards,
>>>> Honah
