Hi, yes I agree, I don't think we have to couple this to a spec version.
Regards
JB

On Wed, Dec 11, 2024 at 11:17 PM Russell Spitzer <russell.spit...@gmail.com> wrote:
>
> I want to float this back up, I think this is a really good idea for cross
> engine support. I don't think we have to tie this to any specific Spec
> version since they are just recommendations so I think we can do this at any
> time
>
> On Wed, Nov 27, 2024 at 1:31 PM Szehon Ho <szehon.apa...@gmail.com> wrote:
>>
>> This makes sense to me generally, I've tried a few times to search in the
>> spec to find a list of possible snapshot summary properties, and was a bit
>> surprised to not find them there. So I think this would be a nice addition.
>>
>> I'm curious if there's any historical reason it's not been included in the
>> spec.
>>
>> Thanks
>> Szehon
>>
>> On Wed, Nov 27, 2024 at 10:55 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>
>>> Thanks for driving this Honah!
>>>
>>> It's important to have a consistent naming scheme so that we don't need to
>>> worry about edge cases when using multiple engines, and possibly have to
>>> deal with migrations.
>>>
>>> Also, since users can store arbitrary key/value pairs in the summary
>>> property, it's good to document the currently used properties to avoid
>>> collision.
>>>
>>> I like the proposal to document all properties in a "snapshot summary"
>>> table, this will ensure a centralized place to view all possible key/value
>>> pairs, similar to how FileIO configuration is handled in iceberg-python.
>>> Other implementations can use this table as a reference.
>>>
>>> > This approach offers flexibility, as new fields can be added through
>>> > documentation updates without requiring specification changes.
>>> This will save a lot of effort since specification changes require greater
>>> scrutiny.
>>>
>>> > summary details would not be located near the Snapshot section, which
>>> > explains the summary field.
>>> We can link the table to the Snapshot section.
>>>
>>>
>>> Would love to hear others' thoughts on this.
>>>
>>> Best,
>>> Kevin Liu
>>>
>>> On Tue, Nov 26, 2024 at 2:50 PM Honah J. <hon...@apache.org> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> I’d like to propose an addition to the table specification to document
>>>> optional fields in the snapshot summary.
>>>>
>>>> Currently, the snapshot summary includes a required operation field and
>>>> various optional fields. While these optional fields, such as metrics and
>>>> partition-level summaries, are supported by Java and Python
>>>> implementations, they are not officially documented. This creates risks of
>>>> inconsistency as other implementations and engines adopt and interact with
>>>> these fields.
>>>>
>>>> I propose adding a new section to the table specification to document
>>>> these optional fields, ensuring consistent naming conventions and reducing
>>>> ambiguity across implementations. While this is the primary proposal, it
>>>> may also be worth discussing whether documenting these fields separately
>>>> in Docs/Table would provide additional flexibility for future updates.
>>>>
>>>> I’d love to hear your thoughts, suggestions, or concerns about this
>>>> proposal.
>>>>
>>>> Looking forward to the discussion!
>>>>
>>>> Links
>>>>
>>>> GitHub tracking issue: https://github.com/apache/iceberg/issues/11659
>>>> Proposal:
>>>> https://docs.google.com/document/d/1Gt1ZOXVXK60IGdlmt4QlyRzaZ1iCVyYUBfMJCsiz14I/edit?usp=sharing
>>>> PR: https://github.com/apache/iceberg/pull/11660
>>>>
>>>>
>>>> Best regards,
>>>> Honah