Hey JB,

Thanks for raising this. This would be another way, next to the format
version, of indicating what's supported. At first glance, I'm reluctant to
add this, for two reasons:

   1. It adds complexity, both technically and for downstream users, who
   might be confused when, for example, an engine supports Iceberg V3 but
   not the variant type.
   2. As you indicated, this is similar to what Delta has. One issue they
   are experiencing is that users expect to be able to disable features as
   well. For example, when row-lineage is enabled and you want to read the
   table with an engine that does not support row-lineage, there is an
   expectation that you can disable row-lineage. This is different from
   what we support today with the format-version, which only allows
   upgrades (not downgrades). Supporting this would also add a lot of
   complexity to the codebase.
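
To make point 1 concrete, here is a minimal sketch (Python, purely
illustrative) of what an engine-side check could look like, assuming the
"features" list from JB's example. The field is only proposed and is not part
of the current spec; ENGINE_SUPPORTED_FEATURES and unsupported_features are
hypothetical names, not Iceberg API.

```python
import json

# Hypothetical: the features this engine can handle (names from JB's example).
ENGINE_SUPPORTED_FEATURES = {"row_lineage", "default_value"}


def unsupported_features(metadata_json, supported=ENGINE_SUPPORTED_FEATURES):
    """Return the sorted table features that the engine cannot handle."""
    metadata = json.loads(metadata_json)
    # "features" is the proposed field; a table without it declares nothing.
    declared = set(metadata.get("features", []))
    return sorted(declared - set(supported))


# A V3 table that declares a feature this engine lacks.
metadata_json = (
    '{"format-version": 3, '
    '"features": ["row_lineage", "variant", "default_value"]}'
)
missing = unsupported_features(metadata_json)
if missing:
    print("Cannot read table, unsupported features:", ", ".join(missing))
```

The engine here supports V3 as a format version but still has to reject the
table, which is the situation I expect to confuse users.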

Curious to learn what others think.

Kind regards,
Fokko

On Mon, Apr 14, 2025 at 19:56, Brian Hulette <bhule...@apache.org> wrote:

> As a consumer of Iceberg metadata I think something like this might be
> helpful. We used approach #2 for adding partial Iceberg V2 support to
> BigQuery external tables, but this was more straightforward as we just had
> to detect the existence of delete files. With V3 we will have to be very
> confident that we can detect all of the unsupported features before we add
> support for any one of them.
>
> That being said I don't think that will be *that* difficult. Would it be
> very hard for metadata producers to populate this?
>
> On Mon, Apr 14, 2025 at 8:48 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi folks,
>>
>> I started to work on multi args transforms, and you probably saw
>> Fokko's proposal about the way to deal with source-id/source-ids to
>> ensure backward compatibility.
>>
>> While working on the changes on iceberg-core/iceberg-java, I'm
>> wondering if we should not introduce Iceberg Features on metadata.
>> Let me explain what I have in mind.
>> In Table Spec V3, we have new functionalities: new types (timestamp
>> nz, variant, ...), default values, row lineage, etc.
>> For readers/writers, there are two ways to know if functionalities are
>> available or not:
>> 1. Reading the table version spec (v2, v3)
>> 2. Reading if metadata contains some fields (for instance, regarding
>> multi args transforms, we have source-id / source-ids).
>> It means that we already have to "parse" the metadata and likely
>> implement "complex" logic.
>>
>> In addition to the table spec version, I wonder if we should not introduce
>> Iceberg Features in metadata, clearly listing/describing the supported
>> features, decoupled from table spec version:
>>
>> "features": ["row_lineage","variant","default_value"]
>>
>> Reader/writer can just check the features to know how to behave. We
>> would be more flexible in supporting features, decoupling them from the
>> table spec version.
>>
>> Afaik, Delta has something similar.
>>
>> Long term, it could be extended to Data File format API proposed by
>> Peter, e.g. some features related to data files (that would be a
>> different layer, but similar idea).
>>
>> Thoughts?
>>
>> Regards
>> JB
>>
>
