Thanks Ryan, I agree that the spec should stay separate from
language-specific APIs, and implementation notes are the right place for
this property today.

I'd ask the community to stay open to a future path here, though, because
implementation notes carry limited weight with vendors: many vendors don't
support writing custom table properties, or don't honor them. Without
spec-level backing, there's no basis to push for consistency.

I have seen this with Snowflake's OPTIMIZED storage serialization policy,
which writes Parquet V2 DELTA encodings. Spark clusters below 3.3 then fail
hard with "UnsupportedOperationException: Unsupported encoding:
DELTA_BYTE_ARRAY", leaving the data inaccessible. These failures are
especially messy because different teams, even within the same org, work
across engines and across different versions of the same engine.
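
To make the gap concrete, here is a minimal sketch of the kind of
out-of-band check a team has to run today when nothing enforces the
property. It is illustrative only: the function names, the "v1"/"v2" value
spelling, and the version table are my assumptions, and only the Spark 3.3
cut-off comes from the failure described above.

```python
# Hypothetical pre-flight check; not part of Iceberg or any engine.
from typing import Dict, List, Tuple

PAGE_VERSION_PROP = "write.parquet.page-version"

# Minimum reader versions assumed able to decode Parquet V2 DELTA encodings.
# Only the Spark entry is taken from the failure described in this thread.
MIN_V2_READER: Dict[str, Tuple[int, int]] = {"spark": (3, 3)}


def parse_version(text: str) -> Tuple[int, int]:
    """Turn '3.2.4' into (3, 2); only major and minor matter here."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def readers_at_risk(
    table_props: Dict[str, str], readers: List[Tuple[str, str]]
) -> List[str]:
    """Return the readers that may fail if the table is written with V2 pages."""
    # Assumed value spelling; the thread does not state the accepted values.
    if table_props.get(PAGE_VERSION_PROP, "v1").lower() != "v2":
        return []
    at_risk = []
    for engine, version in readers:
        minimum = MIN_V2_READER.get(engine)
        # Engines missing from the table are skipped rather than guessed at.
        if minimum is not None and parse_version(version) < minimum:
            at_risk.append(f"{engine} {version}")
    return at_risk


if __name__ == "__main__":
    props = {PAGE_VERSION_PROP: "v2"}
    print(readers_at_risk(props, [("spark", "3.2.4"), ("spark", "3.5.0")]))
```

A check like this only helps if every writer actually sets and respects the
property, which is exactly what implementation notes cannot guarantee.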

I also think there is precedent for Java API implementations driving spec
changes once the interop gap became clear. This was the case for storage
credentials (tied to Java's FileIO implementation), which started as
generic /config properties before the community promoted them to
first-class spec elements.

On Mon, Apr 13, 2026 at 8:57 AM Ryan Blue <[email protected]> wrote:

> The FileFormat API is already. Plus, we keep the Iceberg spec separate
> from the Java API, although Java is the reference implementation. It
> doesn't make much sense to tie spec changes to a specific Java API
> (especially one that isn't user-facing).
>
> For properties like this, I think the best practice is to add them to
> implementation notes so that we can push to standardize across
> implementations.
>
> On Sat, Apr 11, 2026 at 5:13 PM Maninder Parmar <
> [email protected]> wrote:
>
>> +1 for adding the support.
>> In fact, it would be great if we could formalize a spec for file format
>> specification as part of table properties and potentially tie it together
>> with FileFormat API proposal. The concern with current approach is the
>> possibility of writing files in different parquet versions which could
>> render the table unreadable since table properties are not strictly
>> enforced or respected across engines.
>>
>> On Fri, Apr 10, 2026, 12:03 PM Anurag Mantripragada <
>> [email protected]> wrote:
>>
>>> Yeah, that seems reasonable to me.
>>>
>>> On Thu, Apr 2, 2026 at 8:08 AM Maximilian Michels <[email protected]>
>>> wrote:
>>>
>>>> Hi Harrison,
>>>>
>>>> I just read https://www.jeronimo.dev/the-two-versions-of-parquet/.
>>>> Adoption of the V2 spec seems to be low, but it makes sense to add an
>>>> option to configure it.
>>>>
>>>> Since this is a standard Parquet option, I don't think we need a
>>>> dedicated design document.
>>>>
>>>> Cheers,
>>>> Max
>>>>
>>>>
>>>> On Mon, Mar 30, 2026 at 7:54 PM Harrison Crosse <[email protected]>
>>>> wrote:
>>>> >
>>>> > Hi all,
>>>> >
>>>> > I opened #15677 to add a `write.parquet.page-version` table property
>>>> > for configuring the Parquet DataPage version at the table level. iceberg-go
>>>> > has already adopted this property (apache/iceberg-go#812). The iceberg-java
>>>> > implementation is up for review in #15700.
>>>> >
>>>> > The scope felt small enough for a normal PR since it follows the
>>>> > existing `write.parquet.*` pattern, but I wanted to check: does this need
>>>> > an improvement proposal given it's a new spec-level property?
>>>> >
>>>> > Thanks,
>>>> > Harrison
>>>>
>>>
