Just wanted to bump this. Ryan, since you had objections to the regular PR, would you mind taking a look before we start a vote?
Thanks,
Micah

On Tue, Sep 30, 2025 at 10:01 AM Micah Kornfield <emkornfi...@gmail.com> wrote:

> As an update, Nicolae did a first pass of reviewing https://github.com/apache/iceberg/pull/14004/files (thank you). Any additional feedback is appreciated.
>
> Thanks,
> Micah
>
> On Sun, Sep 7, 2025 at 9:53 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
>> https://github.com/apache/iceberg/pull/14004 is an attempt to update based on Ryan's feedback.
>>
>> Nicolae, some thoughts below:
>>
>>> Interpreting this for the Iceberg REST Catalog interaction: writers should include the assert-current-schema-id requirement when committing to a table. Otherwise, inconsistencies may be introduced. I might be on a slightly outdated Iceberg build, but from what I can see, Spark with Iceberg doesn't include this requirement during table commits (example: https://gist.github.com/nvartolomei/2fd6e994d1d9b5d61597c16339100518).
>>
>> As outlined in the new PR, I believe this depends on:
>> 1. The isolation level
>> 2. The operation being performed
>>
>> Ignoring changing nullable fields to non-nullable (which isn't covered in the spec), I think there are times when using out-of-date schemas is fine, as long as one is not touching default values or one is using snapshot isolation. More details are in the PR; please let me know if it makes sense.
>>
>>> Do we consider this an implementation bug then?
>>
>> I'm not familiar enough with the implementation to say.
>>
>>> Also, what should a sensible implementation do when the requirement fails? Reload the table schema and check if the new schema is compatible with the writer schema (i.e. it wouldn't be subject to the possible inconsistencies as described by the spec) and otherwise rewrite the data? For example, rewrite parquet files by adding new all-null fields if needed, etc.
>>
>> This seems like the sensible thing to do for serializable isolation.
>>
>> Cheers,
>> Micah
>>
>> On Fri, Sep 5, 2025 at 10:55 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>
>>> https://github.com/apache/iceberg/pull/14002 is the revert; I'll post a new PR which tries to accommodate the replies on the thread.
>>>
>>> On Fri, Sep 5, 2025 at 8:53 AM Ryan Blue <rdb...@gmail.com> wrote:
>>>
>>>> Would anyone object to reverting this change? I don't think that it makes accurate statements. For example, I don't think that "not writing out all columns or not using the latest schema can change the semantics of the data written" is correct. It is perfectly fine to append with an older schema, and writers do this today. Saying that it may "change the semantics of the data written" makes this much scarier than it actually is.
>>>>
>>>> I think this confuses cases and errs too strongly on discouraging reasonable things. It is, of course, bad to use an older schema that drops columns from rows that are being updated. But that's a problem with the engine choosing to drop current data columns more than it is a schema selection issue.
>>>>
>>>> I think that the right path here is to state that operations should be bound to the current schema when they are planned. It is okay to continue producing data with an older schema in currently-running jobs. And we should call out the behavior of defaults. But I do think that we should use language that clearly states what will happen rather than a generic "may change the semantics of data written".
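To make "bound to the current schema when planned" concrete on the commit path, here is a minimal sketch of the validation implied by the REST catalog's assert-current-schema-id requirement. All class and method names are hypothetical; this is not the actual Iceberg REST catalog code:

    // Hypothetical sketch: the catalog rejects a commit if the table's current
    // schema changed after the writer planned its operation.
    record AssertCurrentSchemaId(int expectedSchemaId) {
        void validate(int actualCurrentSchemaId) {
            if (actualCurrentSchemaId != expectedSchemaId) {
                // On failure a client could reload the table, check whether its
                // write schema is still compatible, and retry (or rewrite files).
                throw new IllegalStateException(
                    "Requirement failed: current schema id is "
                        + actualCurrentSchemaId + ", expected " + expectedSchemaId);
            }
        }
    }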
>>>> On Fri, Sep 5, 2025 at 5:05 AM Nicolae Vartolomei <n...@nvartolomei.com.invalid> wrote:
>>>>
>>>>> Quoting the newly added text in the spec:
>>>>>
>>>>> > Writers must write out all fields with the types specified from a schema present in table metadata. Writers should use the latest schema for writing. Not writing out all columns or not using the latest schema can change the semantics of the data written. The following are possible inconsistencies that can be introduced:
>>>>>
>>>>> Interpreting this for the Iceberg REST Catalog interaction: writers should include the assert-current-schema-id requirement when committing to a table. Otherwise, inconsistencies may be introduced. I might be on a slightly outdated Iceberg build, but from what I can see, Spark with Iceberg doesn't include this requirement during table commits (example: https://gist.github.com/nvartolomei/2fd6e994d1d9b5d61597c16339100518).
>>>>>
>>>>> Do we consider this an implementation bug then?
>>>>>
>>>>> Also, what should a sensible implementation do when the requirement fails? Reload the table schema and check if the new schema is compatible with the writer schema (i.e. it wouldn't be subject to the possible inconsistencies as described by the spec) and otherwise rewrite the data? For example, rewrite parquet files by adding new all-null fields if needed, etc.
>>>>>
>>>>> On Thu, Sep 4, 2025 at 10:19 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>> >
>>>>> > Just to follow up, the PR ended up getting merged. In theory there probably should have been a vote, but unless someone feels strongly or would like to fine-tune the language further, perhaps we can consider the topic resolved?
>>>>> >
>>>>> > On Wed, Aug 27, 2025 at 2:17 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>> >>
>>>>> >> I opened https://github.com/apache/iceberg/pull/13936 as a draft proposal to capture the conversation.
>>>>> >>
>>>>> >> BTW, I think one area this brings up that the specification doesn't handle is changing between nullable and non-nullable fields. Outdated schemas have some implications in these cases as well.
>>>>> >>
>>>>> >> Cheers,
>>>>> >> Micah
>>>>> >>
>>>>> >> On Tue, Aug 26, 2025 at 10:13 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>> >>>
>>>>> >>> I think the original question is ambiguous. We should probably split this into two questions:
>>>>> >>>
>>>>> >>> 1. Is it OK to write out an "int" instead of a "long" if the writer's schema says the value is a long?
>>>>> >>>
>>>>> >>> I think the answer here is that we recommend not doing so, even though it would likely work.
>>>>> >>>
>>>>> >>> 2. Is it OK to use an older schema for writing?
>>>>> >>>
>>>>> >>> The consensus on the thread seems to be yes. I'll note that this can cause confusing results when the "write-default" [1] value for a column changes. We should probably have an implementation note to clarify:
>>>>> >>> a. Using a stale schema is allowed.
>>>>> >>> b. It might cause inconsistent results in the face of multiple writers when default values are used (see the sketch below).
>>>>> >>>
>>>>> >>> Thoughts?
>>>>> >>>
>>>>> >>> Thanks,
>>>>> >>> Micah
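A minimal sketch of the inconsistency in (b), with every name invented for the example (in a real table, write-defaults live in the schema's field metadata, not a bare map):

    import java.util.Map;

    // Writer A still holds schema v1, whose write-default for field id 3 is 0;
    // writer B holds schema v2, where the write-default was changed to -1.
    // Each writer materializes the default from its own schema at write time,
    // so rows appended without an explicit value for the field end up with
    // different stored values depending on which writer appended them.
    public class WriteDefaultDrift {
        public static void main(String[] args) {
            Map<Integer, Integer> schemaV1WriteDefaults = Map.of(3, 0);
            Map<Integer, Integer> schemaV2WriteDefaults = Map.of(3, -1);

            System.out.println("writer A stores: " + schemaV1WriteDefaults.get(3)); // 0
            System.out.println("writer B stores: " + schemaV2WriteDefaults.get(3)); // -1
        }
    }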
>>>>> >>> On Mon, Aug 25, 2025 at 4:59 PM Ryan Blue <rdb...@gmail.com> wrote:
>>>>> >>>>
>>>>> >>>> I agree with Dan that type promotion should be well-defined. If it's a grey area, then we should clarify it in the spec.
>>>>> >>>>
>>>>> >>>> How it works today is that schema evolution always produces a schema that can read files written with any older schema. When a type is promoted, the new schema can read any older data file, but readers may need to promote values like the int-to-long reader (https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java#L546-L560) does. You aren't guaranteed to be able to read new data using an older schema, so the latest schema should always be used, or you should use the schema attached to a snapshot.
>>>>> >>>>
>>>>> >>>> Because files with older schemas can always be read, it is safe to write files with an older schema. This happens fairly regularly, as Steven noted, in cases where a writer has a fixed schema and is long-running.
>>>>> >>>>
>>>>> >>>> Ryan
>>>>> >>>>
>>>>> >>>> On Thu, Aug 21, 2025 at 5:37 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>> >>>>>
>>>>> >>>>> > This means that you can have writers using different schemas to write (use cases include different partitioning or "out-of-date" writers), but the data is still valid.
>>>>> >>>>>
>>>>> >>>>> +1 on Dan's point. Both batch and streaming writers can have stale schemas. Long-running streaming jobs may stay stale for extended periods before picking up the new schema during a restart.
>>>>> >>>>>
>>>>> >>>>> On Wed, Aug 20, 2025 at 2:50 PM Daniel Weeks <dwe...@apache.org> wrote:
>>>>> >>>>>>
>>>>> >>>>>> I think I'm going to disagree and argue that it's not really a gray area.
>>>>> >>>>>>
>>>>> >>>>>> Having strict schema evolution rules, and tracking how schemas evolve, means that there is independence between writer and reader schemas, which remain compatible due to the evolution rules.
>>>>> >>>>>>
>>>>> >>>>>> This means that you can have writers using different schemas to write (use cases include different partitioning or "out-of-date" writers), but the data is still valid.
>>>>> >>>>>>
>>>>> >>>>>> How you promote the physical representation during a read/scan operation results in a consistent presentation with the read schema.
>>>>> >>>>>>
>>>>> >>>>>> All of the representations are technically valid.
>>>>> >>>>>>
>>>>> >>>>>> -Dan
>>>>> >>>>>>
>>>>> >>>>>> On Mon, Aug 18, 2025 at 7:46 AM Russell Spitzer <russell.spit...@gmail.com> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> +1 to what Micah said :) sorry about the typo
>>>>> >>>>>>>
>>>>> >>>>>>> On Mon, Aug 18, 2025 at 9:45 AM Russell Spitzer <russell.spit...@gmail.com> wrote:
>>>>> >>>>>>>>
>>>>> >>>>>>>> +1 to what Micaah said. We have never really written rules about what is "allowed" in this particular context, but since a reader needs to be able to handle both int and long values for the column, there isn't really any danger in writing new files with the narrower type. If a reader couldn't handle this, then type promotion would be impossible.
>>>>> >>>>>>>>
>>>>> >>>>>>>> I would include all columns in the file; the space requirements for an all-null column (or an all-constant column) should be very small. I believe the reason we originally wrote those rules in was to avoid folks doing the Hive-style implicit columns from the partition tuple (although we also have handling for this).
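As a rough illustration of the read-side handling described above, a promoted reader only needs to widen values as it materializes them. A sketch with invented names (this is not the actual ParquetValueReaders code):

    import java.util.PrimitiveIterator;

    // Read-time int-to-long promotion: the data file stores 32-bit ints, the
    // read schema asks for long, and the reader widens each value on the fly.
    // The widening is lossless, which is why files written with the narrower
    // type remain readable under the promoted schema.
    final class IntAsLongReader {
        private final PrimitiveIterator.OfInt fileValues;

        IntAsLongReader(PrimitiveIterator.OfInt fileValues) {
            this.fileValues = fileValues;
        }

        long next() {
            return (long) fileValues.nextInt();
        }
    }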
>>>>> >>>>>>>> On Sun, Aug 17, 2025 at 11:15 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Hi Nic,
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> This is, IMO, a gray area.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>> However, is it allowed to commit *new* parquet files with the old types (int) and commit them to the table with a table schema where types are promoted (long)?
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> IMO, I would expect writers to write files that are consistent with the current metadata, so ideally they would not be written with int if the column is now long. In general, though, I think most readers are robust to reading type-promoted files. We should probably clarify this in the specification.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>> Also, is it allowed to commit parquet files, in general, which contain only a subset of columns of table schema? I.e. if I know a column is all NULLs, can we just skip writing it?
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> As currently worded, the spec on writing data files (https://iceberg.apache.org/spec/#writing-data-files) says writers should include all columns. Based on the column projection rules, however, failing to do so should also not cause problems.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Cheers,
>>>>> >>>>>>>>> Micah
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> On Fri, Aug 15, 2025 at 8:45 AM Nicolae Vartolomei <n...@nvartolomei.com.invalid> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Hi,
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> I'm implementing an Iceberg writer[^1] and have a question about what type promotion actually means as part of the schema evolution rules.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> The Iceberg spec [specifies][spec-evo] which type promotions are allowed. No confusion there.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> The confusion on my end arises when it comes to actually writing, e.g., Parquet data. Let's take for example the int-to-long promotion. What is actually allowed under this promotion rule? Let me try to show what I mean.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Obviously, if I have a schema-id N with field A of type int and table snapshots with this schema, then it is possible to update the table schema-id to > N where field A now has type long, and this new schema can read parquet files with the old type.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> However, is it allowed to write *new* parquet files with the old type (int) and commit them to the table with a table schema where the type is promoted (long)?
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Also, is it allowed to commit parquet files, in general, which contain only a subset of the columns of the table schema? I.e., if I know a column is all NULLs, can we just skip writing it?
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Appreciate you taking the time to look at this,
>>>>> >>>>>>>>>> Nic
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> [spec-evo]: https://iceberg.apache.org/spec/#schema-evolution
>>>>> >>>>>>>>>> [^1]: This is for the Redpanda to Iceberg native integration (https://github.com/redpanda-data/redpanda).
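For reference, the promotion Nicolae asks about is a metadata-only change. A minimal sketch using the Iceberg Java API, assuming table is an already-loaded org.apache.iceberg.Table whose column "a" is currently an int:

    import org.apache.iceberg.Table;
    import org.apache.iceberg.types.Types;

    // Promote column "a" from int to long. This commits a new schema with a
    // higher schema id; existing data files keep their int physical type and
    // are widened at read time.
    class PromoteColumnExample {
        static void promoteAToLong(Table table) {
            table.updateSchema()
                .updateColumn("a", Types.LongType.get())
                .commit();
        }
    }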