Thanks for the info, it is very helpful. I can see it when debugging down through `org.apache.iceberg.ManifestReader#readMetadata`. It wasn't obvious to me that this sort of data would be in the Avro header metadata as opposed to the `org.apache.iceberg.ManifestFile` object. I may have some questions later about the writing side of the equation in this regard...
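For anyone who wants to check which keys a manifest header actually carries without going through the Iceberg or Avro libraries, here is a minimal sketch. It assumes only the Avro object container layout (4 magic bytes, then a string-to-bytes metadata map, then a 16-byte sync marker). The function names and the synthetic header in the demo are hypothetical and mine, not Iceberg API, and the sample values are made up.

```python
import io

AVRO_MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag-encoded)."""
    shift = 0
    raw = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of Avro header")
        raw |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1)


def read_bytes(stream):
    """Decode one length-prefixed Avro bytes/string value."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("truncated value in Avro header")
    return data


def read_avro_header_metadata(stream):
    """Return the header metadata map of an Avro container file as {str: bytes}."""
    if stream.read(4) != AVRO_MAGIC:
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero-count block terminates the map
            break
        if count < 0:  # negative count: a block byte size follows
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    return metadata


def encode_long(value):
    """Encode an Avro long; only used to build the synthetic demo header."""
    raw = (value << 1) ^ (value >> 63)
    out = bytearray()
    while raw & ~0x7F:
        out.append((raw & 0x7F) | 0x80)
        raw >>= 7
    out.append(raw)
    return bytes(out)


def encode_bytes(data):
    return encode_long(len(data)) + data


# Synthetic header with made-up values, standing in for a real manifest file.
entries = {"schema": b'{"type":"struct","fields":[]}', "partition-spec-id": b"0"}
header = (
    AVRO_MAGIC
    + encode_long(len(entries))
    + b"".join(encode_bytes(k.encode("utf-8")) + encode_bytes(v) for k, v in entries.items())
    + encode_long(0)
    + bytes(16)  # placeholder sync marker
)
metadata = read_avro_header_metadata(io.BytesIO(header))
print(sorted(metadata))
print("schema-id" in metadata)
```

Against a real manifest, something like `read_avro_header_metadata(open(path, "rb"))` should list the keys the writer put in the header, which makes it easy to see whether `schema-id` is there.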
BTW, it looks like either the spec is incorrect, or the Java implementation is incorrect; I see `schema` being written to the manifest header metadata, but not `schema-id`.

https://github.com/apache/iceberg/blob/apache-iceberg-1.8.0/core/src/main/java/org/apache/iceberg/ManifestWriter.java#L346-L355
https://github.com/apache/iceberg/blob/apache-iceberg-1.8.0/core/src/main/java/org/apache/iceberg/ManifestWriter.java#L312-L321

On Fri, Feb 14, 2025 at 10:26 AM Fokko Driesprong <fo...@apache.org> wrote:

> Hi Devin,
>
> The schema-id is stored in the Manifest Avro header:
> https://iceberg.apache.org/spec/#manifests Also the schema itself is
> stored there. Would that help your situation? I think this makes adding it
> to the data file redundant.
>
> Kind regards,
> Fokko
>
> On Fri, 14 Feb 2025 at 17:56, Devin Smith
> <devinsm...@deephaven.io.invalid> wrote:
>
>> I want to make sure I'm not missing something that already exists;
>> otherwise, hoping to get a quick thumbs up / thumbs down on a potential
>> proposal before spending more time on it.
>>
>> It would be nice to know what Iceberg schema a writer used (/assumed)
>> when writing a DataFile. Oftentimes, this information is written into the
>> parquet file's metadata, but it would be great if Iceberg provided this
>> directly. A schema_id on DataFile would be nice, I think.
>>
>> Thanks,
>> -Devin