The `format-version` table property is different because it maps to the table's format version, which is not stored in table properties. It is reserved because implementations override it, so it isn't a real table property. This is not a pattern that we want to expand because of that strange behavior.
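As a rough illustration of that difference, here is a minimal Python sketch of how an implementation might route `format-version` onto the metadata's format version instead of storing it, while an ordinary key such as `comment` is stored as given. The `apply_property_updates` helper and the dict layout are hypothetical, made up for this example; they are not Iceberg's API or its actual implementation.

```python
# Hypothetical model of table metadata, for illustration only.
# This is NOT Iceberg's API or its real implementation.

def apply_property_updates(metadata: dict, updates: dict) -> dict:
    """Return a copy of `metadata` with `updates` applied.

    `format-version` is treated as reserved: it is mapped onto the
    metadata's format version and never stored in the properties map.
    Every other key is stored as an ordinary table property.
    """
    new_metadata = dict(metadata)
    properties = dict(metadata.get("properties", {}))
    for key, value in updates.items():
        if key == "format-version":
            # Reserved: overrides the metadata field, not a stored property.
            new_metadata["format-version"] = int(value)
        else:
            # Normal property: stored exactly as given.
            properties[key] = value
    new_metadata["properties"] = properties
    return new_metadata


metadata = {"format-version": 1, "properties": {}}
updated = apply_property_updates(
    metadata, {"format-version": "2", "comment": "Daily sales"}
)
```

In this sketch, reading `updated["properties"]` shows only `comment`, which is the strange behavior: a property that was set cannot be read back as a property.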
Properties like `comment` are normal table properties that can be used like any other. If the schema had a doc string that was used in place of `comment`, then I think `comment` would be a reserved property. But there's no need for that because setting the property or using `COMMENT ON` would have the same behavior: changing the property value.

The `owner` property is a different case. Owner is something that should be restricted. A user should not be able to change it with just access to modify table metadata. Tracking a table's owner is the responsibility of the catalog and its access control scheme. Because of this, I don't think that we should standardize or encourage setting an `owner` table property.

On Tue, Aug 5, 2025 at 4:21 AM Guy Yasoor <guy.yas...@ryft.io.invalid> wrote:

> If using "comment" is the best practice, should we add this to the "reserved
> table properties" docs
> <https://iceberg.apache.org/docs/latest/configuration/#reserved-table-properties>,
> to make sure it's aligned between different engines and implementations?
> In the same opportunity, I would suggest adding "owner" as well, which is
> automatically added by Spark.
>
> On Tue, Aug 5, 2025 at 2:16 AM Taeyun Kim <taeyun....@innowireless.com>
> wrote:
>
>> Hi,
>>
>> I see, thank you for your response.
>>
>> Best regards,
>> Taeyun
>>
>> -----Original Message-----
>> From: "Ryan Blue" <rdb...@gmail.com>
>> To: <dev@iceberg.apache.org>;
>> Cc:
>> Sent: 2025-08-05 (Tue) 07:45:43 (UTC+09:00)
>> Subject: Re: Re: Thoughts on Adding a `doc` Property for Schema Objects
>>
>>
>> If there isn't a significant difference between table-level
>> description and schema-level description, then I think you should consider
>> it standardized. You can store the table description in the "comment" table
>> property.
>>
>>
>> On Sun, Aug 3, 2025 at 5:28 PM Taeyun Kim <taeyun....@innowireless.com>
>> wrote:
>> Hi,
>>
>> I’ve already explained my reasoning in earlier messages, including the
>> example about making table and column descriptions more accessible for
>> LLM‑generated SQL.
>> From my perspective, table‑level comments, like column‑level comments,
>> should also be standardized.
>> If standardized, it seems natural for them to be part of the schema
>> definition, just like column‑level comments.
>> This way, they stay consistent with the schema version and avoid drifting
>> out of sync when the schema changes.
>>
>> Thanks,
>> Taeyun
>>
>>
>> -----Original Message-----
>> From: "Ryan Blue" <rdb...@gmail.com>
>> To: <dev@iceberg.apache.org>;
>> Cc:
>> Sent: 2025-07-26 (Sat) 08:05:55 (UTC+09:00)
>> Subject: Re: Thoughts on Adding a `doc` Property for Schema Objects
>>
>>
>> Why would you need to version table descriptions? Are there cases where
>> they are changing rapidly and inaccurate due to schema changes?
>>
>>
>> On Thu, Jul 24, 2025 at 7:48 PM Taeyun Kim <taeyun....@innowireless.com>
>> wrote:
>>
>> Thank you for your reply.
>>
>> Column-level comments are already part of the schema definition. Would
>> adding just one table-level comment really cause noticeable bloat? For
>> example, if a table has 20 columns, adding one more comment would only
>> increase the metadata size by about 1/20th.
>>
>> Also, using schema-id as part of the property key feels like a workaround
>> rather than a proper solution. It is not part of the specification, so any
>> tool or integration (including LLM-based ones) would need extra logic to
>> interpret it. A standardized, schema-level field would avoid that
>> complexity and make the metadata easier to consume consistently.
>>
>> If bloat is a real concern, perhaps column-level comments should also be
>> moved out of the schema, with a proper mechanism to version and manage them
>> separately.
>>
>> Thank you,
>> Taeyun.
>>
>> -----Original Message-----
>> From: "Gang Wu" <ust...@gmail.com>
>> To: <dev@iceberg.apache.org>;
>> Cc:
>> Sent: 2025-07-25 (Fri) 11:20:08 (UTC+09:00)
>> Subject: Re: Thoughts on Adding a `doc` Property for Schema Objects
>>
>>
>> I'd rather not complicate the schema definitions in the table metadata.
>> You may append `schema-id` to the key of table property to manage different
>> schema versions.
>>
>>
>> Storing verbose text to each field may bloat the metadata storage,
>> especially when there are a lot of duplicate `doc`s if schema evolution
>> happens a lot.
>>
>>
>> Best,
>> Gang
>>
>>
>> On Fri, Jul 25, 2025 at 9:25 AM Taeyun Kim <taeyun....@innowireless.com>
>> wrote:
>>
>> Thank you for your response.
>> As I understand it, the table description is currently stored as a table
>> property within the table metadata’s `properties` map.
>>
>> In my opinion, this approach has a few issues:
>>
>> - Table metadata `properties` are not versioned. As a result, when
>> querying an older snapshot, the description may be inaccurate because the
>> value reflects only the current state.
>> - According to the specification, the purpose of table metadata
>> properties is: “A string to string map of table properties. This is used to
>> control settings that affect reading and writing and is not intended to be
>> used for arbitrary metadata.” Based on this, a comment seems to fall under
>> “arbitrary metadata,” and therefore may not be an appropriate use of
>> properties.
>> - Table comments seem to have become significant enough that relying on a
>> convention alone may no longer be sufficient. It might be worth considering
>> a standardized, schema-level field for them.
>>
>> Thank you.
>> Taeyun
>>
>> -----Original Message-----
>> From: "Ryan Blue" <rdb...@gmail.com>
>> To: <dev@iceberg.apache.org>;
>> Cc:
>> Sent: 2025-07-25 (Fri) 08:48:48 (UTC+09:00)
>> Subject: Re: Thoughts on Adding a `doc` Property for Schema Objects
>>
>>
>> Iceberg does allow you to store table descriptions. The convention is to
>> use a table property, "comment". While this isn't a schema-level
>> doc/comment, I don't know of anything that makes a distinction between
>> schema description and table description, so I think it should work for
>> your use.
>>
>>
>>
>> On Tue, Jul 22, 2025 at 5:48 PM 김태연 (Taeyun Kim) <
>> taeyun....@innowireless.com> wrote:
>>
>> Hi,
>>
>> With the growing trend of using LLMs to automatically generate SQL, it
>> feels increasingly important to manage descriptions of database tables and
>> columns in a way that these tools can easily access.
>>
>> In the Iceberg specification, comments for schema fields (i.e., columns)
>> can be specified using the `doc` property within the `fields` array of a
>> `struct` type. However, there doesn’t seem to be a way to specify a comment
>> for the root struct type itself - that is, for the table as a whole.
>>
>> From what I can tell, OLAP DBMSs today may handle table-level comments by
>> storing them in the `properties` map within the table metadata under
>> various non-standard keys. But since a table comment conceptually belongs
>> to the schema, and can vary by schema, it feels like the `properties` map
>> within the table metadata might not be the best place for it.
>>
>> Would it make sense to allow a `doc` property on the `schema` object (the
>> root struct type), alongside `schema-id` and `identifier-field-ids`, so
>> that a description for the schema itself can be included?
>> It seems like it would be helpful, especially for tooling and LLM-related
>> use cases.
>>
>> Curious to hear your thoughts.
>> Apologies if I’m overlooking something or if this has already been
>> discussed.
>>
>> Thank you,
>> Taeyun
>
>