Thank you for the response, Farooq! Sorry for my delayed reply. This is
great context to have.

I took a look through the issue and the PR. I like the idea, but I think
I'm missing a couple of pieces of the big picture here:

1. It doesn't seem to address the guidance against using the table
properties field for coordination:

In the linked issue, it seemed like the guidance was that we should not use
table properties to coordinate state, in favor of using snapshot
properties. Has that concern been ironed out? Is the conclusion that the
community actually does want to give table properties transactional
semantics?

Iceberg-adjacent systems may _need_ to coordinate state, regardless of
whether it's in the snapshot properties, table properties, or some new
field of the metadata. Snapshot properties are nice because snapshots have
transactional guarantees, but they fall victim to the issues I described in
my original question. I'm curious if there has been any more discussion
about introducing transactional table-wide metadata (whether in table
properties or otherwise), or whether this is a firm philosophical stance
that the Iceberg project has taken against table-level coordination.

2. It could use a bit more detail to be viable for today's Iceberg
landscape, particularly for non-Java projects:

The PR aims to allow _any_ checks to be made against the table. While this
is powerful, the interface appears to be tied to Java Predicates, so it
doesn't seem very ecosystem-friendly in its current form. Iceberg today is
used by projects written in several languages, and the general guidance is
to serialize transactions and their requirements to the Iceberg REST
Catalog (IRC). Our project is one example of this (our codebase is entirely
C++). For this to be useful more broadly, I'd expect us to need to at least
define the IRC table requirements that go along with these validations. If
I've missed where this is happening in the PR though, please let me know!
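
To make the kind of requirement I mean a bit more concrete, here is a
rough, language-neutral sketch (written in Python only for brevity) of how
a catalog might evaluate serialized requirements at commit time. The
`assert-ref-snapshot-id` requirement exists in the IRC spec today. The
`assert-property-equals` type, its `key`/`value` fields, and the
`example.committed-offset` property name are purely hypothetical and only
illustrate the shape of the idea. This is not what the PR implements.

```python
from __future__ import annotations

from typing import Any


class CommitFailed(Exception):
    """Raised when a table requirement does not hold at commit time."""


def check_requirements(
    metadata: dict[str, Any], requirements: list[dict[str, Any]]
) -> None:
    """Validate serialized requirements against current table metadata."""
    for req in requirements:
        kind = req["type"]
        if kind == "assert-ref-snapshot-id":
            # Existing IRC requirement: the ref must point at this snapshot.
            ref = metadata.get("refs", {}).get(req["ref"])
            actual = ref["snapshot-id"] if ref else None
            if actual != req["snapshot-id"]:
                raise CommitFailed(f"ref {req['ref']} is at {actual}")
        elif kind == "assert-property-equals":
            # HYPOTHETICAL requirement type, not part of the IRC spec:
            # the table property must still hold the expected value.
            actual = metadata.get("properties", {}).get(req["key"])
            if actual != req["value"]:
                raise CommitFailed(f"property {req['key']} is {actual!r}")
        else:
            raise CommitFailed(f"unknown requirement type: {kind}")


# Illustrative metadata; the property key is made up for this example.
table_metadata = {
    "refs": {"main": {"snapshot-id": 10, "type": "branch"}},
    "properties": {"example.committed-offset": "41"},
}

check_requirements(
    table_metadata,
    [
        {"type": "assert-ref-snapshot-id", "ref": "main", "snapshot-id": 10},
        {
            "type": "assert-property-equals",
            "key": "example.committed-offset",
            "value": "41",
        },
    ],
)
```

Because the requirement is plain data rather than a Java Predicate, a
non-Java client could serialize it alongside its updates in the same way
it serializes snapshot-based requirements today.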


I think we need to align on point 1 as a community before reviving this
work. For instance, if we're accepting transactional guarantees for table
properties via conditional validators, would it make more sense to define
an explicit transactional table properties update? I'm not sure whether the
community's stance on this has changed as it has grown since the approach
was originally discussed, but I'm happy to help get closure there. I'm
curious if you or others have thoughts on this.


Thanks again,
Andrew

On Mon, Jul 28, 2025 at 7:25 PM fqossify <fqoss...@gmail.com> wrote:

> Hello Andrew,
>
> I think you'll find the discussion in this GitHub issue [1] very relevant
> to your problem.
> In fact, this problem has come up a few times in the Iceberg community and
> I think now would be a good time to revisit this discussion.
>
> > It may be useful in similar situations to update the table only
> > if certain metadata fields have not changed, without tying these fields
> > to specific snapshots.
>
> I raised a PR [2] previously to implement precisely this.
> Unfortunately, progress on the PR stalled after a few rounds of reviews
> due to limited reviewer bandwidth at the time.
> I would be happy to revive the PR if the community agrees that this is
> still the right way to solve this issue.
>
> Best wishes,
> Farooq
>
> [1] https://github.com/apache/iceberg/issues/6514
> [2] https://github.com/apache/iceberg/pull/6513
>
> On Fri, Jul 25, 2025 at 12:06 PM Andrew Wong <aw...@redpanda.com> wrote:
>
>> BLUF:
>> - We are using snapshot references to preserve custom table-level
>>   metadata that currently exists in snapshot summaries. Is this an
>>   anti-pattern or expected usage?
>> - If it is an anti-pattern, is there something else in the spec we can
>>   use for this purpose? If not, would it make sense to introduce
>>   table-level metadata in the spec?
>>
>> Details below:
>>
>> Hello Iceberg community,
>>
>> We (Redpanda[1][2]) have built a log storage engine that, in addition to
>> writing log format data, writes data as Parquet files and commits them to
>> the Iceberg catalog. One of the requirements we have is to ensure exactly
>> once delivery of records into Iceberg. To this end, we keep metadata in
>> two places:
>> - In the Iceberg table, we add the position in our log up to which has
>>   been committed as a field in each new Iceberg snapshot’s summary.
>> - In our system, we checkpoint this same position up to which we have
>>   committed to Iceberg.
>>
>> It’s possible for these to diverge (e.g. in the event of a node failure in
>> between the above two events), but in such cases, the Iceberg table is
>> taken as the source of truth. As I understand it, this is the same
>> technique the Kafka Connect connector uses.
>>
>> But there is a problem with this approach when considering snapshot expiry
>> alongside concurrent updates from multiple systems. While the default
>> snapshot expiration is 5 days, it’s conceivable a user sets the table’s
>> snapshot expiry to something significantly lower to avoid metadata bloat.
>> To boot, we cannot assume that our system is the only system writing to
>> Iceberg, and the main snapshot is the only snapshot guaranteed to be
>> retained at all times. It’s thus conceivable that external systems add
>> snapshots to the table, and for snapshot expiry to remove the snapshot
>> metadata we require. If these conditions are met in a moment of
>> divergence, there is room for exactly once delivery to be violated and
>> for files to be committed to the table more than once.
>>
>> To mitigate this, we maintain an Iceberg tag for the latest snapshot
>> written by our system, and rely on the snapshot reference expiry
>> policy[3] to ensure that these tagged snapshots aren’t removed, with the
>> assumption that it is more likely to tune down the `max-snapshot-age-ms`
>> property (to keep manifest list size small) than it is to tune down the
>> `max-ref-age-ms` property.
>>
>> There are still at least a couple issues with this approach:
>> - A user can still set `max-ref-age-ms` to something pathologically small
>>   and end up causing an exactly-once violation.
>> - It feels like we’re overloading the intended behavior of tags by using
>>   them to force explicit snapshot retention.
>>
>> Our question is, is there anything better that we can be doing here? Are
>> there other parts of the spec that can serve our needs? Table properties
>> field seems somewhat what we want, but:
>> - It is explicitly described as being not meant for arbitrary metadata[4].
>> - For it to be useful for our use case, we'd need some kind of table
>>   requirement that checks these properties atomically (today, we use
>>   snapshot-based table requirements when we commit).
>>
>> So if not something existing, do folks have thoughts on generalized ways
>> to store custom metadata in the table? As an example, is there any
>> appetite in adding a different table-level metadata field to the spec? As
>> Iceberg becomes adopted by more systems, it's not hard to imagine similar
>> requirements popping up elsewhere. It may be useful in similar situations
>> to update the table only if certain metadata fields have not changed,
>> without tying these fields to specific snapshots.
>>
>>
>> Thanks,
>> Andrew
>>
>> [1] https://www.redpanda.com/
>> [2] https://github.com/redpanda-data/redpanda
>> [3] https://iceberg.apache.org/spec/#snapshot-retention-policy
>> [4] https://iceberg.apache.org/spec/#table-metadata-fields
>>
>
