Hi all,

Seems like this thread is going stale, so I'm reviving it to make sure we reach consensus one way or the other.
Currently, the vote stands as follows (please correct me if I'm wrong):

* 1 vote for only emitting an event if the data is successfully committed to the table [Dmitri]
* 1 vote for emitting a third type of event during a staged commit and then emitting the final AfterTableCommitEvent at the very end [Eric]
* 2 votes indifferent [Adnan, Yufei]

In either case, it seems we will need to ensure that AfterTableCommitEvent is only emitted if the commit is applied rather than staged, so I will look into that. But if anyone wants to break the tie here on emitting a third event type for a "staged commit", please do so. Otherwise, I will lean towards adding the third event type and reverting it later, if needed.

Best,
Adnan Hemani

On Fri, Nov 7, 2025 at 11:39 AM Yufei Gu <[email protected]> wrote:

> Agreed that exactly-once delivery is extremely hard to achieve in distributed systems because it requires atomic coordination across producers, brokers, and consumers, essentially a distributed transaction. Even frameworks that claim "exactly once" (like Kafka or Flink) rely on idempotent processing and transactional checkpoints rather than true one-time delivery. For Polaris, "at-least-once + idempotence" should be more than sufficient (for auditing) without adding unnecessary complexity. "Best-effort" is also acceptable for use cases like monitoring.
>
> Yufei
>
> On Fri, Nov 7, 2025 at 12:39 AM Adnan Hemani <[email protected]> wrote:
>
> > > From my POV, Polaris should deliver catalog-level events only for committed data.
> >
> > Thank you for sharing this opinion, Dmitri! As I previously mentioned, I can see merits of both ways, so I don't have a hard opinion or leaning on this topic. I'm happy to tally the votes on this thread instead :)
> >
> > > We may want to consider delivering events for Storage-side operations performed by Polaris (e.g. when it writes a new file). These events may be useful for driving async clean-up workflows. However, from my perspective, these events would be distinct from the "catalog" events that currently exist in Polaris.
> >
> > This is a good idea IMO. We should totally work on this!
> >
> > Best,
> > Adnan Hemani
> >
> > On Thu, Nov 6, 2025 at 11:29 AM Dmitri Bourlatchkov <[email protected]> wrote:
> >
> > > It looks like this discussion is going in the direction where we need to define the meaning of "before" / "after" events and the expectations for their listener callbacks.
> > >
> > > Actually, the latter point probably affects the former. If events are delivered only for committed data, I suppose the "before" / "after" distinction will become irrelevant.
> > >
> > > From my POV, Polaris should deliver catalog-level events only for committed data.
> > >
> > > Additionally, we can probably be lax about always delivering events. That is to say, events for some changes may be lost. Note that events may be lost even with the current state of the codebase, and I generally do not think Polaris should venture into the "exactly once" delivery territory.
> > >
> > > We may want to consider delivering events for Storage-side operations performed by Polaris (e.g. when it writes a new file). These events may be useful for driving async clean-up workflows. However, from my perspective, these events would be distinct from the "catalog" events that currently exist in Polaris.
> > >
> > > WDYT?
> > >
> > > Cheers,
> > > Dmitri.
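For concreteness, the "third event type" option in the tally above could look roughly like the sketch below. Every name in it (CommitStatus, StagedTableCommitEvent, and this shape of AfterTableCommitEvent) is a hypothetical stand-in rather than an existing Polaris class; the only point is that a listener could tell a staged change apart from an applied one.

    // Illustrative sketch only; none of these declarations exist in Polaris as written.
    // Idea from the thread: emit a distinct event when a table change is merely staged
    // as part of a multi-table commit, and emit the "after commit" event only once the
    // whole transaction has actually been persisted.

    enum CommitStatus { STAGED, APPLIED, ABORTED }

    // Hypothetical third event type: one per table staged within the transaction.
    record StagedTableCommitEvent(String transactionId,
                                  String tableName,
                                  CommitStatus status) {}

    // Stand-in for the existing AfterTableCommitEvent, emitted only after the whole
    // transaction has been applied, never for staged or rolled-back changes.
    record AfterTableCommitEvent(String transactionId, String tableName) {}

Whether this is worth an extra event type, versus only ever emitting events for applied commits, is exactly the tie to break above.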
> > >
> > > On Wed, Nov 5, 2025 at 8:44 PM Adnan Hemani <[email protected]> wrote:
> > >
> > > > Eric's suggestion also makes sense to me from the perspective of how Events behave today - and it may actually be a lot easier to implement, including adding a third event for when the transaction is committed.
> > > >
> > > > I guess the question really comes down to: will it be confusing for users to see "AfterCommittedTableEvent" (with a parameter of "PENDING") even though the transaction was rolled back? I can kind of see both sides here... would like to hear the rest of the community's thoughts on this.
> > > >
> > > > Best,
> > > > Adnan Hemani
> > > >
> > > > On Wed, Nov 5, 2025 at 3:27 PM Eric Maynard <[email protected]> wrote:
> > > >
> > > > > Option (2) is definitely more in line with the original design for events -- we could add a third event when the transaction is committed, and then the onus is on either the listener impl or the consumer of the listener impl's output (whatever it may be) to stitch together the events based on the transaction ID or something else.
> > > > >
> > > > > --EM
> > > > >
> > > > > On Wed, Nov 5, 2025 at 2:54 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > >
> > > > > > Hi Adnan,
> > > > > >
> > > > > > I agree that this is a bug. Here's the GH issue: https://github.com/apache/polaris/issues/2990
> > > > > >
> > > > > > I personally do _not_ think the onus for fixing this is on you :) The code pattern that led to this issue has existed in Apache Polaris code since day 1... but if you want to give it a try, please feel free.
> > > > > >
> > > > > > Option 1 from Yufei's email sounds like the right general approach to fixing this problem. I also agree that the actual implementation may be tricky. It may be worth discussing an outline of a fix in this thread before starting a PR, just to avoid rework in case there's no consensus on the fix details. I'm specifically concerned about the APIs used in this case and how Iceberg catalog transactions map to atomic Persistence operations (cf. [1]).
> > > > > >
> > > > > > [1] https://lists.apache.org/thread/rf5orxs815zs4h64p4rwp03q3pbgxb5r
> > > > > >
> > > > > > Cheers,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Wed, Nov 5, 2025 at 4:57 PM Adnan Hemani <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Dmitri,
> > > > > > >
> > > > > > > Yes, this is indeed a bug and we should create a GitHub issue for this to track fixing it. Ideally, we should be doing something similar to Option 1 of what Yufei suggested - but how this will be done may be harder than it looks. I can investigate this, and I encourage others to look into it as well, if this is of interest.
> > > > > > >
> > > > > > > Best,
> > > > > > > Adnan Hemani
> > > > > > >
> > > > > > > On Wed, Nov 5, 2025 at 8:29 AM Yufei Gu <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Dmitri,
> > > > > > > >
> > > > > > > > Good catch! This is a subtle but important issue. Thanks for raising it. I think we could handle it in a few ways:
> > > > > > > >
> > > > > > > > 1. Hold the event emission until the whole multi-table commit succeeds, buffering per-table events until persistence confirms success.
> > > > > > > > 2. Include a transaction ID and status (e.g., pending, committed, aborted) in emitted events so consumers can filter accordingly. This will add burden to downstream consumers, but I think we could figure out a way to filter events out while persisting them, so that the real consumers won't see events with an aborted status.
> > > > > > > >
> > > > > > > > I'm not sure which way is better at this moment; we need to take a deeper look to evaluate both.
> > > > > > > >
> > > > > > > > Yufei
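To make Yufei's option 1 a bit more concrete, here is a minimal sketch of the buffering idea. It assumes the commit path knows when the overall persistence operation succeeds or fails; the class and method names are hypothetical, not existing Polaris APIs.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Consumer;

    // Hypothetical sketch of option 1: per-table events are buffered while a
    // multi-table commit is in flight and only forwarded to the real listener
    // once Polaris Persistence confirms the whole transaction was applied.
    final class BufferedEventEmitter<E> {
      private final List<E> pending = new ArrayList<>();
      private final Consumer<E> downstream; // e.g. the configured event listener

      BufferedEventEmitter(Consumer<E> downstream) {
        this.downstream = downstream;
      }

      // Called where the per-table commit currently emits its events.
      void emit(E event) {
        pending.add(event);
      }

      // Called only after the overall commit to Persistence succeeds.
      void flushOnSuccess() {
        pending.forEach(downstream);
        pending.clear();
      }

      // Called if the overall commit fails (e.g. due to a concurrent update);
      // the buffered events are dropped so that no event describes an
      // unpersisted change.
      void discardOnFailure() {
        pending.clear();
      }
    }

The tricky part, as Dmitri's original report below lays out, is exactly where the flush/discard calls would live, given that IcebergCatalogHandler drives each table's commit separately before the overall persistence commit.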
> > > > > > > >
> > > > > > > > On Wed, Nov 5, 2025 at 8:06 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi All,
> > > > > > > > >
> > > > > > > > > I'd like to highlight an aspect of the current Events behaviour with respect to multi-table commits that Christopher and I discovered today. The issue has existed since the first Events implementation, AFAIK; it just did not come into focus until now.
> > > > > > > > >
> > > > > > > > > Consider a multi-table commit request [1]. IcebergCatalogHandler will run each individual table through a separate commit operation on the base catalog [2]. The base catalog will issue events for each table separately [3][4]. However, the overall commit to Polaris Persistence may still fail, e.g. due to concurrent updates [5].
> > > > > > > > >
> > > > > > > > > Now, we can have a situation where both before/after events are delivered for a table, but the actual change that triggered the events is _not_ persisted and therefore does not exist in the current state of the Polaris catalog.
> > > > > > > > >
> > > > > > > > > Thoughts?
> > > > > > > > >
> > > > > > > > > [1] https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalogHandler.java#L973
> > > > > > > > >
> > > > > > > > > [2] https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalogHandler.java#L1051
> > > > > > > > >
> > > > > > > > > [3] https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java#L1405
> > > > > > > > >
> > > > > > > > > [4] https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java#L1558
> > > > > > > > >
> > > > > > > > > [5] https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalogHandler.java#L1058
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Dmitri.
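Coming back to Eric's "stitch by transaction ID" suggestion (and Yufei's option 2): a rough sketch of what the consumer side could do, assuming hypothetically that each per-table event carried a transaction ID and that a final committed/aborted signal arrived per transaction. None of these types are existing Polaris classes.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical consumer-side sketch: per-table events are grouped by
    // transaction and only forwarded once the transaction is known to be
    // committed; events for aborted transactions are dropped.
    final class TransactionStitchingConsumer {
      // Hypothetical event shape: which table changed, in which transaction.
      record TableEvent(String transactionId, String tableName) {}

      private final Map<String, List<TableEvent>> byTransaction = new HashMap<>();

      void onTableEvent(TableEvent event) {
        byTransaction
            .computeIfAbsent(event.transactionId(), k -> new ArrayList<>())
            .add(event);
      }

      // Called when the final "transaction committed" signal is observed;
      // returns the per-table events that are now safe to act on.
      List<TableEvent> onTransactionCommitted(String transactionId) {
        List<TableEvent> events = byTransaction.remove(transactionId);
        return events != null ? events : List.of();
      }

      // Called when the transaction is known to have been aborted; the
      // buffered events are discarded since the changes were never persisted.
      void onTransactionAborted(String transactionId) {
        byTransaction.remove(transactionId);
      }
    }

As Yufei notes, this pushes the filtering burden onto the consumer, which is why buffering before emission (option 1) may be the simpler contract for Polaris itself to offer.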
