Hi Dmitri,

I've tried to make the change as less intrusive as possible, so I don't
think there is a SPI change here - but I'll let you be the judge of that
once I release the PR (hopefully later tonight). That being said, I'm not
sure a new README is also warranted either - but let's chat about this on
the PR itself. I'm happy to add one, but just not sure where we would add
it and what that content it would contain.

Best,
Adnan Hemani

On Mon, Nov 17, 2025 at 9:00 AM Dmitri Bourlatchkov <[email protected]>
wrote:

> Thanks for looking into this, Adnan!
>
> Regardless of whether we go with emitting events only for successful
> commits or emitting phantom events with a separate "commit" events, I
> believe it is very important to clarify events behaviour with respect to
> REST API-side table changes (commits), and specifically in case of
> multi-table commits.
>
> From my side, I'm not attached to any particular implementation approach
> (my preference stated previously), but I do believe that event consumers
> should be able to reconstruct what actually changed (got committed) in the
> catalog with relative ease (i.e. not requiring extensive post-processing of
> events data).
>
> I'd propose to add a dedicated README or docs page for that... Ideally in
> the same PR that proposes a fix for this issue (so that the impl. code
> changes could be reviews with the intended semantics in mind). WDYT?
>
> Also, since it's going to be an API/SPI change in the end (I'm pretty
> sure). Let's reserve some extra review time for code and docs changes. I'd
> expect the design discussion may reopen once we have a concrete proposal in
> place.
>
> Thanks,
> Dmitri.
>
> On Fri, Nov 14, 2025 at 6:38 PM Adnan Hemani
> <[email protected]> wrote:
>
> > Hi all,
> >
> > Seems like this thread is going stale, so I'm reviving it to ensure we
> have
> > consensus on one side or another.
> >
> > Currently, the vote stands as follows (please correct me if I'm wrong):
> > * 1 vote for only emitting an event if the data is successfully committed
> > to the table [Dmitri]
> > * 1 vote for emitting a third type of event during a staged commit and
> then
> > emitting the final AfterTableCommitEvent at the very end [Eric]
> > * 2 votes indifferent [Adnan, Yufei]
> >
> > In either case, it seems we will need to ensure that
> AfterTableCommitEvent
> > only emits if the commit is applied rather than staged, so I will look
> into
> > that. But if anyone wants to break the tie here for emitting a third
> event
> > type for a "staged commit", please do so. Else, I can lean towards adding
> > the third event type and revert it out later, if needed.
> >
> > Best,
> > Adnan Hemani
> >
> > On Fri, Nov 7, 2025 at 11:39 AM Yufei Gu <[email protected]> wrote:
> >
> > > Agreed that exactly-once delivery is extremely hard to achieve in
> > > distributed systems because it requires atomic coordination across
> > > producers, brokers, and consumers, essentially a distributed
> transaction.
> > > Even frameworks that claim “exactly once” (like Kafka or Flink) rely on
> > > idempotent processing and transactional checkpoints rather than true
> > > one-time delivery. For Polaris, “at-least-once + idempotence” should be
> > > more than sufficient(for auditing) without adding unnecessary
> complexity.
> > > "Best-effort" is also acceptable for use cases like monitoring.
> > >
> > > Yufei
> > >
> > >
> > > On Fri, Nov 7, 2025 at 12:39 AM Adnan Hemani
> > > <[email protected]> wrote:
> > >
> > > > > From my POV, Polaris should deliver catalog-level events only for
> > > > committed data.
> > > >
> > > > Thank you for sharing this opinion, Dmitri! As I previously
> mentioned,
> > I
> > > > can see merits of both ways so I don't have a hard opinion or leaning
> > on
> > > > this topic. I'm happy to tally the votes on this thread instead :)
> > > >
> > > > > We may want to consider delivering events for Storage-side
> operations
> > > > performed by Polaris (e.g. when it writes a new file). These events
> may
> > > be
> > > > useful for driving async clean-up workflows. However, from my
> > > perspective,
> > > > these events would be distinct from the "catalog" events that
> currently
> > > > exist in Polaris.
> > > >
> > > > This is a good idea IMO. We should totally work on this!
> > > >
> > > > Best,
> > > > Adnan Hemani
> > > >
> > > > On Thu, Nov 6, 2025 at 11:29 AM Dmitri Bourlatchkov <
> [email protected]>
> > > > wrote:
> > > >
> > > > > It looks like this discussion is going in the direction where we
> need
> > > to
> > > > > define the meaning of "before" / "after" events and expectations
> for
> > > > their
> > > > > listener callbacks.
> > > > >
> > > > > Actually the latter point probably affects the former. If events
> are
> > > > > delivered only for committed data, I suppose the "before" / "after"
> > > > > distinction will become irrelevant.
> > > > >
> > > > > From my POV, Polaris should deliver catalog-level events only for
> > > > committed
> > > > > data.
> > > > >
> > > > > Additionally, we can probably be lax about always delivering
> events.
> > > That
> > > > > is to say that events for some changes may be lost. Note that,
> events
> > > may
> > > > > be lost even with the current state of the codebase and I generally
> > do
> > > > not
> > > > > think Polaris should venture into the "exactly once" delivery
> > > territory.
> > > > >
> > > > > We may want to consider delivering events for Storage-side
> operations
> > > > > performed by Polaris (e.g. when it writes a new file). These events
> > may
> > > > be
> > > > > useful for driving async clean-up workflows. However, from my
> > > > perspective,
> > > > > these events would be distinct from the "catalog" events that
> > currently
> > > > > exist in Polaris.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > Cheers,
> > > > > Dmitri.
> > > > >
> > > > > On Wed, Nov 5, 2025 at 8:44 PM Adnan Hemani
> > > > > <[email protected]> wrote:
> > > > >
> > > > > > Eric's suggestion also makes sense to me from the perspective of
> > how
> > > > > > Events behave today - and it may actually be a lot easier to
> > > implement,
> > > > > > including adding a third event for when the transaction is
> > committed.
> > > > > >
> > > > > > I guess the question really comes down to: will it be confusing
> for
> > > > users
> > > > > > to see "AfterCommittedTableEvent" (with a parameter of "PENDING")
> > > even
> > > > > > though the transaction was rolled back? I can kind of see both
> > sides
> > > > > > here...would like to hear the rest of the community's thoughts on
> > > this.
> > > > > >
> > > > > > Best,
> > > > > > Adnan Hemani
> > > > > >
> > > > > > On Wed, Nov 5, 2025 at 3:27 PM Eric Maynard <
> > > [email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Option (2) is definitely more in line with the original design
> > for
> > > > > events
> > > > > > > -- we could add a third event when the transaction is
> committed,
> > > and
> > > > > then
> > > > > > > the onus is on either the listener impl or the consumer of the
> > > > listener
> > > > > > > impl's output (whatever it may be) to stitch together the
> events
> > > > based
> > > > > on
> > > > > > > the transaction ID or something else.
> > > > > > >
> > > > > > > --EM
> > > > > > >
> > > > > > > On Wed, Nov 5, 2025 at 2:54 PM Dmitri Bourlatchkov <
> > > [email protected]
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Adnan,
> > > > > > > >
> > > > > > > > I agree that this is a bug. Here's the GH issue:
> > > > > > > > https://github.com/apache/polaris/issues/2990
> > > > > > > >
> > > > > > > > I personally do _not_ think the onus for fixing this is on
> you
> > :)
> > > > The
> > > > > > > code
> > > > > > > > pattern that led to this issue has existed in Apache Polaris
> > code
> > > > > since
> > > > > > > day
> > > > > > > > 1... but if you want to give it a try, please feel free.
> > > > > > > >
> > > > > > > > Option 1 from Yufei's email sounds like the right general
> > > approach
> > > > to
> > > > > > > > fixing this problem. I also agree that actual implementation
> > may
> > > be
> > > > > > > tricky.
> > > > > > > > It may be worth discussing an outline of a fix in this thread
> > > > before
> > > > > > > > starting a PR just to avoid rework in case there's no
> consensus
> > > on
> > > > > fix
> > > > > > > > details. I'm specifically concerned about the APIs used in
> this
> > > > case
> > > > > > and
> > > > > > > > how Iceberg catalog transactions map to atomic Persistence
> > > > operations
> > > > > > > (cf.
> > > > > > > > [1]).
> > > > > > > >
> > > > > > > > [1]
> > > > https://lists.apache.org/thread/rf5orxs815zs4h64p4rwp03q3pbgxb5r
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Dmitri.
> > > > > > > >
> > > > > > > > On Wed, Nov 5, 2025 at 4:57 PM Adnan Hemani
> > > > > > > > <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Dmitri,
> > > > > > > > >
> > > > > > > > > Yes, this is indeed a bug and we should create a GitHub
> issue
> > > for
> > > > > > this
> > > > > > > to
> > > > > > > > > track fixing it. Ideally, we should be doing something
> > similar
> > > to
> > > > > > > Option
> > > > > > > > 1
> > > > > > > > > of what Yufei suggested - but how this will be done may be
> > > harder
> > > > > > than
> > > > > > > it
> > > > > > > > > looks. I can investigate this and I encourage others to
> look
> > > into
> > > > > it
> > > > > > as
> > > > > > > > > well, if this is of interest.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Adnan Hemani
> > > > > > > > >
> > > > > > > > > On Wed, Nov 5, 2025 at 8:29 AM Yufei Gu <
> > [email protected]>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Dmitri,
> > > > > > > > > >
> > > > > > > > > > Good catch! This is a subtle but important issue. Thanks
> > for
> > > > > > raising
> > > > > > > > it.
> > > > > > > > > I
> > > > > > > > > > think we could handle it in a few ways:
> > > > > > > > > >
> > > > > > > > > > 1. Hold the event emission until the all multi-table
> commit
> > > > > > succeeds,
> > > > > > > > and
> > > > > > > > > > buffering per-table events until persistence confirms
> > > success.
> > > > > > > > > > 2. Include a transaction ID and status (e.g., pending,
> > > > committed,
> > > > > > > > > aborted)
> > > > > > > > > > in emitted events so consumers can filter accordingly.
> This
> > > > will
> > > > > > add
> > > > > > > > > burden
> > > > > > > > > > to downstreams. I think we could figure out a way to
> filter
> > > out
> > > > > > while
> > > > > > > > > > persisting them, so that the real consumers won't see the
> > > > events
> > > > > > with
> > > > > > > > > > aborted status.
> > > > > > > > > >
> > > > > > > > > > I'm not sure which way is better at this moment, we need
> to
> > > > take
> > > > > a
> > > > > > > deep
> > > > > > > > > > look to evaluate both.
> > > > > > > > > >
> > > > > > > > > > Yufei
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Nov 5, 2025 at 8:06 AM Dmitri Bourlatchkov <
> > > > > > [email protected]
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi All,
> > > > > > > > > > >
> > > > > > > > > > > I'd like to highlight an aspect of the current Events
> > > > behaviour
> > > > > > > with
> > > > > > > > > > > respect to multi-table commits that Christopher and I
> > > > > discovered
> > > > > > > > today.
> > > > > > > > > > The
> > > > > > > > > > > issue existed since the first Events implementation,
> > AFAIK,
> > > > it
> > > > > > just
> > > > > > > > did
> > > > > > > > > > not
> > > > > > > > > > > come into focus until now, AFAIK.
> > > > > > > > > > >
> > > > > > > > > > > Consider a multi-table commit request [1].
> > > > > IcebergCatalogHandler
> > > > > > > will
> > > > > > > > > run
> > > > > > > > > > > each individual table through a separate commit
> operation
> > > on
> > > > > the
> > > > > > > base
> > > > > > > > > > > catalog [2]. The base catalog will issue events for
> each
> > > > table
> > > > > > > > > separately
> > > > > > > > > > > [3][4]. However, the overall commit to Polaris
> > Persistence
> > > > may
> > > > > > > still
> > > > > > > > > > fail,
> > > > > > > > > > > e.g. due to concurrent updates [5].
> > > > > > > > > > >
> > > > > > > > > > > Now, we can have a situation when both before/after
> > events
> > > > are
> > > > > > > > > delivered
> > > > > > > > > > > for a table, but the actual change that triggered the
> > > events
> > > > is
> > > > > > > _not_
> > > > > > > > > > > persisted, therefore does not exist in the current
> state
> > of
> > > > the
> > > > > > > > Polaris
> > > > > > > > > > > catalog.
> > > > > > > > > > >
> > > > > > > > > > > Thoughts?
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalogHandler.java#L973
> > > > > > > > > > >
> > > > > > > > > > > [2]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalogHandler.java#L1051
> > > > > > > > > > >
> > > > > > > > > > > [3]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java#L1405
> > > > > > > > > > >
> > > > > > > > > > > [4]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java#L1558
> > > > > > > > > > >
> > > > > > > > > > > [5]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/polaris/blob/f934443114251f85d18c9a51ed61fc49a500a61a/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalogHandler.java#L1058
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Dmitri.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to