Hi all,

> I propose the following (building on Alex's proposal) to move this 
> conversation forward: the new method signature would be 
> `Map<PolarisEvent.EventPropertyType, Object> attributes()`

I agree about the potential benefit of strongly-typed attribute keys.
While I initially suggested String for simplicity, I'm actually
leaning towards an AttributeKey approach, similar to Netty [1]. The
concern with using an enum is that it might restrict users from
defining their own custom attributes. But that's more an
implementation detail.
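
To make the idea a bit more concrete, here is a minimal sketch of what a typed key could look like. It is only loosely inspired by the idea behind Netty's AttributeKey and does not reproduce its API; the names `AttributeKey.of`, `EventAttributes` and `StandardKeys` are hypothetical and exist neither in Polaris nor in Netty.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

/** Hypothetical typed attribute key: a name plus the value type it carries. */
final class AttributeKey<T> {
  private final String name;
  private final Class<T> type;

  private AttributeKey(String name, Class<T> type) {
    this.name = Objects.requireNonNull(name);
    this.type = Objects.requireNonNull(type);
  }

  /** Anyone can create a key, so custom attributes are not restricted as with an enum. */
  static <T> AttributeKey<T> of(String name, Class<T> type) {
    return new AttributeKey<>(name, type);
  }

  String name() {
    return name;
  }

  T cast(Object value) {
    return type.cast(value);
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof AttributeKey)) {
      return false;
    }
    AttributeKey<?> other = (AttributeKey<?>) o;
    return name.equals(other.name) && type.equals(other.type);
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, type);
  }
}

/** Hypothetical container standing in for the proposed attributes() view. */
final class EventAttributes {
  private final Map<AttributeKey<?>, Object> values = new HashMap<>();

  /** The compiler rejects a value whose type does not match the key. */
  <T> EventAttributes put(AttributeKey<T> key, T value) {
    values.put(key, Objects.requireNonNull(value));
    return this;
  }

  /** Unset attributes are simply absent. */
  <T> Optional<T> get(AttributeKey<T> key) {
    return Optional.ofNullable(values.get(key)).map(key::cast);
  }
}

/** Hypothetical set of well-known keys shipped with the project. */
final class StandardKeys {
  static final AttributeKey<String> CATALOG_NAME =
      AttributeKey.of("catalog_name", String.class);

  private StandardKeys() {}
}
```

With this shape, the well-known keys stay centralized (which addresses the typo concern), while a user can still declare something like `AttributeKey.of("my_plugin.retry_count", Integer.class)` without touching Polaris code.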

> All other events that only generate an "after" metadata object should store 
> their metadata in "metadataAfter" and leave "metadataBefore" as unset, just 
> like any other unused property.

I have no issues with that logic.

(But I am surprised by the current design where "before" state
information is included in "after" events, and "after" state
information is included in "before" events. Given the substantial size
of objects like TableMetadata, this dual inclusion looks redundant. It
should be possible instead to correlate the before event with its
after counterpart and build a before/after diff of the change, if
desired. But that's a different topic.)
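
For illustration only, correlating the two events could be sketched as below. This assumes that each before/after pair shares a correlation ID, which is a hypothetical field, and it uses a plain `Map<String, String>` as a stand-in for TableMetadata. `CommitEvent` and `BeforeAfterCorrelator` are made-up names, not existing Polaris classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical minimal event carrying only its own snapshot of the metadata. */
final class CommitEvent {
  enum Phase { BEFORE, AFTER }

  final Phase phase;
  final String correlationId; // assumption: shared by the before/after pair
  final Map<String, String> metadata; // stand-in for TableMetadata

  CommitEvent(Phase phase, String correlationId, Map<String, String> metadata) {
    this.phase = phase;
    this.correlationId = correlationId;
    this.metadata = metadata;
  }
}

/** Remembers "before" snapshots and diffs them against the matching "after" event. */
final class BeforeAfterCorrelator {
  private final Map<String, Map<String, String>> pendingBefore = new HashMap<>();

  /**
   * Returns the entries that were added or changed once the AFTER event arrives.
   * Returns empty for BEFORE events and for AFTER events without a stored BEFORE.
   * Removed entries are ignored to keep the sketch short.
   */
  Optional<Map<String, String>> onEvent(CommitEvent event) {
    if (event.phase == CommitEvent.Phase.BEFORE) {
      pendingBefore.put(event.correlationId, event.metadata);
      return Optional.empty();
    }
    Map<String, String> before = pendingBefore.remove(event.correlationId);
    if (before == null) {
      return Optional.empty();
    }
    Map<String, String> changed = new HashMap<>();
    for (Map.Entry<String, String> entry : event.metadata.entrySet()) {
      if (!entry.getValue().equals(before.get(entry.getKey()))) {
        changed.put(entry.getKey(), entry.getValue());
      }
    }
    return Optional.of(changed);
  }
}
```

A real listener would also need to evict "before" snapshots whose "after" event never arrives (for example, when the commit fails), which this sketch leaves out.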

Thanks,
Alex

[1]: 
https://github.com/netty/netty/blob/fc0d763ca983c8290d087ed2887f112963d812d2/common/src/main/java/io/netty/util/AttributeKey.java#L25

On Fri, Nov 14, 2025 at 6:18 PM Adnan Hemani
<[email protected]> wrote:
>
> Hi all,
>
> Very sorry for the late reply - this week has been busy. I was (and still
> somewhat am) in favor of strongly-typed events. I had formed my opinion
> based on other systems that do use their events later within their
> execution. It seems we do not have this use case yet, nor is it on the
> near horizon, as Dmitri has noted.
>
> However, my one remaining concern with keeping PolarisEvents as a flattened
> "bag of properties" is that, unless we have comprehensive per-event testing
> (which defeats the whole point of removing the strongly-typed events
> structure), we may be vulnerable to typos and inconsistent naming, which
> could effectively render the unified filtering/pruning mechanisms useless.
> As a result, I propose the following (building on Alex's proposal) to move
> this conversation forward: the new method signature would be
> `Map<PolarisEvent.EventPropertyType, Object> attributes()` where
> EventPropertyType is an enum defined within PolarisEvent and contains all
> the different types of properties an event could have.
>
> Edge case call-out: There will be special care needed for events such as
> (Before/After)CommitTableEvent, which have metadata objects for before AND
> after - but these can be modeled using two separate EventPropertyType
> objects: one for metadataBefore and one for metadataAfter. All other events
> that only generate an "after" metadata object should store their metadata
> in "metadataAfter" and leave "metadataBefore" as unset, just like any other
> unused property. This may slightly complicate the unified filtering/pruning
> logic - but this, IMO, is an acceptable balance.
>
> WDYT?
>
> Best,
> Adnan Hemani
>
> On Fri, Nov 14, 2025 at 1:48 AM Oleg Soloviov <[email protected]> wrote:
>
> > Hi all,
> >
> > It looks like we have a lazy consensus on this proposal. If that's the case
> > and there are no further objections, I would like to work on this one.
> >
> > Thanks,
> > Oleg
> >
> > On Sat, Nov 8, 2025 at 12:13 AM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> > > Hi Alex,
> > >
> > > I agree that using a flat (single class?) type hierarchy for events on
> > > the server side is reasonable. Polaris Server itself does not appear to
> > > "read" the events it produces, so maintaining the multitude of getters
> > > does seem like an unnecessary overhead. At the same time producing
> > > well-structured payloads for delivering events to external systems
> > > (including persistence in the Polaris database) can be achieved without
> > > a verbose type hierarchy.
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Fri, Nov 7, 2025 at 11:30 AM Alexandre Dutra <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'm writing to express my concerns about the current state of the
> > > > PolarisEvent API and to propose a solution.
> > > >
> > > > Current challenges:
> > > >
> > > > 1) Excessive complexity: the PolarisEvent interface currently has over
> > > > 150 concrete subtypes, with a corresponding number of methods in the
> > > > PolarisEventListener interface. This forces each concrete listener to
> > > > implement all 150+ methods, even when the logic is similar or
> > > > identical, leading to significant boilerplate (see example [1] from a
> > > > recent PR).
> > > >
> > > > 2) Manual processes: afaik the current plan for event pruning (e.g.,
> > > > removing sensitive or large data) is to implement this event by event.
> > > > > This has been a slow process so far. We only have 2-3 events
> > > > > implemented, and we still have 147 more to go.
> > > >
> > > > While I generally advocate for strongly typed APIs, I believe that in
> > > > this specific context, the PolarisEvent hierarchy is slowing down the
> > > > development of event-related features.
> > > >
> > > > Do we need so many subtypes? Events are very short-lived objects; they
> > > > are created, immediately passed to a listener, and then
> > > > garbage-collected. Besides, most listeners will likely apply the same
> > > > logic to all events (basically: serialize and dispatch). This hints at
> > > > > a type hierarchy that isn't useful to its main consumers.
> > > >
> > > > My proposal is to completely flatten the PolarisEvent hierarchy.
> > > > Instead of numerous concrete types, we would have a single
> > > > implementation. This implementation would expose the methods I'm
> > > > adding in [2], including type() which allows distinguishing events by
> > > > type ID.
> > > >
> > > > It would also expose a new method: Map<String, Object> attributes().
> > > >
> > > > An event factory would be responsible for creating events and
> > > > populating these attributes using a common set of well-defined, typed
> > > > attribute keys such as "catalog_name", "table_identifier",
> > > > "table_metadata", etc.
> > > >
> > > > This creates a schemaless-ish view of the event, which is ideal for
> > > > pruning and serialization. It would enable us to apply common rules
> > > > more efficiently. For example:
> > > >
> > > > > 1) All events containing the "table_metadata" attribute could
> > > > automatically apply a pruning logic to reduce its size.
> > > >
> > > > 2) All events containing a specific attribute could automatically have
> > > > sensitive data removed from its value.
> > > >
> > > > I'm curious to hear what the community thinks of this proposal.
> > > >
> > > > Thanks,
> > > > Alex
> > > >
> > > > [1]:
> > > >
> > >
> > https://github.com/vchag/polaris/blob/4c0aef587e63d5e60d657561a0a53701417f324b/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/AllEventsForwardingListener.java
> > > > [2]: https://github.com/apache/polaris/pull/2998
> > > >
> > >
> >
