Apache Polaris has users and not customers.

Generally speaking, pushing commercial interests to an Apache project would be very concerning.

Anything that goes into the production code base has to be considered production code, not an example. Asking users to implement something different to be able to use functionality for which an implementation already exists does not seem right.

Your claim that there were no "reasonable suggestions" is not correct. The option of a much simpler approach was mentioned (by Alex, IIRC) a while ago, but has not been considered until now. The option to go with just an interface without an implementation was proposed as well, but was also not considered.

The questions I mentioned in a previous reply were meant to help think through the approaches in the PR and figure out what can go wrong; I added technical statements about practically observed facts in that reply. I do not know what else I could say about these general issues.


On 24.06.25 01:33, Adnan Hemani wrote:
Thanks for these concrete concerns. The way I see it, the
file-buffer implementation is, again, a choice customers can make. If
they are using ephemeral storage or more complicated persistent volumes,
they still have the ability to write their own event
listener implementation and use that instead. This sample implementation
does not lock users into using this approach.

But on a related note, you raised many concerns in your previous email
regarding event durability for an auditing use case. If we are not able to
use a "commit-log" or "file-buffer" architecture due to the concerns in
this email, what is your proposed way to solve all of these concerns from
an out-of-the-box auditing functionality perspective? As Alex suggested on
the PR, going to persistence on every event is technically an option - but
I would assume there are heavy latency and/or reliability concerns there,
especially for read-based events. If there is not a heavy concern with
inserting event logs within the same transaction as a read/write query to
the persistence, then please clearly state that and I would be glad to
implement that instead. Additionally, what types of "necessary guard rails"
are you looking for when it comes to an on-heap buffer?
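
For what it's worth, here is roughly what "persist the event in the same transaction" would look like. This is a minimal plain-JDBC sketch for illustration only - it is not Polaris's actual persistence API, and the table and column names are made up:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // Minimal sketch, plain JDBC for illustration only; Polaris's persistence
    // layer looks different, and the table/column names here are hypothetical.
    class SameTransactionEventSketch {
        void commitTableUpdateWithEvent(Connection conn, String tableId,
                                        String metadataJson, String eventJson) throws SQLException {
            conn.setAutoCommit(false);
            try (PreparedStatement update = conn.prepareStatement(
                     "UPDATE tables SET metadata = ? WHERE id = ?");
                 PreparedStatement event = conn.prepareStatement(
                     "INSERT INTO events (table_id, payload, created_at) VALUES (?, ?, CURRENT_TIMESTAMP)")) {
                update.setString(1, metadataJson);
                update.setString(2, tableId);
                update.executeUpdate();
                event.setString(1, tableId);
                event.setString(2, eventJson);
                event.executeUpdate();
                conn.commit();   // table change and event become visible atomically
            } catch (SQLException e) {
                conn.rollback(); // neither the change nor the event is persisted
                throw e;
            }
        }
    }

The cost on write paths is one extra INSERT inside an already-open transaction; read paths would have to open a transaction (or use a separate path) just to record the event, which is where the latency and load question above comes from.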

"Tak[ing] a step back and rethink" is not really action-oriented and
doesn't help move the conversation forward - providing reasonable
suggestions that help resolve these concerns is what truly helps. Please
work with me to propose something that you wouldn't have concerns on.

Best,
Adnan Hemani

On Mon, Jun 23, 2025 at 3:15 AM Robert Stupp <sn...@snazy.de> wrote:

No need to apologize, but I think there are some quite important aspects
that have to be considered. Going into every explicit and implicit
detail is way too much for a single email, as it requires explaining
operating system and file system behaviors.

Since you asked for the "nitty gritty details":

The theoretical assumption that any "file write()" is immediately
persistent and can always be immediately and consistently read is just
wrong. Files do get truncated, files can have garbage content ...
nothing of that is handled. The fact that containers use ephemeral
storage means that nothing can be read back. Using persistent volumes is
another beast. And then it goes quite into file system performance and
configuration territory, plus the availability and performance aspects
of PVs. Other aspects like load-shedding and handling of file-system and
disk failures are not addressed at all.

With the approach proposed in the PR, users have to understand all that.
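
To make one of those points concrete: even a single durable append needs an explicit fsync, and a reader still has to cope with a torn last record. A minimal sketch (the length-prefixed record layout and names are illustrative, not from the PR):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    // Minimal sketch of a durable append; the record layout is illustrative only
    // and not what the PR implements.
    class DurableAppendSketch {
        void appendDurably(Path bufferFile, byte[] record) throws IOException {
            try (FileChannel ch = FileChannel.open(bufferFile,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
                ByteBuffer buf = ByteBuffer.allocate(4 + record.length);
                buf.putInt(record.length); // length prefix so a reader can detect a torn tail
                buf.put(record);
                buf.flip();
                while (buf.hasRemaining()) {
                    ch.write(buf);         // write() alone may only reach the OS page cache
                }
                ch.force(true);            // without fsync the data can be lost on a crash
            }
        }
        // On read-back, a truncated or garbage last record, ephemeral container
        // storage, PV performance, disk-full and I/O errors all still need handling.
    }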

Your proposal wrt big payloads to "just return 'null'" would violate
the current Iceberg events proposal - the fields for 'metadata' and
'updates' are mandatory. Truncation is also not an option as it would
falsify/corrupt data.

It's been mentioned on the PR to "take a step back and rethink". Let's
please consider the feedback on the PR.

Polaris today does _not_ require anything to be persisted locally - and
having the clear separation of storage and compute is good. I am
strongly against adding anything that is a (file-system based)
commit-log. I am also strongly against adding an on-heap buffer w/o
adding the necessary guard rails.

I hope the above helps avoid unnecessarily long review round trips.


On 21.06.25 01:32, Adnan Hemani wrote:
I also think that it is frustrating for reviewers when concerns are not
addressed.
But there are strong and serious objections, not just from me, around
the technical approach. These objections have not been addressed.
I generally agree with these statements regarding frustration from
unresolved concerns/objections, as I also maintain other Open Source
projects - but I’d like to see any comments on any mailing thread or
PRs regarding this proposal where this is the case. I’ll apologize in
advance if I missed anything - but I believe, all objections had been
duly responded to and/or resolved in a prompt manner.

As a project we should strive for solutions that our users can safely
use without having to understand complex details. Adding more knobs
that users must understand before they can use Polaris makes the
adoption of Polaris too difficult. The most successful projects and
products do not have any mandatory configuration options that require
reading and understanding (lots of) documentation to get started,
solely because things are self-explanatory and easy to use.
I agree with you here - /mandatory/ configuration options should
/not/ be required for this feature; they always increase the user’s
mental model complexity and should be used as sparingly as possible.
However, I am not introducing any /mandatory/ configuration options
here; configuring the one line of which Event Listener implementation
the user wants already exists. All configuration options regarding the
Event Listener implementations have conservative default values (as I
stated in the previous email) that only “power-users” will want to
tinker with. Providing additional /optional/ configurations for users
rarely increases the general user’s mental model complexity, as general
users will not need or use these options.

On top: there is a tight relation to the Iceberg proposal, which has
serious implications for the persistence (writes and queries) of these
events: table/view metadata and updates to it can be large, and the
presence of these huge per-event payload attributes is required by the
Iceberg proposal as it stands. You raised that
concern as well. There is still no answer to or consensus on that yet.
As I stated in my previous email, I believe we have identified
workarounds for this - and I don’t see that these workarounds are
entirely unreasonable. To be clear, let me list some of these
workarounds out in more details here and we can debate why none of
these workarounds ultimately can work for Polaris:
* Don’t store fields within payloads above a certain size, such as the
major concern of `TableMetadata` objects. Return `null` to users
who query for this particular event (a rough sketch of this follows
the list below).
* Truncate the specific fields within larger objects, such as the
`snapshot` field within `TableMetadata` objects, if the overall object
size exceeds a certain threshold.
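
To illustrate the first workaround, a rough sketch (the size threshold and names below are made up, not taken from PR 1844 or the Iceberg proposal):

    import java.nio.charset.StandardCharsets;

    // Minimal sketch of size-gated payload storage; the threshold and names are
    // hypothetical and not taken from PR 1844 or the Iceberg events proposal.
    class PayloadSizeGateSketch {
        private static final int MAX_PAYLOAD_BYTES = 1_000_000; // hypothetical limit

        String payloadToStore(String metadataJson) {
            if (metadataJson == null) {
                return null;
            }
            int size = metadataJson.getBytes(StandardCharsets.UTF_8).length;
            // Above the threshold, store nothing; readers of this event later see null.
            return size > MAX_PAYLOAD_BYTES ? null : metadataJson;
        }
    }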

Auditing is a mechanism to (later) inspect what happened, who did
what, etc etc., which means that auditing has strong consistency and
ordering requirements. These requirements can, in general, not be met
with the guarantees mentioned in the Iceberg proposal, and in
particular not with the proposed implementation.
I’d like to peel this back a bit: what do you mean by "ordering
requirements"? As I mentioned in my previous email, ordering
requirements for showing events back to customers can still be met -
but not ordering guarantees for event ingestion. And I don’t see
ordering guarantees on event ingestion as a requirement, and I have still
not gotten any good reasoning why this should be the case.
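
Concretely, read-time ordering here just means sorting on the stored event creation timestamp when the events are queried back; a rough sketch (table and column names are illustrative only):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Minimal sketch: order events at read time by their stored creation timestamp.
    // Table and column names are illustrative only.
    class OrderedEventQuerySketch {
        void printEventsInOrder(Connection conn, String catalogId) throws SQLException {
            String sql = "SELECT event_id, payload, created_at FROM events "
                       + "WHERE catalog_id = ? ORDER BY created_at, event_id";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, catalogId);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getTimestamp("created_at") + " " + rs.getString("event_id"));
                    }
                }
            }
        }
        // Ingestion order is not guaranteed: concurrent instances can insert out of
        // order, and clock skew can reorder events that are close together in time.
    }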

For consistency, I believe we have done the best we can with the
proposed implementation for an open-source deployment. I would
actually say it is almost impossible to guarantee SLAs or consistency
for any sort of open-source deployment where the open-source project
does not control all infrastructure underneath or around the
open-source project. It is exactly for this reason that many users go
towards managed deployments of open source projects, where other
companies are paid for the upkeep of all infrastructure and are on the
hook for providing such guarantees.

But, I do not think this then means that there cannot be a
best-effort, out-of-the-box experience that an open-source project
gives as an optional feature. If you feel that there is some design
decision that I’ve made that has severe ramifications for the
customer, let’s definitely discuss those specifically rather than
general commentary that does not move this proposal forward.

Some questions I find very useful to think about before starting with
a technical approach:
* What happens if the process crashes?
* What happens during network hiccups?
* What happens to the server/disk/database/network/service if there are
(too) many concurrent requests?
* What happens during and after a STW GC phase?
* How do things behave in a horizontally scaled setup?
* How do things behave in a geo-distributed setup?
* How do things recover from x/y/z?
There are more questions, but those are important detailed ones.
Those are not "isolated" questions, but rather relate to each other.
And those are just the lower level ones, not even the higher level
ones like UX, use cases, SLAs, etc.
Thanks for the specific questions here - I’m glad to answer all of
them in the context of the proposed PR that’s been introduced and
specifically with the file-based buffer implementation. The in-memory
buffer implementation obviously has large drawbacks for consistency
but is there to serve users who absolutely will not accept any writing
to the disk but would still like best-effort events.
* What happens if the process crashes?
Events remain in the file buffers and will be flushed to persistence
once Polaris and/or the Persistence recover.
* What happens during network hiccups?
* What happens to the server/disk/database/network/service if there are
(too) many concurrent requests?
File buffers will not delete buffered events until they have been
successfully flushed to persistence. No events will be lost; flushing
will simply continue to be retried. Incoming events are written to
disk as they are generated, and the optional customer configuration to
dump to persistence after a certain number of events helps bail users
out if too many events are generated too quickly.
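
A rough sketch of that retain-until-flushed behavior (the flush target is passed in as a plain function here; this is not the PR's actual API, and the "*.events" pattern is hypothetical):

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.function.Consumer;

    // Minimal sketch: buffered event files are deleted only after a successful
    // flush to persistence; on failure they stay on disk and a later scheduled
    // run retries them.
    class BufferFlushSketch {
        void flushBufferedEvents(Path bufferDir, Consumer<Path> flushToPersistence) throws IOException {
            try (DirectoryStream<Path> files = Files.newDirectoryStream(bufferDir, "*.events")) {
                for (Path file : files) {
                    try {
                        flushToPersistence.accept(file); // may fail on network/DB hiccups
                        Files.delete(file);              // delete only after a successful flush
                    } catch (Exception e) {
                        // keep the file on disk; the next run retries it
                    }
                }
            }
        }
    }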
* What happens during and after a STW GC phase?
In the current implementation, the same thing that would happen to the
service as a whole - everything pauses. The threads running as part of
the file-buffer event listener will also be paused and will resume
when the JVM allows it to alongside the rest of the service.
* How do things behave in a horizontally scaled setup?
* How do things behave in a geo-distributed setup?
Each buffer event listener operates only on its own Polaris instance.
Each buffer event listener is, as a result, also responsible for only
its own set of buffers. As long as it can connect to the persistence
instance that it is supposed to communicate with (which is a base
requirement for the entire Polaris service in general), there are no
issues in managing a horizontally-scaled or geo-distributed set of
buffers. However, this does give me one good idea - what do we do in
case this particular Polaris instance shuts down? I believe we can add
a shut-down cleanup for these buffers. Please make a comment on the PR
regarding this and I can begin investigation on this.
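
For reference, one way such a shutdown flush could look (sketched with a plain JVM shutdown hook and a hypothetical flushAllBuffers callback; in practice this would hook into the server's lifecycle instead):

    // Minimal sketch: flush whatever is still buffered before the instance exits.
    // A hard kill can still skip this, so the on-disk buffer remains the real
    // safety net. flushAllBuffers is a hypothetical callback.
    class ShutdownFlushSketch {
        void registerShutdownFlush(Runnable flushAllBuffers) {
            Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                try {
                    flushAllBuffers.run();
                } catch (RuntimeException e) {
                    // best effort only; anything left stays in the file buffer
                }
            }, "event-buffer-shutdown-flush"));
        }
    }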

PS: I can go into the nitty gritty details, but I think it's worth
considering the above first.
I believe I’ve responded to all the above questions/concerns. Please
do go deeper into the details - I’d prefer as little conversation
latency as possible, so please do list all concerns as thoroughly as
you can. Going through fractional concerns little-by-little will only
make resolving them take unnecessarily long.

Best,
Adnan Hemani

On Jun 20, 2025, at 6:01 AM, Robert Stupp <sn...@snazy.de> wrote:

Let me first second your point on frustration about long-standing
proposals - I completely feel that pain. I also think that it is
frustrating for reviewers when concerns are not addressed. But it is
also worth noting that getting to a consensus takes time. Getting
something into an OSS project can take a very long time. People have
many things on their plate, not only the "review of my particular
effort". This is very different from "single entity owned" and
"closed source" projects. It is important to keep that in mind.

As a project we should strive for solutions that our users can safely
use without having to understand complex details. Adding more knobs
that users must understand before they can use Polaris makes the
adoption of Polaris too difficult. The most successful projects and
products do not have any mandatory configuration options that require
reading and understanding (lots of) documentation to get started,
solely because things are self-explanatory and easy to use.

Nobody objects to the effort of having events in Polaris.

But there are strong and serious objections, not just from me, around
the technical approach. These objections have not been addressed.

On top: there is a tight relation to the Iceberg proposal, which has
serious implications for the persistence (writes and queries) of these
events: table/view metadata and updates to it can be large, and the
presence of these huge per-event payload attributes is required by the
Iceberg proposal as it stands. You raised that
concern as well. There is still no answer to or consensus on that yet.

Auditing is a mechanism to (later) inspect what happened, who did
what, etc etc., which means that auditing has strong consistency and
ordering requirements. These requirements can, in general, not be met
with the guarantees mentioned in the Iceberg proposal, and in
particular not with the proposed implementation.

Some questions I find very useful to think about before starting with
a technical approach:
* What happens if the process crashes?
* What happens during network hiccups?
* What happens to the server/disk/database/network/service if there are
(too) many concurrent requests?
* What happens during and after a STW GC phase?
* How do things behave in a horizontally scaled setup?
* How do things behave in a geo-distributed setup?
* How do things recover from x/y/z?
There are more questions, but those are important detailed ones.
Those are not "isolated" questions, but rather relate to each other.
And those are just the lower level ones, not even the higher level
ones like UX, use cases, SLAs, etc.

Building technically challenging things is often fun and probably the
reason why we build code. Our users only want things to be "boring"
in the sense of "it just works". Our job is to make the complex
and complicated things boring.

Robert

PS: I can go into the nitty gritty details, but I think it's worth
considering the above first.


On 19.06.25 04:39, Adnan Hemani wrote:
First of all, sorry all for the misformatting in my previous email -
seems my mail client wasn’t playing well with the Apache mail server.

Adding that there's been consensus in the meeting to start with a
pure Java interface and go from there.
I’m not sure what this means - can you expand on this? I can't agree
there’s a consensus on this unless we’re all clear as to what this
actually means.

I'm not sure that the statement "Ordering guarantees are **only**
possible ... event creation time" (emphasis mine) is correct.
This is in the context of the implementation that was shared on the PR.
In that context, there is no guarantee on ordering - and I don’t
agree that there is a good reason why this is a hard requirement
when it is still possible to show results in an ordered manner to
users.
During the meeting I mentioned that I strongly believe that it is
not a good idea to let users (as Apache Polaris we do NOT have
customers) figure out constraints and limitations and issues on
their own.
Your objection is noted; but event listeners are inherently
configurable by users - and this level of “power-user features” is,
and always should be, configurable by the users themselves. As part of
my implementation, I’ve provided very conservative default values -
but users are free to modify them as they require. I don’t think there is
anything wrong with providing reasonable defaults and then
letting users decide based on their knowledge of their usage patterns.

Neither the Iceberg proposal nor PR 1844 are suitable for the
auditing use case, because auditing is a _very_ different beast and
implies way more requirements than "just store some data w/o any
consistency guarantees".
Happy to discuss what other requirements are not being fulfilled
here for auditing use cases. The way I look at it, this
implementation provides the most reasonable, resilient approach
without creating a crazy amount of infrastructure for
an out-of-the-box experience. An implementation involving
Kafka/message broker to fire-and-forget events (and rely on Kafka to
sink the events properly) would, of course, be much more resilient -
but no longer works out of the box for Apache Polaris. And to
underscore the above point - the choice of which event listener to
use is again on the user. They may choose to use one of the
implementations out of the box - or write one of their own - or not
to enable events/auditing altogether. It should ultimately, always
be, the user's choice.

There are still concerns mentioned in the Iceberg events proposal,
with a huge impact to this effort. So I strongly believe that both
efforts are very tightly coupled and not orthogonal efforts that
could be handled independently. For example, I do not see how the
"metadata payload size concern" is mitigated in PR 1844. We had the
same discussion around "caching Iceberg metadata in Polaris".
This is something I’ve brought up in the Iceberg proposal as well
and would like to see changed; but ultimately, the outcome does not
force a drastic change here either way. We have a variety of
workarounds on this particular topic (such as not storing the larger
metadata payloads altogether and showing a null object to the user -
or truncating the larger fields, etc.). I don’t find making a decision
on this to be required for an MVP change - but if you feel that it is,
please bring up all topics that fit this criterion and I am happy to
discuss and service them as required.

During the meeting some people raised concerns about the
"buffering" that is strictly speaking not a necessity, but also
introduced in 1844. That introduces additional consistency issues,
additional risk of losing events and additional ordering issues.
That is a very different problem than "just storing some data".
I believe it was mostly Alex who brought up this concern; as per
a different mailing thread, we had already discussed this:
https://lists.apache.org/thread/fqfsy03855rv3mwscol3qnxnf4xcnc3v

I still don’t agree that we can write back to the persistence on
read-generated events as well on a per-call basis. But if that’s
what it will take to get this PR merged, I’m happy to remove the
buffer implementations and write back to persistence on every event
generated. What I will not take responsibility for, in that case, is
potential user complaints on latency and DB load once they enable
writing back to the persistence (on either write-only or read-write
event generation). Please let me know how you and Alex would like to
proceed.

To also add context, the above mailing list thread has been open for
over a month detailing all of this and did not receive these
concerns at any time. It is immensely frustrating
that contributors follow all the processes recommended - yet still
end up with the possibility of wasted effort at the 11th hour.

Best,
Adnan Hemani

On Jun 18, 2025, at 3:40 AM, Robert Stupp <sn...@snazy.de> wrote:

Adding that there's been consensus in the meeting to start with a
pure Java interface and go from there.

I'm not sure that the statement "Ordering guarantees are **only**
possible ... event creation time" (emphasis mine) is correct.

During the meeting I mentioned that I strongly believe that it is
not a good idea to let users (as Apache Polaris we do NOT have
customers) figure out constraints and limitations and issues on
their own.

Neither the Iceberg proposal nor PR 1844 are suitable for the
auditing use case, because auditing is a _very_ different beast and
implies way more requirements than "just store some data w/o any
consistency guarantees".

There are still concerns mentioned in the Iceberg events proposal,
with a huge impact to this effort. So I strongly believe that both
efforts are very tightly coupled and not orthogonal efforts that
could be handled independently. For example, I do not see how the
"metadata payload size concern" is mitigated in PR 1844. We had the
same discussion around "caching Iceberg metadata in Polaris".

During the meeting some people raised concerns about the
"buffering" that is strictly speaking not a necessity, but also
introduced in 1844. That introduces additional consistency issues,
additional risk of losing events and additional ordering issues.
That is a very different problem than "just storing some data".


On 17.06.25 21:16, Adnan Hemani wrote:
Hi everyone,

In lieu of a recording of today’s Community Sync on Events, I am
posting some notes regarding what was discussed:
What is the relationship between Iceberg Events API and Polaris
Events, which are proposed
in https://github.com/apache/polaris/pull/1844?
Persisting Polaris events is a pre-requisite of the Iceberg
Events API - but is not strictly tied to it. Users could find
value in being able to persist the Polaris Events without using
the Iceberg Events API.
What Query Patterns are we expecting?
Going based on the assumption that the Iceberg Events API will be
a primary consumer of the Polaris Events and that it is almost
finalized. The proposed data schema for events is designed to work
efficiently with the current state of the Iceberg Events API.
What’s the Intended Use-Case?
This will go out in a different email later today under the
original proposal thread to ensure all context is in the same
email thread.
If auditing is a potential use-case, then what guarantees are we
able to provide?
Ordering guarantees are only possible in that the event creation
time is listed with the Polaris Event. When querying Polaris
Events from the database, we can always sort events based on this
timestamp.
Durability guarantees can be found in some implementations - but
it is up to the customer which implementation they
choose and how they’d like to configure it. All
of these configurations are present in the PR as it stands today.
A potential Kafka implementation may help with these concerns -
but lacks an end-to-end customer experience within Polaris and may
be pushing the concerns forward to Kafka rather than solving them.
Unsure how this may work with Iceberg Events API in the future.
Can the PR be broken up further?
Yes, it is possible - but unclear what parts are not necessary at
this time. Community to review and make suggestions on the PR.

Next Steps/Action Items:
Community: to review PR as it stands and provide high-level
recommendations/suggestions
Adnan Hemani: Send email regarding intended use cases.
Adnan Hemani: To respond to all reviews on PRs.

Please do respond to this email with anything I may have missed
out on! Thanks to everyone who was able to make it to this
morning’s sync and for everyone’s contributions :)

Best,
Adnan Hemani


On Jun 13, 2025, at 4:43 PM, Adnan Hemani
<adnan.hem...@snowflake.com> wrote:

Hi all,

As we were not able to discuss at the previous community sync,
I’m setting a quick sync early next week to discuss Events in
Persistence
(PR:
https://github.com/apache/polaris/pull/1844).

Everyone is welcome to join and discuss on next steps here. Thanks!

Best,
ADNAN HEMANI

Polaris Community Sync on Events
Tuesday, June 17 · 9:00 – 9:30am
Time zone: America/Los_Angeles
Google Meet joining info
Video call link: https://meet.google.com/ear-kiij-sur
Or dial: (US) +1 402-410-2280 PIN: 350 919 847#
More phone numbers: https://tel.meet/ear-kiij-sur?pin=5036846369686
