> filtering all the reads to find the valuable changes is a
> headache I would guess most service owners would rather avoid.

The Iceberg Events API would let users specify filters, which I think would 
help filter out read events quite easily (with a single filter predicate). But 
once we start putting that functionality in place, we can revisit this and make 
sure this is the case.
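
As a rough illustration only, the kind of single-predicate filtering described above might look like the Java sketch below. The `CatalogEvent` record, its fields, and the operation names are hypothetical and used purely for illustration; the Events API spec is not final and may express filters quite differently.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical event shape, NOT taken from the Events API spec.
record CatalogEvent(String operationType, String tableName, boolean isRead) {}

final class EventFilterSketch {
    // One predicate that drops read events and keeps modifications.
    static final Predicate<CatalogEvent> WRITES_ONLY = e -> !e.isRead();

    // Returns only the events accepted by the given filter, in order.
    static List<CatalogEvent> apply(List<CatalogEvent> events,
                                    Predicate<CatalogEvent> filter) {
        return events.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Operation names here are made up for the example.
        List<CatalogEvent> events = List.of(
            new CatalogEvent("create-table", "db.t1", false),
            new CatalogEvent("load-table", "db.t1", true),
            new CatalogEvent("update-table", "db.t1", false));
        apply(events, WRITES_ONLY).forEach(System.out::println);
    }
}
```

Whether such a filter would run server-side or in the caller depends on how the spec ends up defining filters.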

Best,
Adnan Hemani

> On May 22, 2025, at 8:33 AM, Michael Collado <collado.m...@gmail.com> wrote:
> 
> I was going to make the same suggestion :) For deployments where the
> complexity of an extra buffer doesn’t make sense, a blocking persistence
> call will make things easier for the service owner. If read events are
> added, a high-throughput service will probably need the buffering. Though,
> I would be surprised if those events end up being served to callers for the
> /events API.  For read events, I think something like OpenLineage event
> reporting is a better option and most callers will want to know about
> modifications, so filtering all the reads to find the valuable changes is a
> headache I would guess most service owners would rather avoid.
> 
> Mike
> 
> On Thu, May 22, 2025 at 12:27 AM Adnan Hemani
> <adnan.hem...@snowflake.com.invalid> wrote:
> 
>> I haven’t thought of this in depth honestly - but I could see this being
>> the case.
>> 
>> -Adnan
>> 
>>> On May 21, 2025, at 4:39 PM, Eric Maynard <eric.w.mayn...@gmail.com> wrote:
>>> 
>>> -devlist
>>> 
>>> If we design by interface properly, it should be relatively easy to offer
>>> both a disk buffering and an always-write implementation, right?
>>> 
>>> On Thu, May 22, 2025 at 12:12 AM Adnan Hemani
>>> <adnan.hem...@snowflake.com.invalid> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> Thanks for sharing these thoughts. I’m also not completely sure about how
>>>> much we should care about how much slower things will be if we just make a
>>>> trip to the persistence on every write action. However, I’m building this
>>>> feature with the intention of being able to also support read event types
>>>> in the near future, if this is something that the customer is interested in
>>>> enabling using the `CustomOperation` type that is defined in the Events API
>>>> spec. Of course, this would need to be configured by the administrator, as
>>>> maintenance of the persistence is their responsibility.
>>>> 
>>>> Given that the Iceberg Events API spec has not yet merged and can still
>>>> see some changes, I’m planning to begin work on the disk buffering now and
>>>> wait for the Events API to finalize before working on the API side of the
>>>> end-to-end implementation.
>>>> 
>>>> Best,
>>>> Adnan Hemani
>>>> 
>>>>> On May 20, 2025, at 10:30 PM, Michael Collado <collado.m...@gmail.com> wrote:
>>>>> 
>>>>> That’s super interesting. Glad this is being worked on. Personally, I don’t
>>>>> know that the latency for writing events to a persistent storage is really
>>>>> all that concerning. Looking at the enum of supported operations, only
>>>>> write operations seem to trigger the event. It’s not like every read
>>>>> request issues a new event. Given that the request latency here is
>>>>> dominated by cloud storage calls, do we really care about one extra call to
>>>>> Postgres? Personally, I’d skip the extra complexity of a buffer of any kind
>>>>> and just write straight to the persistence store.
>>>>> 
>>>>> Mike
>>>>> 
>>>>> On Tue, May 20, 2025 at 9:31 AM Yufei Gu <flyrain...@gmail.com> wrote:
>>>>> 
>>>>>> Looks awesome. Thanks for taking the lead! It makes sense to use a
>>>>>> JDBC-backed persistence layer, shared or separate. The optional retention
>>>>>> period is a nice safeguard.
>>>>>> I don’t see any blockers on my side. If no one raises major concerns this
>>>>>> week, please go ahead and start the implementation. Exciting to see this
>>>>>> coming together!
>>>>>> 
>>>>>> Yufei
>>>>>> 
>>>>>> 
>>>>>> On Tue, May 13, 2025 at 6:37 PM Adnan Hemani
>>>>>> <adnan.hem...@snowflake.com.invalid> wrote:
>>>>>> 
>>>>>>> Hi all,
>>>>>>> 
>>>>>>> I am raising a proposal to implement the proposed Iceberg REST
>>>>>>> specification for the Events API (doc
>>>>>>> <https://docs.google.com/document/d/1WtIsNGVX75-_MsQIOJhXLAWg6IbplV4-DkLllQEiFT8/edit?pli=1&tab=t.0>,
>>>>>>> GH <https://github.com/apache/iceberg/pull/12584/files>). It is my
>>>>>>> understanding that this proposal is close and that we will be required to
>>>>>>> implement something very close to the current proposal in the near future.
>>>>>>> 
>>>>>>> If Polaris is to implement this API, it will likely need to be through a
>>>>>>> Persistence instance that the Polaris server can query instantly, as this
>>>>>>> API will not be asynchronous. Please note, this proposal is not to comment
>>>>>>> on what events we may emit today or in the future - the scope of this
>>>>>>> proposal is solely to discuss how we plan to implement the proposed Events
>>>>>>> API.
>>>>>>> 
>>>>>>> Changes to be made:
>>>>>>> 
>>>>>>> Implement Event storage through the Polaris Persistence layer
>>>>>>> 
>>>>>>> We will store events in a persistence instance of the user’s choice -
>>>>>>> whether they would like the events to be part of the same persistence
>>>>>>> instance as their Polaris metadata or would like a separate persistence
>>>>>>> instance. Users will provide the persistence instance by configuring a JDBC
>>>>>>> string on Polaris startup, similarly to the JDBC string we receive
>>>>>>> currently from users for the Polaris metadata.
>>>>>>> 
>>>>>>> For concerns regarding scaling of events in the Polaris persistence layer,
>>>>>>> we can also implement a recommended, optional parameter for an events
>>>>>>> retention period after which Polaris will asynchronously delete records
>>>>>>> older than that time period.
>>>>>>> 
>>>>>>> How to Implement Writes to the Polaris Persistence layer
>>>>>>> 
>>>>>>> The way to implement the above proposal would be through implementation of
>>>>>>> the `PolarisEventListener`
>>>>>>> <https://github.com/apache/polaris/blob/main/service/common/src/main/java/org/apache/polaris/service/events/PolarisEventListener.java>
>>>>>>> abstract class. In this implementation, I believe it should not be
>>>>>>> controversial to state that we cannot block on events to be flushed to
>>>>>>> persistence due to latency concerns - and as a result, we have two options:
>>>>>>> 1) a simple in-memory buffer or 2) a file-based buffer. Both buffers would
>>>>>>> flush after a certain amount of time after the first non-flushed event is
>>>>>>> written. While option 2 offers a better event durability guarantee in case
>>>>>>> of disaster recovery, it will come at the cost of additional latency to
>>>>>>> write to the filesystem. If there are no security concerns regarding
>>>>>>> writing to the filesystem, I believe this is the recommended way to
>>>>>>> implement - the additional latency to write to filesystem should not add
>>>>>>> unreasonable overhead given the right implementation with open filewriters.
>>>>>>> If writing to the filesystem is not recommended, I’m not sure there is any
>>>>>>> other way to achieve guaranteed event durability. In both options we can
>>>>>>> only achieve eventual consistency - to get strong consistency, we will need
>>>>>>> to implement a way to block the API call until we flush the events to
>>>>>>> persistence, which I cannot recommend at this time due to latency concerns.
>>>>>>> 
>>>>>>> Please reply to this thread if there are any questions and/or concerns on
>>>>>>> this proposal. If there are no major concerns within a week, then I will
>>>>>>> begin implementation.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Adnan Hemani
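
The two write strategies discussed in this thread (a blocking write on every event versus a buffer that flushes some time after the first non-flushed event) could plausibly sit behind one interface. The Java sketch below is only an illustration under stated assumptions: `EventSink`, `EventStore`, `AlwaysWriteSink`, and `BufferedSink` are hypothetical names, not the actual `PolarisEventListener` API, and the buffer is the in-memory variant (option 1) rather than a file-based one. For simplicity it checks the buffer's age only when a new event arrives; a real implementation would likely also flush from a background timer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical stand-in for the persistence layer.
interface EventStore {
    void writeAll(List<String> events);
}

// Hypothetical sink interface; not the real PolarisEventListener API.
interface EventSink {
    void accept(String event);
    void flush();
}

// Blocking strategy: each event reaches persistence before accept() returns.
final class AlwaysWriteSink implements EventSink {
    private final EventStore store;

    AlwaysWriteSink(EventStore store) { this.store = store; }

    @Override public void accept(String event) { store.writeAll(List.of(event)); }

    @Override public void flush() { /* nothing is ever buffered */ }
}

// Buffered strategy: events wait in memory and are flushed once the oldest
// unflushed event is at least maxAgeMillis old (checked on each accept()).
final class BufferedSink implements EventSink {
    private final EventStore store;
    private final long maxAgeMillis;
    private final LongSupplier clock; // injectable for tests
    private final List<String> buffer = new ArrayList<>();
    private long firstUnflushedAt;

    BufferedSink(EventStore store, long maxAgeMillis, LongSupplier clock) {
        this.store = store;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    @Override public synchronized void accept(String event) {
        long now = clock.getAsLong();
        if (buffer.isEmpty()) {
            firstUnflushedAt = now; // start the age window
        }
        buffer.add(event);
        if (now - firstUnflushedAt >= maxAgeMillis) {
            flush();
        }
    }

    @Override public synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        store.writeAll(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

With this shape, a deployment would pick a strategy by configuration, for example `new AlwaysWriteSink(store)` or `new BufferedSink(store, 5_000, System::currentTimeMillis)`, without the calling code changing. Note that events still in the in-memory buffer are lost on a crash, which is the durability trade-off raised in the original proposal.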
