> filtering all the reads to find the valuable changes is a
> headache I would guess most service owners would rather avoid.

The Iceberg Events API would let users specify filters, which I think would help filter out read events quite easily (with a single filter predicate). But once we start putting that functionality in place, we can revisit this and make sure this is the case.

Best,
Adnan Hemani

> On May 22, 2025, at 8:33 AM, Michael Collado <collado.m...@gmail.com> wrote:
>
> I was going to make the same suggestion :) For deployments where the
> complexity of an extra buffer doesn’t make sense, a blocking persistence
> call will make things easier for the services owner. If read events are
> added, a high throughput service will need the buffering probably. Though,
> I would be surprised if those events end up being served to callers for the
> /events API. For read events, I think something like OpenLineage event
> reporting is a better option and most callers will want to know about
> modifications, so filtering all the reads to find the valuable changes is a
> headache I would guess most service owners would rather avoid.
>
> Mike
>
> On Thu, May 22, 2025 at 12:27 AM Adnan Hemani
> <adnan.hem...@snowflake.com.invalid
> <mailto:adnan.hem...@snowflake.com.invalid>> wrote:
>
>> I haven’t thought of this in depth honestly - but I could see this being
>> the case.
>>
>> -Adnan
>>
>>> On May 21, 2025, at 4:39 PM, Eric Maynard <eric.w.mayn...@gmail.com> wrote:
>>>
>>> -devlist
>>>
>>> If we design by interface properly, it should be relatively easy to offer
>>> both a disk buffering and an always-write implementation right?
>>>
>>> On Thu, May 22, 2025 at 12:12 AM Adnan Hemani
>>> <adnan.hem...@snowflake.com.invalid> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Thanks for sharing these thoughts. I’m also not completely sure about how
>>>> much we should care about how much slower things will be if we just make a
>>>> trip to the persistence on every write action.
>>>> However, I’m building this
>>>> feature with the intention of being able to also support read event types
>>>> in the near future, if this is something that the customer is interested in
>>>> enabling using the `CustomOperation` type that is defined in the Events API
>>>> spec. Of course, this would need to be configured by the administrator, as
>>>> maintenance of the persistence is their responsibility.
>>>>
>>>> Given that the Iceberg Events API spec has not yet merged and can still
>>>> see some changes, I’m planning to begin work on the disk buffering now and
>>>> wait for the Events API to finalize before working on the API side of the
>>>> end-to-end implementation.
>>>>
>>>> Best,
>>>> Adnan Hemani
>>>>
>>>>> On May 20, 2025, at 10:30 PM, Michael Collado <collado.m...@gmail.com> wrote:
>>>>>
>>>>> That’s super interesting. Glad this is being worked on. Personally, I don’t
>>>>> know that the latency for writing events to a persistent storage is really
>>>>> all that concerning. Looking at the enum of supported operations, only
>>>>> write operations seem to trigger the event. It’s not like every read
>>>>> request issues a new event. Given that the request latency here is
>>>>> dominated by cloud storage calls, do we really care about one extra call to
>>>>> Postgres? Personally, I’d skip the extra complexity of a buffer of any kind
>>>>> and just write straight to the persistence store.
>>>>>
>>>>> Mike
>>>>>
>>>>> On Tue, May 20, 2025 at 9:31 AM Yufei Gu <flyrain...@gmail.com
>>>>> <mailto:flyrain...@gmail.com>> wrote:
>>>>>
>>>>>> Looks awesome. Thanks for taking the lead! It makes sense to use a
>>>>>> JDBC-backed persistence layer, shared or separate. The optional retention
>>>>>> period is a nice safeguard.
>>>>>> I don’t see any blockers on my side. If no one raises major concerns this
>>>>>> week, please go ahead and start the implementation.
>>>>>> Exciting to see this
>>>>>> coming together!
>>>>>>
>>>>>> Yufei
>>>>>>
>>>>>>
>>>>>> On Tue, May 13, 2025 at 6:37 PM Adnan Hemani
>>>>>> <adnan.hem...@snowflake.com.invalid> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I am raising a proposal to implement the proposed Iceberg REST
>>>>>>> specification for the Events API (doc
>>>>>>> <https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.google.com/url?q%253Dhttps://docs.google.com/document/d/1WtIsNGVX75-_MsQIOJhXLAWg6IbplV4-DkLllQEiFT8/edit?pli%25253D1%252526tab%25253Dt.0%2526source%253Dgmail-imap%2526ust%253D1748410327000000%2526usg%253DAOvVaw0UNQYKbXoQ2YHVM7J0kB3l%26source%3Dgmail-imap%26ust%3D1748475613000000%26usg%3DAOvVaw38VwFquueiVwCdlo_xR2rB&source=gmail-imap&ust=1748532906000000&usg=AOvVaw2oeKLtSdg6y9EOJaXM4iZA>,
>>>>>>> GH
>>>>>>> <https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.google.com/url?q%253Dhttps://github.com/apache/iceberg/pull/12584/files%2526source%253Dgmail-imap%2526ust%253D1748410327000000%2526usg%253DAOvVaw1AvwLK402voAm_j6zy25Mn%26source%3Dgmail-imap%26ust%3D1748475613000000%26usg%3DAOvVaw2IwxUir8jCegcKC47zv1Si&source=gmail-imap&ust=1748532906000000&usg=AOvVaw2cZJKVs35WQm3i7TgmeRqj>).
>>>>>>> It is my
>>>>>>> understanding that this proposal is close and that we will be required to
>>>>>>> implement something very close to the current proposal in the near future.
>>>>>>>
>>>>>>> If Polaris is to implement this API, it will likely need to be through a
>>>>>>> Persistence instance that the Polaris server can query instantly, as this
>>>>>>> API will not be asynchronous. Please note, this proposal is not to comment
>>>>>>> on what events we may emit today or in the future - the scope of this
>>>>>>> proposal is solely to discuss how we plan to implement the proposed Events
>>>>>>> API.
>>>>>>>
>>>>>>> Changes to be made:
>>>>>>>
>>>>>>> Implement Event storage through the Polaris Persistence layer
>>>>>>>
>>>>>>> We will store events in a persistence instance of user’s choice - whether
>>>>>>> they would like the events to be part of the same persistence instance as
>>>>>>> their Polaris metadata or if they would like for a separate persistence
>>>>>>> instance. Users will provide the persistence instance by configuring a JDBC
>>>>>>> string on Polaris startup, similarly to the JDBC string we receive
>>>>>>> currently from users for the Polaris metadata.
>>>>>>>
>>>>>>> For concerns regarding scaling of events in the Polaris persistence layer,
>>>>>>> we can also implement a recommended, optional parameter for an events
>>>>>>> retention period after which Polaris will asynchronously delete records
>>>>>>> older than that time period.
>>>>>>>
>>>>>>> How to Implement Writes to the Polaris Persistence layer
>>>>>>>
>>>>>>> The way to implement the above proposal would be through implementation of
>>>>>>> the `PolarisEventListener`
>>>>>>> <https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.google.com/url?q%253Dhttps://github.com/apache/polaris/blob/main/service/common/src/main/java/org/apache/polaris/service/events/PolarisEventListener.java%2526source%253Dgmail-imap%2526ust%253D1748410327000000%2526usg%253DAOvVaw0Z-SY-d50YHPNK38KxhHVk%26source%3Dgmail-imap%26ust%3D1748475613000000%26usg%3DAOvVaw0AVoXWSZgL3qK8GPbwYG9G&source=gmail-imap&ust=1748532906000000&usg=AOvVaw1SOY8j3eq7jubWt1n6iaml>
>>>>>>> abstract class. In this implementation, I believe it should not be
>>>>>>> controversial to state that we cannot block on events to be flushed to
>>>>>>> persistence due to latency concerns - and as a result, we have two options:
>>>>>>> 1) a simple in-memory buffer or 2) a file-based buffer.
>>>>>>> Both buffers would
>>>>>>> flush after a certain amount of time after the first non-flushed event is
>>>>>>> written. While option 2 offers a better event durability guarantee in case
>>>>>>> of disaster recovery, it will come at the cost of additional latency to
>>>>>>> write to the filesystem. If there are no security concerns regarding
>>>>>>> writing to the filesystem, I believe this is the recommended way to
>>>>>>> implement - the additional latency to write to filesystem should not add
>>>>>>> unreasonable overhead given the right implementation with open filewriters.
>>>>>>> If writing to the filesystem is not recommended, I’m not sure there is any
>>>>>>> other way to achieve guaranteed event durability. In both options we can
>>>>>>> only achieve eventual consistency - to get strong consistency, we will need
>>>>>>> to implement a way to block the API call until we flush the events to
>>>>>>> persistence, which I cannot recommend at this time due to latency concerns.
>>>>>>>
>>>>>>> Please reply to this thread if there are any questions and/or concerns on
>>>>>>> this proposal. If there are no major concerns within a week, then I will
>>>>>>> begin implementation.
>>>>>>>
>>>>>>> Best,
>>>>>>> Adnan Hemani
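
For reference, the "design by interface" idea raised in this thread (one write path that blocks on persistence, one that buffers and flushes a fixed time after the first non-flushed event) could be sketched roughly as below. This is only an illustration under stated assumptions: `EventSink`, `AlwaysWriteSink`, `BufferedSink` and `EventSinkDemo` are hypothetical names and not part of Polaris or of `PolarisEventListener`, events are plain strings, and the persistence store is stubbed as a `Consumer` rather than a JDBC-backed instance. It shows the in-memory option only; a file-based buffer would need its own implementation of the same interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical write path for events; not an existing Polaris interface. */
interface EventSink extends AutoCloseable {
    void record(String event);

    @Override
    void close();
}

/** Always-write: the caller blocks until the event has reached persistence. */
final class AlwaysWriteSink implements EventSink {
    private final Consumer<List<String>> persistence; // stand-in for the real store

    AlwaysWriteSink(Consumer<List<String>> persistence) {
        this.persistence = persistence;
    }

    @Override
    public void record(String event) {
        persistence.accept(List.of(event));
    }

    @Override
    public void close() {}
}

/** In-memory buffer: flushes a fixed delay after the first unflushed event. */
final class BufferedSink implements EventSink {
    private final Consumer<List<String>> persistence;
    private final long flushDelayMillis;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private List<String> buffer = new ArrayList<>();

    BufferedSink(Consumer<List<String>> persistence, long flushDelayMillis) {
        this.persistence = persistence;
        this.flushDelayMillis = flushDelayMillis;
    }

    @Override
    public synchronized void record(String event) {
        if (buffer.isEmpty()) {
            // First unflushed event starts the flush timer.
            scheduler.schedule(this::flush, flushDelayMillis, TimeUnit.MILLISECONDS);
        }
        buffer.add(event);
    }

    void flush() {
        List<String> batch;
        synchronized (this) {
            if (buffer.isEmpty()) {
                return;
            }
            batch = buffer;
            buffer = new ArrayList<>();
        }
        // Write outside the lock so record() is not blocked by persistence latency.
        persistence.accept(batch);
    }

    @Override
    public void close() {
        scheduler.shutdownNow(); // cancel any pending timer
        flush();                 // persist whatever is still buffered
    }
}

final class EventSinkDemo {
    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        try (EventSink sink = new BufferedSink(store::addAll, 60_000)) {
            sink.record("table-created");
            sink.record("table-updated");
        }
        System.out.println(store); // both events were flushed by close()
    }
}
```

Note that events held in `BufferedSink` are lost if the process dies before a flush, which is the durability gap the file-based option in the proposal is meant to close.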