Hi All, WDYT about another sync call next week?
Thanks, Dmitri. On Wed, May 6, 2026 at 5:29 PM Dmitri Bourlatchkov <[email protected]> wrote: > Hi EJ, > > Thanks for the summary! It covers what we discussed in the meeting very > well, IMHO. > > Looking forward to concrete PRs :) > > Cheers, > Dmitri. > > On Wed, May 6, 2026 at 5:08 PM EJ Wang <[email protected]> > wrote: > >> Hi folks, >> >> We had a community sync earlier, thanks JB for scheduling it. Notes from >> the first metrics architecture sync (May 6, 10-11am PT). Discussion doc >> with per-section status: >> >> https://docs.google.com/document/d/100h7c4damrUzVuquYbBHM0EvA4LSWuW2IT2dN_7nYVA/edit?tab=t.0 >> >> *The meeting covered both topics from the doc. Direction-level alignment >> was reached on the headline pieces; details remain for PR review or >> follow-up sessions.* >> >> *Topic 1 — Persistence schema redesign* >> Idea-level alignment on consolidating per-type tables >> (scan_metrics_report, >> commit_metrics_report) into a single metrics_report table. The motivating >> cost is the surface area added by every new metric type today: new table, >> SPI method, record class, model, converter, schema migration. >> >> Most schema details are deferred to the schema PR. A few specific points >> came up: >> • metric_schema_version: Yufei prefers dropping it, since there is no >> spec-level concept of metrics versioning today and it is hard to define >> unilaterally. Robert prefers keeping it, given IRC v2 is coming and the >> schema should be considered against its likely shape; Robert also raised >> how to differentiate various payload formats if any. EJ's read is that >> this >> is a two-way-door decision. We can start without the field, and if IRC v2 >> changes the shape we would likely roll a corresponding new schema anyway, >> which is not particularly costly. >> • Payload format: Robert pointed out that future formats beyond JSON may >> be worth supporting. The exact shape is deferred to the schema discussion. 
>> • Partition strategy: Anand suggested monthly partitioning based on his >> experience as potentially helpful at scale. >> >> *Topic 2 — Where metrics ingestion and storage belong* >> Idea-level alignment that metrics should be a separated SPI from the >> entity >> persistence stack. Two reasons surfaced: (a) workloads and capability >> requirements diverge enough that coupling them creates artificial >> constraints, and (b) admin experience improves when metrics has its own >> bootstrap, retention, and lifecycle. Dmitri noted Polaris being a platform >> should have the flexibility to support different persistence backends per >> concern, and pointed to a concrete next step of separating the JDBC >> bootstrap for metrics from the metastore bootstrap. Robert proposed an >> additional UX extension: detect an unbootstrapped metrics store on first >> use and auto-bootstrap rather than requiring an explicit manual bootstrap >> step. >> The meeting also confirmed that Polaris metrics can start small and stay >> Iceberg-focused. Naming and persistence schema can lean Iceberg-specific. >> If a future expansion to generic-table metrics or operational metrics >> arrives, an abstraction layer can be built on top of the Iceberg metrics >> reporter at that point. Robert remains on the fence and would prefer >> something more generic but did not block the direction; Dmitri's read was >> that the proposed framework already has enough flexibility to absorb >> future >> expansion. >> >> The Trade-offs and Proposed structure sections in the doc were not >> reviewed >> in detail. They remain open for either the next sync or PR review. >> >> *Cross-cutting alignment — battery-included plus pluggable* >> A common philosophy emerged from the discussion. EJ summarized it as: >> Polaris should provide a battery-included UX for beginners and the >> flexibility for advanced users to swap the included battery for something >> more powerful or tailored to their use case. 
The SPI design needs to >> enable >> both. >> >> The inputs that shaped this framing: >> • Anand described how his team uses the current metrics persistence >> (three metrics consumers in v1.4). >> • Yufei raised Grafana and dashboard integrations as a destination use >> case beyond the default. >> • Robert called out that the current design is more JDBC-focused. >> >> Two concrete instances: >> • Async metrics intake: Yufei's initial position was that async should >> largely live on the producer side and there is not much Polaris can do. >> Robert suggested a Polaris-side default is doable via Vert.x. Dmitri >> agreed >> the direction is worth exploring. The meeting converged on a >> battery-included default (likely Vert.x-backed) with an SPI shape that >> lets >> power users route to a more scalable backend (k8s-hosted queue, AWS SQS, >> etc.). >> • Pluggable destinations: combining Yufei's dashboard use case with >> Robert's JDBC-focused call-out, the meeting agreed the SPI should be >> structured for multiple sinks so integrations become impl choices rather >> than architectural changes. >> >> The battery-included default is most likely to use the existing >> JDBC-backed >> approach. 
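A minimal sketch of the "multiple sinks" shape described above, for illustration only. All names here (MetricsSink, FanOutMetricsSink, InMemoryMetricsSink, MetricsRecord) are hypothetical and are not existing Polaris APIs. The in-memory sink stands in for the JDBC-backed default, and async intake (Vert.x or an external queue) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Entry point: wires two sinks behind one fan-out and sends a single report. */
class MetricsSinkSketch {
  public static void main(String[] args) {
    InMemoryMetricsSink defaultStore = new InMemoryMetricsSink();
    InMemoryMetricsSink dashboardFeed = new InMemoryMetricsSink();
    MetricsSink sink = new FanOutMetricsSink(List.of(defaultStore, dashboardFeed));

    sink.accept(new MetricsRecord("SCAN", "ns.tbl", "{\"resultDataFiles\": 3}"));

    // Both destinations received the same report; prints "1 1".
    System.out.println(defaultStore.stored.size() + " " + dashboardFeed.stored.size());
  }
}

/** Hypothetical report envelope; field names are illustrative, not the agreed schema. */
final class MetricsRecord {
  final String metricType;      // e.g. "SCAN" or "COMMIT"
  final String tableIdentifier;
  final String payloadJson;     // metric-specific data, kept opaque here

  MetricsRecord(String metricType, String tableIdentifier, String payloadJson) {
    this.metricType = metricType;
    this.tableIdentifier = tableIdentifier;
    this.payloadJson = payloadJson;
  }
}

/** Hypothetical sink SPI: one implementation per destination. */
interface MetricsSink {
  void accept(MetricsRecord report);
}

/** Stand-in for the battery-included default; a real one would persist via JDBC. */
final class InMemoryMetricsSink implements MetricsSink {
  final List<MetricsRecord> stored = new ArrayList<>();

  @Override
  public void accept(MetricsRecord report) {
    stored.add(report);
  }
}

/** Forwards each report to every configured sink, isolating per-sink failures. */
final class FanOutMetricsSink implements MetricsSink {
  private final List<MetricsSink> sinks;

  FanOutMetricsSink(List<MetricsSink> sinks) {
    this.sinks = List.copyOf(sinks);
  }

  @Override
  public void accept(MetricsRecord report) {
    for (MetricsSink sink : sinks) {
      try {
        sink.accept(report);
      } catch (RuntimeException e) {
        // One failing destination should not stop delivery to the others.
        System.err.println("metrics sink failed: " + e.getMessage());
      }
    }
  }
}
```

With a shape like this, adding a dashboard or queue destination is a new MetricsSink implementation rather than a change to the core interface, which is the "impl choice rather than architectural change" point from the notes.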
>> >> *Direction (idea-level alignment)* >> • Single metrics_report table consolidating per-type metrics, replacing >> scan_metrics_report and commit_metrics_report >> • Iceberg-focused naming and schema for now, revisit if generic-table or >> operational metrics arrive >> • Metrics persistence as a separated SPI, not on BasePersistence >> • Bootstrap path separated for metrics, independent of metastore >> bootstrap >> • "Battery-included plus pluggable" as the SPI design philosophy >> >> *Open items* >> • Schema details: metric_schema_version, payload format, IRC v2 >> forward-compat shape >> • SPI design details — full review either in the next sync or in the >> corresponding PR >> • Schema refactor PR ownership >> >> *Action items* >> • EJ to take a first stab at the SPI design and potentially partner with >> Anand to incorporate the lessons learned from the existing reporter and >> persistence work. >> • Schema refactor PR ownership is not yet decided. If anyone is >> interested in driving it, reply on this thread. >> • JB to schedule the next sync, tentatively in two weeks. >> >> -ej >> >> On Mon, Apr 27, 2026 at 3:07 PM EJ Wang <[email protected]> >> wrote: >> >> > Thanks Yufei for the +1. >> > >> > JB, could you help add a biweekly metrics architecture sync to the >> Polaris >> > community calendar? I'm thinking Thursdays at 9-10am PT, on the >> off-weeks >> > from the community meeting (starting May 7), 60 minutes. >> > >> > Here's a rough agenda to work through over the first few sessions, >> grouped >> > by priority: >> > >> > *First: foundational direction* >> > >> > 1. MetricsPersistence: public SPI or internal implementation detail? >> > • Marked @Beta, javadoc calls it a "Service Provider Interface", >> but >> > only one consumer (JdbcBasePersistenceImpl), lives on BasePersistence. 
>> If >> > demoted to a private helper inside a persisting reporter impl, most >> > downstream design decisions become implementation details rather than >> > contract questions. >> > >> > 2. Persistence schema redesign >> > • Current two-table layout (scan_metrics_report, >> > commit_metrics_report) with ~25 flattened columns each. Every new metric >> > type requires a new table, SPI method, record class, model, converter, >> and >> > schema migration. Direction to explore: single table with metric_type >> enum, >> > schema_version, and JSON payload column. >> > >> > *Second: design details once direction is set* >> > >> > 3. Partition key strategy >> > • Single-table design means scan metrics at scale will have high >> > write concurrency per table. Schema needs to expose enough structure for >> > backends to shard by entity or time range. >> > >> > 4. Read/write path consistency >> > • Writes go through PolarisMetricsManager on MetaStoreManager. >> Reads >> > bypass MetaStoreManager and go straight to BasePersistence, excluding >> > non-JDBC backends from the read API. >> > >> > *Third: cleanup and alignment* >> > >> > 5. PolarisMetricsReporter naming >> > • Only handles IRC (ScanReport/CommitReport), doesn't cover generic >> > tables or operational metrics. Name is broader than scope. >> > >> > 6. PolarisMetricsManager facade passthrough >> > • Entire default method is >> callCtx.getMetaStore().writeScanReport(). >> > Zero logic, passes Level 1 straight through to Level 3. Same >> anti-pattern >> > as PolarisEventManager. >> > >> > 7. Iceberg community alignment >> > • Payload-type extension needs discussion on dev@iceberg. >> obelix74's >> > Feb thread got zero replies. Needs a committer voice. >> > >> > Lets confirm prioritization in the first session. >> > >> > -ej >> > >> > On Tue, Apr 21, 2026 at 3:18 PM Yufei Gu <[email protected]> wrote: >> > >> >> Thanks everyone for continuing to drive this forward. 
I agree that the >> >> problem is getting complex enough that a more structured discussion >> would >> >> help. >> >> >> >> +1 on setting up a biweekly sync for the metrics architecture. I’m >> happy >> >> to >> >> join. >> >> >> >> Yufei >> >> >> >> >> >> On Tue, Apr 21, 2026 at 2:34 PM EJ Wang < >> [email protected]> >> >> wrote: >> >> >> >> > Also, I've been looking more closely at the *persistence schema in >> the >> >> > current metrics work*, and I think there's a structural rigidity >> problem >> >> > worth raising before the shape gets locked in. >> >> > >> >> > Right now we have two separate tables (scan_metrics_report and >> >> > commit_metrics_report), each with ~25 flattened columns that directly >> >> > mirror the Iceberg report fields. The SPI follows the same split: >> >> > writeScanReport and writeCommitReport as separate methods, with >> per-type >> >> > record classes, converters, and model objects. *The practical cost: >> >> > adding a new metric type (operational metrics, for example) requires >> a >> >> new >> >> > table, a new SPI method, a new record class, a new model class, a new >> >> > converter branch, and a schema migration*. That's a lot of surface >> area >> >> > for what should be "one more kind of metric." >> >> > >> >> > *My bias* would be toward a single metrics table with *a typed JSON >> >> > payload*. Something like: metric_type (enum), entity_id, >> >> > table_identifier, snapshot_id (nullable), received_ts, >> schema_version, >> >> and >> >> > a payload column for the metric-specific data. The metric_type + >> >> > schema_version pair gives us a forward-compatible contract for the >> >> payload >> >> > shape. Adding a new metric type becomes an enum value and a payload >> >> schema, >> >> > not a schema migration. One thing I think we need to be deliberate >> >> about is >> >> > the partition key design. 
If all metric types land in one table, scan >> >> > metrics at scale (high concurrency, high frequency across many >> tables) >> >> > could easily create hot partitions. We'd want the persistence layer >> to >> >> be >> >> > able to shard by entity or time range, and that means the logical >> schema >> >> > needs to expose enough structure for backends to partition on. I >> don't >> >> > think the current flattened layout gives us that. >> >> > >> >> > This is getting complex enough that I don't think ad-hoc PR/ML >> threads >> >> > will converge well. *Would people be open to a biweekly sync for >> metrics >> >> > architecture?* I think 30 minutes every two weeks with interested >> >> parties >> >> > would be enough to work through the schema, SPI shape, and read API >> >> design >> >> > together. Happy to help set that up. >> >> > >> >> > -ej >> >> > >> >> > On Mon, Apr 20, 2026 at 2:19 PM EJ Wang < >> [email protected] >> >> > >> >> > wrote: >> >> > >> >> >> Reviewed #4115, left a comment on the code organization side. >> >> >> >> >> >> One thing stood out: the metrics write path enters through >> >> >> PolarisMetricsManager on MetaStoreManager, but the new read path >> >> bypasses >> >> >> MetaStoreManager entirely and goes straight to BasePersistence via >> >> >> callContext.getMetaStore(). That means the read API only works for >> >> backends >> >> >> that implement BasePersistence. NoSQL and remote backends can't >> >> participate. >> >> >> >> >> >> Stepping back, I think the metrics subsystem is growing into >> something >> >> >> real (write + read + REST API + AuthZ + pagination) *but the >> >> persistence >> >> >> side is split across two layers in a way that's hard to extend*. I >> put >> >> >> together two diagrams to show what I mean (my best effort). >> >> >> >> >> >> *Current state* (Diagram 1): three interfaces at three different >> >> levels. >> >> >> The engine-facing SPI (PolarisMetricsReporter) is clean. 
But >> >> >> PolarisMetricsManager on MetaStoreManager is a passthrough to >> >> >> MetricsPersistence on BasePersistence. The @Beta annotation and SPI >> >> javadoc >> >> >> are on the BasePersistence layer, while the actual extension points >> >> >> (PolarisMetricsReporter, PolarisMetricsManager) carry no stability >> >> >> annotation. The write path goes through the MetaStoreManager layer, >> the >> >> >> read path doesn't. >> >> >> >> >> >> *What I envision* (Diagram 2): two SPIs at two levels. >> >> >> PolarisMetricsReporter stays as the engine-facing SPI. >> >> >> PolarisMetricsManager becomes the backend-facing SPI with both write >> >> and >> >> >> read methods at the MetaStoreManager level, where any backend (JDBC, >> >> NoSQL, >> >> >> remote) can implement them. MetricsPersistence on BasePersistence >> goes >> >> >> away. Where metrics actually land is an implementation detail, not a >> >> core >> >> >> interface. >> >> >> >> >> >> *Minor naming thing*: PolarisMetricsReporter is broader than what it >> >> >> actually handles. It only accepts Iceberg REST Catalog metrics >> >> (ScanReport, >> >> >> CommitReport via MetricsReport). Generic table metrics or >> operational >> >> >> metrics aren't in scope. Not blocking, but worth noting if the >> metrics >> >> >> surface expands. >> >> >> >> >> >> *Rough sketch of how to get there*: >> >> >> 1. Add read methods to PolarisMetricsManager (listScanReports, >> >> >> listCommitReports) with default no-op, same as the existing write >> >> methods. >> >> >> (Probably make PolarisMetricsManager more explicit on being Iceberg >> >> >> specific like package name or class name etc.) >> >> >> 2. Wire MetricsReportsService through MetaStoreManager instead of >> >> >> callContext.getMetaStore(). >> >> >> 3. Extract metrics persistence from JdbcBasePersistenceImpl into >> its >> >> >> own class. That file carries ~7 responsibilities, metrics being one >> of >> >> them. >> >> >> 4. 
Remove MetricsPersistence from BasePersistence. >> >> >> >> >> >> *None of this needs to happen in #4115. But if the direction makes >> >> sense, >> >> >> it would be good to align before the metrics surface grows further. >> >> Curious >> >> >> what others think.* >> >> >> >> >> >> *My mental model note*: Level 1 MetaStoreManager; level 2 >> transactional >> >> >> persistence; level 3 base persistence >> >> >> >> >> >> Diagram 1 >> >> >> < >> >> >> https://www.plantuml.com/plantuml/uml/bLHDR-Cs4BthLmpIYupw0zbkKQ1r3M-S7Bp8xhhM7WCOb3IM65EaGD9EX2RzxHrHb4CxRelwa4YSDu_lpOVcnZ9jzvM8BBS2uGjQpJC3dtHMSekPtMk44IpsMgEqa5XcCOhCZikQQLP1pR8TAp2n3ILhmZDP20m0fcIvUkAoW2qJXd9z1bpToO9BX3WXu0ucy5rpgGPNm0nW5_epUWtm2Ue3pn3kMOFQmKntGZW0BYtgBSi8k5A2QMwybJNMIbFiGSR9QZc4nUqIvikStF0jHprua5C-amge42aNt3R0f5JaaoivdV2Pkqbx4hee4ymOkBh5BTiB-_uIeGeo8zL8rPsPl4DktdEiK1jkB1NdZCRbrSTecDe_mlHbF0wvBmCkaOH5_S8a_TTTKI6-nmCAkEw4LpxsZ-LbYLKQFKMNOgf_wuM7_bV9gOer5SYMMksBSWXFcbi49KNZXNLicwfe3TETC7gPdPqI7uBcHMb1RSzYq34c6PDUM9mn8HRsUTZEiDBve3NjVZumBj0U7SS37mGO7vcwtiK-_pU7U7L_f-digo9YbhSwIfMRwIITKGXbxdIUTCGF1SeCJxloKsU-3k9ddRbX1eDq1q_fx1JbBGT0glVyXimDuP4TQ5qpCAmnGEj2s_6n5mtn1z-97-63itFQZLPO1Ev2tu_WF7Ju-VPc0Skg5bYXxBhkY1xpD7EM_7fyflSpIsqMgVth5xhVr4eQxWQ8enaSAJQSG16yFSDuJ798rrcXr_3n-lfdk7icQjEBmFujL7AodiP_Y4Z7-YxvtZNs4zMgpNTl6tF8sglyPsmqchrjvQ-m-aP94r-TwCA2Ka8upPJZwtvSpoYCXkYMZU2NXvRMBfq9P3i3Le4VAZUAlUZ_oPKsxPgY0Q_BSKLkyr9bhQhQrJjo_x3TPlIB0DPjnMfcIoYP0QaYw1a0fTKDr8fB6ntNuvmoL1ZGkXa69Njh43zf9GiGxHQrA_jDYWRSzF5--WmTVrN97_Sm8LbLUy_lGBmLanJjFkDlGkRqjA_4tm00 >> >> > >> >> >> : >> >> >> >> >> >> [image: image.png] >> >> >> >> >> >> Diagram 2 >> >> >> < >> >> >> 
https://www.plantuml.com/plantuml/uml/VLLDR-8m4BtdLupO2sWBLVU8AaGB7AXAbssGzb896SS9RXqxjKqBwkv_tt7iV43fdaZYDpFlpRmnOsE9jhjSH9PRmM31hERKm8scMsuPjJlDe0yheZDc8RR4iYWoBrmMH9CS2a9VICPYUy1OZN0YCy5Q0BCbYNhdCeEK28En8G8wCvbnoQ0R8_05Bc6bkLIz3X03p1zzH7zR-9ZfDquPt9C3qoNCX2yV4G2NbkcKu5jdgGJHt0GbZwnG6i-UP3TUpk5gM6Ldqke350eZUqzoCft3U9xWHvxoa5-7K4nF1J46EbEMafsmdrCBbQ44gVggy18IZrn_ph5asd1ZiIKdQSgueZvjXrQFSFrdC3YN-nXmBacxbGiYyLVxLaBtdhqn0LSzdBDhqQtQoOJeGyad3z0lUqnYgpGB6Ns8oVyta00Dy_WnX0tIOZ8v6SYxHll1TrH6aejAik-mh-AphVFCwSUQqFypElag5QRGFDjQKEd96K1P8QP41c9TzA_IIQyvdAWyv_RSiS3skb0_EzDDkK2v5xWF6MiGFlvhpFLcD2Dq2pml14gaF67eQkmd8gulDoC4kSOu6KVpkvlUJg1RTbWISU40RdBUUS_9XfRZ2dwxm_SW8LYFISgm_MnlDQ6M9P1gbKEc4X-2pH_FvJCkCqm9pbVjD6LrwdLeOrDWfOaqc8Wh9BE85oNKxkNQ6o4yGRy_Eae0G_G8tZv81d3bHDB23WOdisohVr3nh_j6lbSjbNaLRTc8UgtPbAU1J_tygOfZX9DWEJeHDvYx-qmSi5FgNLPZwHrHcUsncGQ5-skhUclpE5fo4ounpFauYrUbkU6ccfnxMvitwag4IyerhTxj8In_Oj1bDO4pQru674loYrGlULHLEGCjwJJ8gDoVZR8MxO4BT3IzRvIcAQKezC6xpziGnTyImrfEGyJI_OcKfgtxIvnTqFEMS17L9Z-jsARN5FmTheP7HtSdtOMT0B4GY2FYHXxgQmMtj2bRqiLFGapiVe1_QVKDrkqXcm83aFEXnMYCZ-xlyHy >> >> > >> >> >> : >> >> >> [image: image.png] >> >> >> >> >> >> -ej >> >> >> >> >> >> On Wed, Apr 15, 2026 at 8:22 AM Dmitri Bourlatchkov < >> [email protected]> >> >> >> wrote: >> >> >> >> >> >>> Hi All, >> >> >>> >> >> >>> Heads up: The current state of PR [4115] looks pretty solid to me. >> I >> >> >>> believe this PR is approaching a mergeable condition. >> >> >>> >> >> >>> Please post your reviews if you have any comments. >> >> >>> >> >> >>> [4115] https://github.com/apache/polaris/pull/4115 >> >> >>> >> >> >>> Thanks, >> >> >>> Dmitri. >> >> >>> >> >> >>> On Tue, Mar 3, 2026 at 3:29 PM Anand Kumar Sankaran via dev < >> >> >>> [email protected]> wrote: >> >> >>> >> >> >>> > Hi Yufei and Dmitri, >> >> >>> > >> >> >>> > Here is a proposal for the REST endpoints for metrics and events. 
>> >> >>> > >> >> >>> > https://github.com/apache/polaris/pull/3924/changes >> >> >>> > >> >> >>> > I did not see any precursors for raising a PR for proposals, so >> >> trying >> >> >>> > this. Please let me know what you think. >> >> >>> > >> >> >>> > - >> >> >>> > Anand >> >> >>> > >> >> >>> > From: Anand Kumar Sankaran <[email protected]> >> >> >>> > Date: Monday, March 2, 2026 at 10:25 AM >> >> >>> > To: [email protected] <[email protected]> >> >> >>> > Subject: Re: Polaris Telemetry and Audit Trail >> >> >>> > >> >> >>> > About the REST API, based on my use cases: >> >> >>> > >> >> >>> > >> >> >>> > 1. >> >> >>> > I want to be able to query commit metrics to track files added / >> >> >>> removed >> >> >>> > per commit, along with record counts. The ingestion pipeline that >> >> >>> writes >> >> >>> > this data is owned by us and we are guaranteed to write this >> >> >>> information >> >> >>> > for each write. >> >> >>> > 2. >> >> >>> > I want to be able to query scan metrics for read. I understand >> >> clients >> >> >>> do >> >> >>> > not fulfill this requirement. >> >> >>> > 3. >> >> >>> > I want to be able to query the events table (events are >> persisted) - >> >> >>> this >> >> >>> > may supersede #2, I am not sure yet. >> >> >>> > >> >> >>> > All this information is in the JDBC based persistence model and >> is >> >> >>> > persisted in the metastore. I currently don’t have a need to >> query >> >> >>> > prometheus or open telemetry. I do publish some events to >> Prometheus >> >> >>> and >> >> >>> > they are forwarded to our dashboards elsewhere. >> >> >>> > >> >> >>> > About the CLI utilities, I meant the admin user utilities. In >> one of >> >> >>> the >> >> >>> > earliest drafts of my proposal, Prashant mentioned that the >> metrics >> >> >>> tables >> >> >>> > can grow indefinitely and that a similar problem exists with the >> >> events >> >> >>> > table as well. 
We discussed that cleaning up of old records from >> >> both >> >> >>> > metrics tables and events tables can be done via a CLI utility. >> >> >>> > >> >> >>> > I see that Yufei has covered the discussion about datasources. >> >> >>> > >> >> >>> > - >> >> >>> > Anand >> >> >>> > >> >> >>> > >> >> >>> > >> >> >>> > From: Yufei Gu <[email protected]> >> >> >>> > Date: Friday, February 27, 2026 at 9:54 PM >> >> >>> > To: [email protected] <[email protected]> >> >> >>> > Subject: Re: Polaris Telemetry and Audit Trail >> >> >>> > >> >> >>> > As I mentioned in >> >> >>> > https://github.com/apache/polaris/issues/3890, >> >> >>> > supporting >> >> >>> > multiple data sources is not a trivial change. I would strongly >> >> >>> recommend >> >> >>> > starting with a design document to carefully evaluate the >> >> architectural >> >> >>> > implications and long term impact. >> >> >>> > >> >> >>> > A REST endpoint to query metrics seems reasonable given the >> current >> >> >>> JDBC >> >> >>> > based persistence model. That said, we may also consider >> alternative >> >> >>> > storage models. For example, if we later adopt a time series >> system >> >> >>> such as >> >> >>> > Prometheus to store metrics, the query model and access patterns >> >> would >> >> >>> be >> >> >>> > fundamentally different. Designing the REST API without >> considering >> >> >>> these >> >> >>> > potential evolutions may limit flexibility. 
I'd suggest starting >> >> with >> >> >>> the >> >> >>> > use case. >> >> >>> > >> >> >>> > Yufei >> >> >>> > >> >> >>> > >> >> >>> > On Fri, Feb 27, 2026 at 3:42 PM Dmitri Bourlatchkov < >> >> [email protected]> >> >> >>> > wrote: >> >> >>> > >> >> >>> > > Hi Anand, >> >> >>> > > >> >> >>> > > Sharing my view... subject to discussion: >> >> >>> > > >> >> >>> > > 1. Adding a non-IRC REST API to Polaris is perfectly fine. >> >> >>> > > >> >> >>> > > Figuring out specific endpoint URIs and payloads might require >> a >> >> few >> >> >>> > > roundtrips, so opening a separate thread for that might be >> best. >> >> >>> > > Contributors commonly create Google Docs for new API proposals >> too >> >> >>> (they >> >> >>> > > are fairly easy to update as the email discussion progresses). >> >> >>> > > >> >> >>> > > There was a suggestion to try Markdown (with PRs) for proposals >> >> [1] >> >> >>> ... >> >> >>> > > feel free to give it a try if you are comfortable with that. >> >> >>> > > >> >> >>> > > 2. Could you clarify whether you mean end user utilities or >> admin >> >> >>> user >> >> >>> > > utilities? In the latter case those might be more suitable for >> the >> >> >>> Admin >> >> >>> > > CLI (Java), not the Python CLI, IMHO. >> >> >>> > > >> >> >>> > > Why would these utilities be common with events? IMHO, event >> use >> >> >>> cases >> >> >>> > are >> >> >>> > > distinct from scan/commit metrics. >> >> >>> > > >> >> >>> > > 3. I'd prefer separating metrics persistence from MetaStore >> >> >>> persistence >> >> >>> > at >> >> >>> > > the code level, so that they could be mixed and matched >> >> >>> independently. >> >> >>> > The >> >> >>> > > separate datasource question will become a non-issue with that >> >> >>> approach, >> >> >>> > I >> >> >>> > > guess. >> >> >>> > > >> >> >>> > > The rationale for separating scan metrics and metastore >> >> persistence >> >> >>> is >> >> >>> > that >> >> >>> > > "cascading deletes" between them are hardly ever required. 
>> >> >>> Furthermore, >> >> >>> > the >> >> >>> > > data and query patterns are very different so different >> >> technologies >> >> >>> > might >> >> >>> > > be beneficial in each case. >> >> >>> > > >> >> >>> > > [1] https://lists.apache.org/thread/yto2wp982t43h1mqjwnslswhws5z47cy >> >> >>> > > >> >> >>> > > Cheers, >> >> >>> > > Dmitri. >> >> >>> > > >> >> >>> > > On Fri, Feb 27, 2026 at 6:19 PM Anand Kumar Sankaran via dev < >> >> >>> > > [email protected]> wrote: >> >> >>> > > >> >> >>> > > > Thanks all. This PR is merged now. >> >> >>> > > > >> >> >>> > > > Here are the follow-up features / work needed. These were >> all >> >> >>> part of >> >> >>> > > the >> >> >>> > > > merged PR at some point in time and were removed to reduce >> >> scope. >> >> >>> > > > >> >> >>> > > > Please let me know what you think. >> >> >>> > > > >> >> >>> > > > >> >> >>> > > > 1. A REST API to paginate through table metrics. This >> will be >> >> >>> > a non-IRC >> >> >>> > > > standard addition. >> >> >>> > > > 2. Utilities for managing old records, should be common >> with >> >> >>> events. >> >> >>> > > > There was some discussion that it belongs to the CLI. >> >> >>> > > > 3. Separate datasource (metrics, events, even other >> tables?). >> >> >>> > > > >> >> >>> > > > >> >> >>> > > > Anything else? >> >> >>> > > > >> >> >>> > > > - >> >> >>> > > > Anand >> >> >>> > > > >> >> >>> > > > >> >> >>> > > >> >> >>> > >> >> >>> > >> >> >>> >> >> >> >> >> >> > >> >
