Hi Anand,

I think the proposal is moving in a better direction, especially on the Events side, and I appreciate the iteration so far. That said, I still have some concerns about the Metrics side, but they are less about individual parameters or endpoint shape and more about the product boundary.
2 cents: I wonder if it would help to evaluate the Events API and the Metrics API a bit more independently.

The Events side feels relatively close to Polaris' catalog/change-log scope. It is easier to justify as part of the core/community surface, especially if the goal is to expose completed catalog mutations in a way that aligns with Iceberg-style events.

The Metrics side feels different to me. Once we start adding more and more type-specific filters, query semantics, and schema shapes for individual metric families, it becomes easy for Polaris to drift toward being a built-in observability backend. My bias would be for Polaris to support a small set of community-recognized built-in metrics well, while providing good extensibility points for deployments that want richer querying, visualization, or use-case-specific metrics.

Related to that, I am not yet convinced the current metrics model is generic enough as a long-term direction. Even after consolidating to a single endpoint, the design still feels fairly tied to the current scan/commit shape. I worry that otherwise each new metric family will keep pulling us into more storage/schema/API reshaping inside Polaris core.

So the framing question I would suggest is something like:

> What is the minimal built-in metrics surface Polaris should own in core, and where should we instead rely on extensibility / sink-export / plugin-style integration?

For me, getting that boundary right matters more than settling every query parameter detail first.

-ej

On Tue, Mar 3, 2026 at 12:29 PM Anand Kumar Sankaran via dev <[email protected]> wrote:

> Hi Yufei and Dmitri,
>
> Here is a proposal for the REST endpoints for metrics and events.
>
> https://github.com/apache/polaris/pull/3924/changes
>
> I did not see any precedents for raising a PR for proposals, so I am trying this. Please let me know what you think.
>
> -
> Anand
>
> From: Anand Kumar Sankaran <[email protected]>
> Date: Monday, March 2, 2026 at 10:25 AM
> To: [email protected] <[email protected]>
> Subject: Re: Polaris Telemetry and Audit Trail
>
> About the REST API, based on my use cases:
>
> 1. I want to be able to query commit metrics to track files added / removed per commit, along with record counts. The ingestion pipeline that writes this data is owned by us, and we are guaranteed to write this information for each write.
> 2. I want to be able to query scan metrics for reads. I understand clients do not fulfill this requirement.
> 3. I want to be able to query the events table (events are persisted) - this may supersede #2, I am not sure yet.
>
> All this information is in the JDBC-based persistence model and is persisted in the metastore. I currently don't have a need to query Prometheus or OpenTelemetry. I do publish some events to Prometheus, and they are forwarded to our dashboards elsewhere.
>
> About the CLI utilities, I meant the admin user utilities. In one of the earliest drafts of my proposal, Prashant mentioned that the metrics tables can grow indefinitely and that a similar problem exists with the events table as well. We discussed that cleaning up old records from both the metrics tables and the events tables can be done via a CLI utility.
>
> I see that Yufei has covered the discussion about datasources.
>
> -
> Anand
>
> From: Yufei Gu <[email protected]>
> Date: Friday, February 27, 2026 at 9:54 PM
> To: [email protected] <[email protected]>
> Subject: Re: Polaris Telemetry and Audit Trail
>
> As I mentioned in https://github.com/apache/polaris/issues/3890, supporting multiple data sources is not a trivial change. I would strongly recommend starting with a design document to carefully evaluate the architectural implications and long-term impact.
>
> A REST endpoint to query metrics seems reasonable given the current JDBC-based persistence model. That said, we may also consider alternative storage models. For example, if we later adopt a time-series system such as Prometheus to store metrics, the query model and access patterns would be fundamentally different. Designing the REST API without considering these potential evolutions may limit flexibility. I'd suggest starting with the use case.
>
> Yufei
>
> On Fri, Feb 27, 2026 at 3:42 PM Dmitri Bourlatchkov <[email protected]> wrote:
>
> > Hi Anand,
> >
> > Sharing my view... subject to discussion:
> >
> > 1. Adding a non-IRC REST API to Polaris is perfectly fine.
> >
> > Figuring out specific endpoint URIs and payloads might require a few roundtrips, so opening a separate thread for that might be best. Contributors commonly create Google Docs for new API proposals too (they are fairly easy to update as the email discussion progresses).
> >
> > There was a suggestion to try Markdown (with PRs) for proposals [1] ... feel free to give it a try if you are comfortable with that.
> >
> > 2. Could you clarify whether you mean end user utilities or admin user utilities? In the latter case, those might be more suitable for the Admin CLI (Java), not the Python CLI, IMHO.
> >
> > Why would these utilities be common with events? IMHO, event use cases are distinct from scan/commit metrics.
> >
> > 3. I'd prefer separating metrics persistence from MetaStore persistence at the code level, so that they could be mixed and matched independently. The separate-datasource question will become a non-issue with that approach, I guess.
> >
> > The rationale for separating scan metrics and metastore persistence is that "cascading deletes" between them are hardly ever required. Furthermore, the data and query patterns are very different, so different technologies might be beneficial in each case.
> >
> > [1] https://lists.apache.org/thread/yto2wp982t43h1mqjwnslswhws5z47cy
> >
> > Cheers,
> > Dmitri.
> >
> > On Fri, Feb 27, 2026 at 6:19 PM Anand Kumar Sankaran via dev <[email protected]> wrote:
> >
> > > Thanks all. This PR is merged now.
> > >
> > > Here are the follow-up features / work needed. These were all part of the merged PR at some point in time and were removed to reduce scope.
> > >
> > > Please let me know what you think.
> > >
> > > 1. A REST API to paginate through table metrics. This will be a non-IRC-standard addition.
> > > 2. Utilities for managing old records; these should be common with events. There was some discussion that this belongs in the CLI.
> > > 3. Separate datasource (metrics, events, even other tables?).
> > >
> > > Anything else?
> > >
> > > -
> > > Anand
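P.S. To make the "extensibility / sink-export / plugin-style integration" framing at the top of this thread a bit more concrete, here is a rough Java sketch of what I have in mind. To be clear, none of these names (MetricsSink, MetricRecord, InMemorySink) exist in Polaris today; this is only a hypothetical illustration of the boundary, not a proposed implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class MetricsSinkSketch {

    // A minimal, family-agnostic metric record. Core stays generic:
    // no per-family schema or query semantics baked into Polaris.
    record MetricRecord(String family, String tableId, long timestampMillis, long value) {}

    // The extension point: deployments plug in exporters to Prometheus,
    // a warehouse table, etc. Core only ever calls accept().
    interface MetricsSink {
        void accept(MetricRecord record);
    }

    // A built-in default sink: a simple in-memory buffer that a small
    // core REST surface could paginate over, without Polaris owning a
    // full observability backend.
    static final class InMemorySink implements MetricsSink {
        private final List<MetricRecord> buffer = new ArrayList<>();

        @Override
        public void accept(MetricRecord record) {
            buffer.add(record);
        }

        // The only query core would need: filter by metric family.
        List<MetricRecord> byFamily(String family) {
            return buffer.stream()
                    .filter(r -> r.family().equals(family))
                    .toList();
        }
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        sink.accept(new MetricRecord("commit", "db.tbl", 1L, 42));
        sink.accept(new MetricRecord("scan", "db.tbl", 2L, 7));
        System.out.println(sink.byFamily("commit").size()); // prints 1
    }
}
```

The point of the sketch: anything richer than accept-and-page-by-family (time-series queries, dashboards, per-family filters) would live behind the sink interface in deployment-specific code, not in Polaris core.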
