Hi Anand,

I think the proposal is moving in a better direction, especially on the
Events side, and I appreciate the iteration so far. That said, I still have
some concerns about the Metrics side, but they are less about individual
parameters or endpoint shape and more about the product boundary.

2 cents: I wonder if it would help to evaluate the Events API and Metrics
API a bit more independently.

The Events side feels relatively close to Polaris' catalog/change-log
scope. It is easier to justify as part of the core/community surface,
especially if the goal is to expose completed catalog mutations in a way
that aligns with Iceberg-style events.

The Metrics side feels different to me. Once we start adding more and more
type-specific filters, query semantics, and schema shape for individual
metric families, it seems easy for Polaris to drift toward a built-in
observability backend. My bias would be for Polaris to support a smaller
set of community-recognized built-in metrics well, while providing good
extensibility points for deployments that want richer querying,
visualization, or use-case-specific metrics.
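
To make "extensibility points" concrete, here is a rough sketch of what a
plugin-style metrics sink contract could look like. All names here are
hypothetical, not an existing Polaris interface; this is only meant to show
the shape of the boundary, with core emitting a small built-in set and
deployments plugging in their own backends:

```python
from dataclasses import dataclass, field
from typing import Protocol

@dataclass(frozen=True)
class MetricRecord:
    """One built-in metric emitted by core (hypothetical shape)."""
    family: str        # e.g. "commit" or "scan"
    table_id: str
    timestamp_ms: int
    values: dict = field(default_factory=dict)  # e.g. {"files_added": 3}

class MetricsSink(Protocol):
    """The extensibility point: deployments plug in their own backend."""
    def accept(self, record: MetricRecord) -> None: ...

class InMemorySink:
    """Trivial sink, here only to illustrate the contract."""
    def __init__(self) -> None:
        self.records: list[MetricRecord] = []

    def accept(self, record: MetricRecord) -> None:
        self.records.append(record)

def emit(sinks: list, record: MetricRecord) -> None:
    # Core fans each built-in metric out to all registered sinks, keeping
    # querying/visualization concerns out of Polaris core itself.
    for sink in sinks:
        sink.accept(record)
```

A deployment could then register a JDBC-backed sink, a Prometheus exporter,
or anything else, without core growing per-backend query features.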

Related to that, I am not yet convinced the current metrics model is
generic enough as a long-term direction. Even after consolidating to a
single endpoint, the design still feels fairly tied to the current
scan/commit shape. I worry that, without a more generic model, each new
metric family will keep pulling us into more storage/schema/API reshaping
inside Polaris core.
So the framing question I would suggest is something like:
> What is the minimal built-in metrics surface Polaris should own in core,
> and where should we instead rely on extensibility / sink-export /
> plugin-style integration?
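
As one illustration of what a more generic core surface could look like
(purely a sketch; the field names are made up), a single envelope with a
family tag and an opaque attributes map would let a new metric family land
without new storage/schema/API reshaping:

```python
def make_envelope(family: str, table_id: str, ts_ms: int, attrs: dict) -> dict:
    # A new metric family only introduces a new "family" value and new
    # attribute keys, not new columns or endpoints in core.
    return {
        "family": family,           # "commit", "scan", or a future family
        "table": table_id,
        "timestamp_ms": ts_ms,
        "attributes": dict(attrs),  # family-specific payload, opaque to core
    }

# Two different families share one shape; a single generic endpoint could
# then filter on "family" without per-family query semantics in core.
commit = make_envelope("commit", "db.orders", 1700000000000,
                       {"files_added": 5, "records_added": 1200})
scan = make_envelope("scan", "db.orders", 1700000500000,
                     {"files_scanned": 42})
```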

For me, getting that boundary right matters more than settling every query
parameter detail first.

-ej

On Tue, Mar 3, 2026 at 12:29 PM Anand Kumar Sankaran via dev <
[email protected]> wrote:

> Hi Yufei and Dmitri,
>
> Here is a proposal for the REST endpoints for metrics and events.
>
> https://github.com/apache/polaris/pull/3924/changes
>
> I did not see any precedent for raising a PR for proposals, so I am
> trying this.  Please let me know what you think.
>
> -
> Anand
>
> From: Anand Kumar Sankaran <[email protected]>
> Date: Monday, March 2, 2026 at 10:25 AM
> To: [email protected] <[email protected]>
> Subject: Re: Polaris Telemetry and Audit Trail
>
> About the REST API, based on my use cases:
>
>
>   1.  I want to be able to query commit metrics to track files added /
>       removed per commit, along with record counts. The ingestion pipeline
>       that writes this data is owned by us, and we are guaranteed to write
>       this information for each write.
>   2.  I want to be able to query scan metrics for reads. I understand
>       clients do not fulfill this requirement.
>   3.  I want to be able to query the events table (events are persisted);
>       this may supersede #2, I am not sure yet.
>
> All this information is in the JDBC-based persistence model and is
> persisted in the metastore. I currently don’t have a need to query
> Prometheus or OpenTelemetry. I do publish some events to Prometheus, and
> they are forwarded to our dashboards elsewhere.
>
> About the CLI utilities, I meant the admin user utilities. In one of the
> earliest drafts of my proposal, Prashant mentioned that the metrics tables
> can grow indefinitely and that a similar problem exists with the events
> table as well. We discussed that cleaning up of old records from both
> metrics tables and events tables can be done via a CLI utility.
>
> I see that Yufei has covered the discussion about datasources.
>
> -
> Anand
>
>
>
> From: Yufei Gu <[email protected]>
> Date: Friday, February 27, 2026 at 9:54 PM
> To: [email protected] <[email protected]>
> Subject: Re: Polaris Telemetry and Audit Trail
>
> As I mentioned in https://github.com/apache/polaris/issues/3890,
> supporting multiple data sources is not a trivial change. I would
> strongly recommend starting with a design document to carefully evaluate
> the architectural implications and long-term impact.
>
> A REST endpoint to query metrics seems reasonable given the current
> JDBC-based persistence model. That said, we may also consider alternative
> storage models. For example, if we later adopt a time series system such as
> Prometheus to store metrics, the query model and access patterns would be
> fundamentally different. Designing the REST API without considering these
> potential evolutions may limit flexibility. I'd suggest starting with the
> use cases.
>
> Yufei
>
>
> On Fri, Feb 27, 2026 at 3:42 PM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
> > Hi Anand,
> >
> > Sharing my view... subject to discussion:
> >
> > 1. Adding a non-IRC REST API to Polaris is perfectly fine.
> >
> > Figuring out specific endpoint URIs and payloads might require a few
> > roundtrips, so opening a separate thread for that might be best.
> > Contributors commonly create Google Docs for new API proposals too (they
> > are fairly easy to update as the email discussion progresses).
> >
> > There was a suggestion to try Markdown (with PRs) for proposals [1] ...
> > feel free to give it a try if you are comfortable with that.
> >
> > 2. Could you clarify whether you mean end user utilities or admin user
> > utilities? In the latter case those might be more suitable for the Admin
> > CLI (java) not the Python CLI, IMHO.
> >
> > Why would these utilities be common with events? IMHO, event use cases
> > are distinct from scan/commit metrics.
> >
> > 3. I'd prefer separating metrics persistence from MetaStore persistence
> > at the code level, so that they could be mixed and matched independently.
> > The separate datasource question will become a non-issue with that
> > approach, I guess.
> >
> > The rationale for separating scan metrics and metastore persistence is
> > that "cascading deletes" between them are hardly ever required.
> > Furthermore, the data and query patterns are very different, so
> > different technologies might be beneficial in each case.
> >
> > [1]
> > https://lists.apache.org/thread/yto2wp982t43h1mqjwnslswhws5z47cy
> >
> > Cheers,
> > Dmitri.
> >
> > On Fri, Feb 27, 2026 at 6:19 PM Anand Kumar Sankaran via dev <
> > [email protected]> wrote:
> >
> > > Thanks all. This PR is merged now.
> > >
> > > Here are the follow-up features / work needed.  These were all part of
> > > the merged PR at some point in time and were removed to reduce scope.
> > >
> > > Please let me know what you think.
> > >
> > >
> > >   1.  A REST API to paginate through table metrics. This will be a
> > >       non-IRC standard addition.
> > >   2.  Utilities for managing old records; these should be common with
> > >       events. There was some discussion that this belongs in the CLI.
> > >   3.  Separate datasource (metrics, events, even other tables?).
> > >
> > >
> > > Anything else?
> > >
> > > -
> > > Anand
> > >
> > >
> >
>
>
