Thanks for the proposal. I think this will be a valuable addition, and I wonder if it makes sense to add a similar proposal to optionally monitor the binary protocol in the same way.
One minor question about naming. I see that we have the following boolean configurations with "metrics" in the name: private boolean replicationMetricsEnabled = true; private boolean authenticateMetricsEndpoint = false; private boolean exposeTopicLevelMetricsInPrometheus = true; private boolean metricsBufferResponse = false; private boolean exposeConsumerLevelMetricsInPrometheus = false; private boolean exposeProducerLevelMetricsInPrometheus = false; private boolean exposeManagedLedgerMetricsInPrometheus = true; private boolean exposeManagedCursorMetricsInPrometheus = false; private boolean exposeBundlesMetricsInPrometheus = false; I notice the trend as "expose.*MetricsInPrometheus". What do you think about changing "enablePerRestEndpointMetrics" to "exposePerRestEndpointMetricsInPrometheus"? Thanks, Michael On Thu, Nov 24, 2022 at 11:30 PM Haiting Jiang <jianghait...@gmail.com> wrote: > > +1, Great work. > > > Haiting > > On Thu, Nov 24, 2022 at 6:25 PM Jiuming Tao > <jm...@streamnative.io.invalid> wrote: > > > > Hi Haiting, > > > > I’ll update the PIP as we discussed: > > > > 1. `pulsar_broker_rest_endpoint_failed` > > Labels: > > a. path: The HTTP request path > > b. method: The HTTP request method > > c. code: The HTTP response code > > This metric only records HTTP requests which response code >= 400 > > > > 2. `pulsar_broker_rest_endpoint_latency` > > Labels: > > a. path: The HTTP request path > > b. method: The HTTP request method > > c. code: The HTTP response code > > This metric records all the HTTP requests, including failed requests. > > > > > > Thanks, > > Tao Jiuming > > > > Haiting Jiang <jianghait...@gmail.com> 于2022年11月24日周四 16:35写道: > > > > > Hi Jiuming, > > > > > > > I’m not sure that if observe failed requests latency is meaningful. > > > > > > 1. Normally for failed requests, the latency sometimes will help in > > > debugging. > > > 2. Add httpcode label to latency, we can analyze the latency between > > > different successful results too, like 200 and 307. > > > > > > As for `pulsar_broker_rest_endpoint_failed`, after a second thought. I > > > think it would be useful for operators to set up system alarms. So +1 > > > on this. > > > > > > Thanks, > > > Haiting > > > > > > On Thu, Nov 24, 2022 at 12:52 AM Jiuming Tao > > > <jm...@streamnative.io.invalid> wrote: > > > > > > > > Hi Haiting, > > > > > > > > I’m not sure that if observe failed requests latency is meaningful, so > > > that I was added `pulsar_broker_rest_endpoint_failed` to record these > > > requests which is failed. > > > > > > > > Thanks, > > > > Tao Jiuming > > > > > > > > > 2022年11月23日 下午12:55,Haiting Jiang <jianghait...@gmail.com> 写道: > > > > > > > > > > Hi Jiuming, > > > > > > > > > > I am not sure why we need a new metric for failed requests. > > > > > Can we just put the `httpcode` tag also in the > > > > > `pulsar_broker_rest_endpoint_latency_ms`? > > > > > So that user can also see the latency of failed requests (like timeout > > > > > requests). > > > > > And it can easily cover the case of failed metrics. > > > > > > > > > > Thanks, > > > > > Haiting > > > > > > > > > > On Wed, Nov 23, 2022 at 12:00 PM Jiuming Tao > > > > > <jm...@streamnative.io.invalid> wrote: > > > > >> > > > > >> Hi Penghui, > > > > >> > > > > >>> We'd better add an option to enable or disable the endpoint-level > > > metrics. > > > > >>> And keep it as disabled as default. > > > > >> > > > > >> Agreed, I also considered if we need a configuration to > > > enable/disable the feature, seems add an option is better > > > > >> > > > > >>> So that we can > > > > >>> get the request path from the handle method > > > > >>> `public void handle(String path, Request baseRequest, > > > HttpServletRequest > > > > >>> request, HttpServletResponse response)` > > > > >> > > > > >> I’ve tried this, but the `path` is the request path like > > > `/user/111/query`, and we cannot get the path configured by @Path by using > > > `Handler`. > > > > >> Jetty is a Servlet container, and we are using Jersey to provide Rest > > > API services, I think use Jersey’s mechanism is better > > > > >> > > > > >> Thanks, > > > > >> Tao Jiuming > > > > >> > > > > >> > > > > >> > > > > >>> 2022年11月23日 上午9:37,PengHui Li <peng...@apache.org> 写道: > > > > >>> > > > > >>> Hi, Jiuming > > > > >>> > > > > >>> Thanks for starting the proposal. > > > > >>> > > > > >>> We'd better add an option to enable or disable the endpoint-level > > > metrics. > > > > >>> And keep it as disabled as default. > > > > >>> > > > > >>> I noticed the existing jetty metrics are based on > > > > >>> `org.eclipse.jetty.server.handler.StatisticsHandler`. > > > > >>> Can we just have a new StatisticsHandler? e.g. > > > EndpointStatisticsHandler. > > > > >>> So that we can > > > > >>> get the request path from the handle method > > > > >>> `public void handle(String path, Request baseRequest, > > > HttpServletRequest > > > > >>> request, HttpServletResponse response)` > > > > >>> > > > > >>> Thanks, > > > > >>> Penghui > > > > >>> > > > > >>> On Tue, Nov 22, 2022 at 5:20 PM Jiuming Tao > > > <jm...@streamnative.io.invalid> > > > > >>> wrote: > > > > >>> > > > > >>>> Hi Haiting, > > > > >>>> > > > > >>>> I'm sorry I didn't explain FAILED, the FAILED means the HTTP > > > response code > > > > >>>>> = 400, and I’ll update the PIP later. > > > > >>>> > > > > >>>> Thanks, > > > > >>>> Tao Jiuming > > > > >>>> > > > > >>>>> 2022年11月22日 下午5:15,Haiting Jiang <jianghait...@gmail.com> 写道: > > > > >>>>> > > > > >>>>> Hi Jiuming, > > > > >>>>> > > > > >>>>> Overall, this PIP makes sense to me. > > > > >>>>> About the metric "pulsar_broker_rest_endpoint_failed", please > > > provide > > > > >>>>> a more clear definition of "fail". Are redirects like 403 > > > > >>>>> included? > > > > >>>>> > > > > >>>>> Thanks, > > > > >>>>> Haiting > > > > >>>>> > > > > >>>>> On Tue, Nov 22, 2022 at 3:17 AM Jiuming Tao > > > > >>>>> <jm...@streamnative.io.invalid> wrote: > > > > >>>>>> > > > > >>>>>> Hi pulsar community, > > > > >>>>>> > > > > >>>>>> I’ve opened a PIP to discuss: PIP-223: Add metrics for all Rest > > > > >>>> Endpoints > > > > >>>>>> > > > > >>>>>> The PIP link: https://github.com/apache/pulsar/issues/18560 < > > > > >>>> https://github.com/apache/pulsar/issues/18560> > > > > >>>>>> > > > > >>>>>> Thanks, > > > > >>>>>> Tao Jiuming > > > > >>>> > > > > >>>> > > > > >> > > > > > > >