2018-11-16 22:25:59 UTC - Rajan Dhabalia: @Sijie Guo @Matteo Merli right now, it seems like Bookie doesn't provide JVM stats (eg: heap, direct memory, gc stats) like Pulsar does.. I don't see them in: <https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookKeeperServerStats.java>. So, I'm thinking of adding JVM stats there.. can you please let me know if this would overlap with any existing change in bookie?
----
2018-11-16 22:36:27 UTC - Matteo Merli: These are exported from the Prometheus stats plugin
----
2018-11-16 22:38:37 UTC - Rajan Dhabalia: ok.. but what about someone who doesn't use Prometheus stats? Would it make sense to have this as part of bookie-stats? We are not using Prometheus stats and still need gc-stats
----
2018-11-16 22:40:35 UTC - Sijie Guo: oh I think most stats providers provide that, e.g. codahale, twitter finagle stats; those providers export those stats.
in yahoo’s stats library, do you guys have JVM stats? maybe you can enable that in your stats provider?
----
2018-11-16 22:43:14 UTC - Rajan Dhabalia: I see.. so, every provider will have its own logic to fetch and aggregate jvm-stats.. but isn't it better to have that in one common place (maybe in bookkeeper-server-stats) if every provider requires it?
----
2018-11-16 22:43:46 UTC - Matteo Merli: The thing is that each provider will also have different naming
----
2018-11-16 22:44:28 UTC - Matteo Merli: eg: in Prometheus it is convenient to keep the same metric names, so that a generic JVM dashboard is able to visualize data from every JVM process
----
2018-11-16 22:45:12 UTC - Rajan Dhabalia: ok
----
2018-11-17 00:42:45 UTC - Rajan Dhabalia: @Matteo Merli @Sijie Guo I can see direct-memory metrics in the Prometheus provider: <https://github.com/apache/bookkeeper/blob/master/bookkeeper-stats-providers/prometheus-metrics-provider/src/main/java/org/apache/bookkeeper/stats/prometheus/PrometheusMetricsProvider.java#L151> But I can't see them in any other provider, eg: codahale: <https://github.com/apache/bookkeeper/blob/master/bookkeeper-stats-providers/codahale-metrics-provider/src/main/java/org/apache/bookkeeper/stats/CodahaleMetricsProvider.java> Also, it seems we should have gc-metrics, eg: young-gc-count/time, old-gc-count/time.. JMX MBeans provide them but the values are always cumulative, so they will not give last-minute-window metrics. So, I think it would be better if we can add gc metrics as well.. any thoughts?
----
2018-11-17 00:43:27 UTC - Sijie Guo: <https://github.com/apache/bookkeeper/blob/master/bookkeeper-stats-providers/codahale-metrics-provider/src/main/java/org/apache/bookkeeper/stats/CodahaleMetricsProvider.java#L57>
----
2018-11-17 00:43:32 UTC - Sijie Guo: @Rajan Dhabalia
----
2018-11-17 00:44:25 UTC - Rajan Dhabalia: ok..
let me verify again whether it gives gc-count and time
----
2018-11-17 00:44:29 UTC - Matteo Merli: I had added the direct memory metric “directly” because, since netty gets it through Unsafe, it’s not being reported correctly otherwise
----
2018-11-17 00:44:39 UTC - Sijie Guo: The difficult part is that the JVM metrics are tied to a given gc algorithm
----
2018-11-17 00:45:45 UTC - Matteo Merli: Yes, in the broker we report the G1GC metric directly as well, but we also get the regular GC stats through the prometheus client lib
----
2018-11-17 00:45:49 UTC - Rajan Dhabalia: that's true.. so, I was referring to this pulsar jvm-metrics: <https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/stats/JvmMetrics.java#L93>
----
2018-11-17 00:46:12 UTC - Rajan Dhabalia: which is kind of useful
----
2018-11-17 00:46:45 UTC - Sijie Guo: I think we can add some common library for exposing those metrics, and let the stats providers enable them, if that works for you
----
2018-11-17 00:47:01 UTC - Matteo Merli: we get the cumulative values in Prometheus for the pause times and counts
----
2018-11-17 00:47:31 UTC - Matteo Merli: that makes it easy to get the increase (in the last minute) over the previous interval
----
2018-11-17 00:57:48 UTC - Rajan Dhabalia: cumulative values are sometimes not convenient for many monitoring tools, especially when setting up alerts.. because to get the actual value for a given minute, one has to consider the previous minute's stats as well.. and sometimes monitoring tools don't have the capability to alert on a derived value; they can only alert on the current absolute value, which is not helpful in the cumulative-stats case..
----
2018-11-17 00:58:27 UTC - Matteo Merli: Again, that’s why it’s kind of specific to the metric exporter plugin
----
2018-11-17 00:58:43 UTC - Matteo Merli: :slightly_smiling_face:
----
2018-11-17 00:58:46 UTC - Rajan Dhabalia: true..
----
2018-11-17 00:58:48 UTC - Rajan Dhabalia: ok
----
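Editor's note: the cumulative-versus-interval point raised in the thread can be illustrated with a minimal sketch. It reads the cumulative GC counters that the JVM exposes through the standard `java.lang.management` MXBeans and turns them into per-interval deltas by remembering the previous sample. The names `GcIntervalStats`, `Delta`, `update`, and `sample` are hypothetical and are not part of BookKeeper, Pulsar, or any stats provider; only the `ManagementFactory` and `GarbageCollectorMXBean` calls are JDK API. Collector names depend on the GC algorithm in use, as noted in the discussion.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (assumption, not BookKeeper or Pulsar code). */
final class GcIntervalStats {

    /** GC activity observed between two consecutive samples. */
    static final class Delta {
        final long count;
        final long timeMillis;

        Delta(long count, long timeMillis) {
            this.count = count;
            this.timeMillis = timeMillis;
        }
    }

    // Last cumulative {count, timeMillis} seen for each collector name.
    private final Map<String, long[]> previous = new HashMap<>();

    /** Converts one cumulative reading into the increase since the last reading. */
    synchronized Delta update(String collector, long cumulativeCount, long cumulativeTimeMillis) {
        long[] prev = previous.put(collector, new long[] {cumulativeCount, cumulativeTimeMillis});
        if (prev == null) {
            // First sample: there is no earlier interval to compare with.
            return new Delta(0, 0);
        }
        return new Delta(
                Math.max(0, cumulativeCount - prev[0]),
                Math.max(0, cumulativeTimeMillis - prev[1]));
    }

    /** Samples every collector reported by the running JVM. */
    Map<String, Delta> sample() {
        Map<String, Delta> result = new HashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for a collector.
            long count = Math.max(0, gc.getCollectionCount());
            long time = Math.max(0, gc.getCollectionTime());
            result.put(gc.getName(), update(gc.getName(), count, time));
        }
        return result;
    }
}
```

Calling `sample()` once per reporting interval (for example every minute from a scheduled executor) would yield values that a tool limited to alerting on absolute numbers could use directly. With a Prometheus-style backend the same increase is normally derived at query time from the cumulative counter, which is why the thread treats this as specific to the exporter.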