It looks like prometheus-pushgateway.discovery.wmnet (as documented in https://wikitech.wikimedia.org/wiki/Prometheus#Ephemeral_jobs_(Pushgateway)) is not reachable from my VPS instance:
$ traceroute prometheus-pushgateway.discovery.wmnet
traceroute to prometheus-pushgateway.discovery.wmnet (10.64.0.82), 30 hops max, 60 byte packets
 1  vlan-legacy.cloudinstances2b-gw.svc.eqiad1.wikimedia.cloud (172.16.0.1)  0.657 ms  0.632 ms  0.563 ms
 2  vlan1107.cloudgw1004.eqiad1.wikimediacloud.org (185.15.56.234)  0.513 ms  0.486 ms  0.440 ms
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

Is that the correct host to be using?

> On May 4, 2025, at 5:39 PM, Roy Smith <r...@panix.com> wrote:
>
> Thanks for the input. Yes, in the statsd world, these are what I would have
> called gauges. Histograms might be nice, but to get started, just the raw
> gauges will be a useful improvement over what we have now, so I figure I'd
> start with that. And, yes, I expect I'll implement this in some Python
> scripts launched by cron under the Toolforge jobs framework.
>
> So, I guess if I wanted to do this on the command line, I would do:
>
>     echo "some_metric 3.14" | curl --data-binary @- http://prometheus-pushgateway.discovery.wmnet/???
>
> where the ??? is the name of my job. Do I just make up something that looks
> reasonable, or is there some namespace that I get allocated for my metrics?
>
>> On May 4, 2025, at 3:07 PM, Federico Leva (Nemo) <nemow...@gmail.com> wrote:
>>
>> On 01/05/25 at 20:17, Roy Smith wrote:
>>> I want a graph vs. time, which is what statsd/graphite was good at, so I
>>> assumed Prometheus would also be good at it. Why is this silly?
>>
>> It's not silly at all! If you use standard Prometheus metrics and some
>> labels, you can later also get some basic statistical analysis for free in
>> Grafana.
>>
>> What you described is called a Prometheus exporter. It would take the raw
>> data (from the MediaWiki API?) and output the metrics in Prometheus format.
>> You can hand-craft the metrics even in bash, but something like Python or
>> Rust, where you have both MediaWiki and Prometheus libraries, will probably
>> be easiest.
>>
>> The pushgateway is the traditional solution for a batch job like this. I
>> don't know how authentication etc. is handled at WMF, though.
>>
>> The metrics you described are mostly gauges. For things like the time spent
>> sitting in queues, you may want a histogram (so you can calculate e.g. the
>> 75th percentile or the longest-waiting proposal). This is definitely best
>> done with a Prometheus library (but make sure to manually set the buckets
>> to some reasonable intervals, probably in terms of hours and days,
>> otherwise you might get some unhelpful defaults starting from milliseconds).
>>
>> https://www.robustperception.io/how-does-a-prometheus-histogram-work/
>> https://prometheus.io/docs/practices/histograms/
>>
>> Best,
>> Federico
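(On the "???" in the quoted curl command: assuming the WMF service is the stock Prometheus Pushgateway, the push path is /metrics/job/<job_name>, optionally followed by /<label_name>/<label_value> pairs, so the full URL would look like http://prometheus-pushgateway.discovery.wmnet/metrics/job/some_job. The job name is something you make up yourself; the stock Pushgateway has no per-user namespaces, though WMF may have local naming conventions.)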
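Putting the pieces above together, here is a minimal sketch of the cron-style batch job using the prometheus_client Python library; the gateway address is taken from the thread, while the job name, metric names, and bucket boundaries are illustrative assumptions, not WMF conventions:

    # Minimal sketch: push one gauge and one histogram to a Pushgateway.
    # Assumes `pip install prometheus-client` and an unauthenticated gateway;
    # the job and metric names below are made up for illustration.
    from prometheus_client import CollectorRegistry, Gauge, Histogram, push_to_gateway

    registry = CollectorRegistry()

    # Gauge: a point-in-time value, e.g. how many proposals are open right now.
    open_proposals = Gauge('open_proposals', 'Number of open proposals',
                           registry=registry)
    open_proposals.set(42)

    # Histogram with hand-picked buckets in seconds (1h, 6h, 1d, 7d, 30d),
    # instead of the library defaults, which start at 5 ms.
    wait_time = Histogram('proposal_wait_seconds',
                          'Time proposals spend sitting in the queue',
                          buckets=[3600, 6 * 3600, 86400, 7 * 86400, 30 * 86400],
                          registry=registry)
    wait_time.observe(95000)  # one observation, in seconds

    # Builds and PUTs to http://<gateway>/metrics/job/some_job for you.
    push_to_gateway('prometheus-pushgateway.discovery.wmnet:80',
                    job='some_job', registry=registry)

Note that push_to_gateway() replaces everything previously pushed for that job group on each run, which is usually the semantics you want for a periodic cron job.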