It looks like prometheus-pushgateway.discovery.wmnet (as documented in 
https://wikitech.wikimedia.org/wiki/Prometheus#Ephemeral_jobs_(Pushgateway)) is 
not reachable from my VPS instance:

$ traceroute prometheus-pushgateway.discovery.wmnet
traceroute to prometheus-pushgateway.discovery.wmnet (10.64.0.82), 30 hops max, 60 byte packets
 1  vlan-legacy.cloudinstances2b-gw.svc.eqiad1.wikimedia.cloud (172.16.0.1)  0.657 ms  0.632 ms  0.563 ms
 2  vlan1107.cloudgw1004.eqiad1.wikimediacloud.org (185.15.56.234)  0.513 ms  0.486 ms  0.440 ms
 3  * * *
...
30  * * *

Is that the correct host to be using?



> On May 4, 2025, at 5:39 PM, Roy Smith <r...@panix.com> wrote:
> 
> Thanks for the input.  Yes, in the statsd world, these are what I would have 
> called gauges.  Histograms might be nice, but to get started, just the raw 
> gauges will be a useful improvement over what we have now, so I figured I'd 
> start with that.  And, yes, I expect I'll implement this in some Python 
> scripts launched by cron under the Toolforge jobs framework.
> 
> So, I guess if I wanted to do this on the command line, I would do:
> 
>   echo "some_metric 3.14" | curl --data-binary @- 
> http://prometheus-pushgateway.discovery.wmnet/???
> 
> where the ??? is the name of my job.  Do I just make up something that looks 
> reasonable, or is there some namespace that I get allocated for my metrics?
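> 
> (For what it's worth, the upstream Pushgateway docs say the standard API 
> path is /metrics/job/<job_name>, so assuming the WMF instance is a stock 
> Pushgateway, I'd guess the full command looks something like this, with 
> "my_batch_job" being a name I just made up:
> 
>   echo "some_metric 3.14" | curl --data-binary @- http://prometheus-pushgateway.discovery.wmnet/metrics/job/my_batch_job
> 
> Apparently extra grouping labels can be appended after the job name as 
> /<label>/<value> pairs.)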
> 
> 
> 
>> On May 4, 2025, at 3:07 PM, Federico Leva (Nemo) <nemow...@gmail.com> wrote:
>> 
>> On 01/05/25 20:17, Roy Smith wrote:
>>> I want a graph vs. time, which is what statsd/graphite was good at, so I 
>>> assumed Prometheus would also be good at it.  Why is this silly?
>> 
>> It's not silly at all! If you use standard Prometheus metrics and some 
>> labels, you can later also get some basic statistical analysis for free in 
>> Grafana.
>> 
>> What you described is called a Prometheus exporter. It would take the raw 
>> data (from the MediaWiki API?) and output the metrics in Prometheus format. 
>> You can hand-craft the metrics even in bash, but something like Python or 
>> Rust, where you have both MediaWiki and Prometheus client libraries, will 
>> probably be easiest.
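>> 
>> As a sketch, a push with the official Python client (prometheus_client) 
>> could look like the following; the metric and job names here are invented, 
>> and the gateway address is whatever the WMF setup actually expects:
>> 
>>   from prometheus_client import CollectorRegistry, Gauge, push_to_gateway
>> 
>>   # A fresh registry, so only your own metrics get pushed.
>>   registry = CollectorRegistry()
>>   open_proposals = Gauge("open_proposals", "Proposals currently open",
>>                          registry=registry)
>>   open_proposals.set(42)  # in real life, a value from the MediaWiki API
>> 
>>   # The Pushgateway groups pushed metrics by job name.
>>   push_to_gateway("prometheus-pushgateway.discovery.wmnet",
>>                   job="my_batch_job", registry=registry)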
>> 
>> The Pushgateway is the traditional solution for a batch job like this. I 
>> don't know how authentication etc. is handled at WMF, though.
>> 
>> The metrics you described are mostly gauges. For things like the time spent 
>> sitting in queues, you may want a histogram (so you can calculate e.g. the 
>> 75th percentile or the longest-waiting proposal). This is definitely best 
>> done with a Prometheus library (but make sure to manually set the buckets to 
>> some reasonable intervals, probably in terms of hours and days, otherwise 
>> you might get some unhelpful defaults starting from ms).
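>> 
>> A sketch of that with the Python client, with a made-up metric name and 
>> bucket boundaries (in seconds, since that is the Prometheus convention):
>> 
>>   from prometheus_client import Histogram
>> 
>>   HOUR = 3600
>>   DAY = 24 * HOUR
>>   wait_time = Histogram(
>>       "proposal_wait_seconds", "Time proposals spend sitting in the queue",
>>       buckets=(HOUR, 6 * HOUR, DAY, 3 * DAY, 7 * DAY, 30 * DAY,
>>                float("inf")),
>>   )
>>   wait_time.observe(5 * DAY)  # record one observation, in seconds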
>> 
>> https://www.robustperception.io/how-does-a-prometheus-histogram-work/
>> https://prometheus.io/docs/practices/histograms/
>> 
>> Best,
>>      Federico
> 

_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
