Hi All, 
I have a metric called go_service_status, and I use the "sum without" 
aggregation to determine whether a service is up or down on a server. The 
service can be down on 2 master servers at the same time, and I am unable 
to work out a PromQL query that detects that situation. Example:

go_service_status{SERVICETYPE="grade1",SERVER_CATEGORY="db1",instance=~"server1:7878"}
can return 2 series:
go_service_status{HOSTNAME="server1", SERVER_CATEGORY="db1", 
SERVICETYPE="grade1", USER="admin", instance="server1:7878", 
job="customprocessexporter01"} 0
go_service_status{HOSTNAME="server1", SERVER_CATEGORY="db1", 
SERVICETYPE="grade1", USER="root", instance="server1:7878", 
job="customprocessexporter01"} 1

In the same way,
go_service_status{SERVICETYPE="grade1",SERVER_CATEGORY="db1",instance=~"server2:7878"}
can return 2 series:
go_service_status{HOSTNAME="server2", SERVER_CATEGORY="db1", 
SERVICETYPE="grade1", USER="admin", instance="server2:7878", 
job="customprocessexporter01"} 0
go_service_status{HOSTNAME="server2", SERVER_CATEGORY="db1", 
SERVICETYPE="grade1", USER="root", instance="server2:7878", 
job="customprocessexporter01"} 0  


Here is the query I use to work out the status of the service on 
server1:

(sum without (USER) (
  go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}
) < 1)

[image: Untitled.png]

So server1's service is only momentarily 0,


and server2's service is always down. Example:

(sum without (USER) (
  go_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}
) < 1)

[image: Untitled.png]


Next I tried to find the periods during which the service was down (0) 
on both server1 and server2 at the same time:

(sum without (USER) (
  go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}
) < 1)
and
(sum without (USER) (
  go_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}
) < 1)


I was expecting a graph similar to the one for server2, but I got:
[image: Untitled.png]

I think I need to ignore the HOSTNAME label, but I cannot work out how to 
ignore it in combination with the sum without clause.
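
For reference, here is a sketch of how that idea might be written. It is 
untested against this data. It assumes that, after sum without (USER), 
HOSTNAME and instance are the only labels that differ between the two sides 
(both differ in the sample series above), so both are listed in ignoring():

```promql
# Left side: server1 series with USER summed away, kept only while below 1.
(sum without (USER) (
  go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}
) < 1)
# "and" normally requires identical label sets on both sides; ignoring()
# leaves the listed labels out of that comparison.
and ignoring (HOSTNAME, instance)
# Right side: the same expression for server2.
(sum without (USER) (
  go_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}
) < 1)
```

If the assumption holds, the result should carry server1's labels and have 
samples only at the timestamps where both sides are below 1.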

Any help or hint for improving this query would be very useful, and would 
help me understand how the "and" operator behaves in the context of the 
sum without clause.

Thanks,
Puneet
