[prometheus-users] Re: black box exporter monitoring SSH and PING

nina guo Thu, 21 Apr 2022 01:22:39 -0700

*blackbox exporter config:*
icmp:
        prober: icmp
        icmp:
          preferred_ip_protocol: "ip4"
tcp:
        prober: tcp
        timeout: 5s
        tcp:
          preferred_ip_protocol: "ip4"


*Prometheus scrape config:*
global:
      scrape_interval: 60s
      evaluation_interval: 60s
scrape_configs:
      - job_name: PING
        metrics_path: /probe
        params:
          module: [icmp]
        file_sd_configs:
        - files:
          - '/etc/prometheus/targets/'
        relabel_configs:
          - source_labels: [__address__]
            target_label: __param_target
            regex: '([^:]+)(:[0-9]+)?'
            replacement: '${1}'
          - source_labels: [__param_target]
            target_label: instance
          - target_label: __address__
            replacement: prometheus-blackbox-exporter:9115
      - job_name: SSH
        metrics_path: /probe
        params:
          module: [ssh_banner]
        file_sd_configs:
        - files:
          - '/etc/prometheus/targets/'
        relabel_configs:
          - source_labels: [__address__]
            target_label: __param_target
            regex: '([^:]+)(:[0-9]+)?'
            replacement: '${1}:22'
          - source_labels: [__param_target]
            target_label: instance
          - target_label: __address__
            replacement: prometheus-blackbox-exporter:9115
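
To make the first relabel rule in each job easier to follow, here is a minimal Python sketch of what it does to a discovered address. It assumes that Prometheus anchors the relabel regex at both ends (approximated here with `re.fullmatch`) and that a non-matching address leaves the target label unset. The helper name `relabel_target` is hypothetical and only serves this illustration.

```python
import re

# Same pattern as in the relabel_configs above: host part, then an optional ":port".
RELABEL_REGEX = re.compile(r"([^:]+)(:[0-9]+)?")


def relabel_target(address, replacement):
    """Approximate a 'replace' relabel step on __address__.

    Returns the value that would be written to __param_target, or None
    when the regex does not match (assumed: the label is then left unset).
    """
    match = RELABEL_REGEX.fullmatch(address)
    if match is None:
        return None
    # Only the ${1} reference is used in this thread's configs.
    return replacement.replace("${1}", match.group(1))


if __name__ == "__main__":
    for addr in ("10.0.0.5:9100", "web01"):
        print(addr, "-> PING:", relabel_target(addr, "${1}"),
              "| SSH:", relabel_target(addr, "${1}:22"))
```

Under these assumptions the PING job strips any port from the target, while the SSH job always probes port 22 regardless of the port listed in the targets file.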

*Alert rules:*
        - alert: TargetDown
          expr: probe_success == 0
          for: 5s
          labels:
            severity: critical
          annotations:
            description: Service {{ $labels.instance }} is unreachable.
            value: DOWN ({{ $value }})
            summary: "Target {{ $labels.instance }} is down."

*Alert manager config:*
config.yml: |-
    global:
      resolve_timeout: 5m
      smtp_smarthost: mail
      smtp_from: alertmanager
      smtp_require_tls: false
    route:
      receiver: email-me
      group_by: [instance, alertname, job]
      group_wait: 45s
      group_interval: 5m
      repeat_interval: 24h
    receivers:
    - name: email-me
      email_configs:
      - to: alert
        send_resolved: true
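
As a rough aid for reading the timing settings above, the following Python sketch estimates worst-case notification delays. It is a simplified model, not a description of Prometheus or Alertmanager internals: it assumes a state change is seen at the next scrape, acted on at the next rule evaluation, that `for` is only checked at evaluation time, and that a resolved notification for an already-notified group goes out at the next `group_interval` flush. Probe timeouts, network latency and mail delivery are ignored. The names `Timing`, `worst_case_fire_delay` and `worst_case_resolve_delay` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Timing:
    """Settings from the configs in this message, in seconds."""
    scrape_interval: float = 60.0
    evaluation_interval: float = 60.0
    for_duration: float = 5.0
    group_wait: float = 45.0
    group_interval: float = 300.0


def worst_case_fire_delay(t: Timing) -> float:
    """Upper bound from 'target goes down' to the first notification (simplified)."""
    # Extra evaluations needed before a pending alert may fire.
    pending_evals = math.ceil(t.for_duration / t.evaluation_interval) if t.for_duration > 0 else 0
    return (t.scrape_interval                      # wait for the next scrape
            + t.evaluation_interval                # wait for the next rule evaluation
            + pending_evals * t.evaluation_interval
            + t.group_wait)                        # Alertmanager waits before first send


def worst_case_resolve_delay(t: Timing) -> float:
    """Upper bound from 'target is back' to the resolved notification (simplified)."""
    return t.scrape_interval + t.evaluation_interval + t.group_interval


if __name__ == "__main__":
    t = Timing()
    print("fire:", worst_case_fire_delay(t), "s")
    print("resolve:", worst_case_resolve_delay(t), "s")
```

Under this model the settings above allow up to 225 seconds before the first alert mail and up to 420 seconds (7 minutes) before the resolved mail, so a recovery that seems to take several minutes may come from these intervals rather than from the probe itself.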

On Wednesday, April 20, 2022 at 8:29:10 PM UTC+8 Brian Candler wrote:

> blackbox_exporter monitoring TCP ports (e.g. for SSH) and ICMP (ping) 
> works fine.
>
> "but black box exporter detect the recover behavior after about 5mins"
>
> Black box exporter only performs a single test when you scrape it.  It 
> does not by itself do any recovery detection.  The problem is therefore 
> most likely with your prometheus scrape config or your alertmanager config.
>
> If you're having a problem, you'll need to be more specific:
> * show your blackbox_exporter config, your prometheus scrape config which 
> scrapes it, your alerting rules, and your alertmanager config (if using 
> alertmanager)
> * describe more clearly the behaviour you're seeing, and what you expected 
> to see.  (For example, are you waiting for a "recovery" E-mail from 
> alertmanager?)
>
> "And after the IP table is recovered, the alert for Ping can be cleared 
> after about 20mins, but SSH is still there."
>
> Either SSH is working and reachable, or it is not.  You can check the 
> results of blackbox_exporter tests by hand using curl, and also get 
> additional debugging information, like this:
>
> curl -g 'http://127.0.0.1:9115/probe?module=xxx&target=yyyy&debug=true'
>
> Here is an example:
>
> # curl -g 'http://localhost:9115/probe?module=icmp&target=1.2.3.4&debug=true'
> Logs for the probe:
> ts=2022-04-20T12:25:11.587855449Z caller=main.go:320 module=icmp 
> target=1.2.3.4 level=info msg="Beginning probe" probe=icmp timeout_seconds=3
> ts=2022-04-20T12:25:11.588014456Z caller=icmp.go:91 module=icmp 
> target=1.2.3.4 level=info msg="Resolving target address" ip_protocol=ip6
> ts=2022-04-20T12:25:11.588065658Z caller=icmp.go:91 module=icmp 
> target=1.2.3.4 level=info msg="Resolving target address" ip_protocol=ip4
> ts=2022-04-20T12:25:11.588098688Z caller=icmp.go:91 module=icmp 
> target=1.2.3.4 level=info msg="Resolved target address" ip=1.2.3.4
> ts=2022-04-20T12:25:11.588133368Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=info msg="Creating socket"
> ts=2022-04-20T12:25:11.588188673Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=debug msg="Unable to do unprivileged listen on socket, 
> will attempt privileged" err="socket: permission denied"
> ts=2022-04-20T12:25:11.58829848Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=info msg="Creating ICMP packet" seq=24581 id=190
> ts=2022-04-20T12:25:11.588348917Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=info msg="Writing out packet"
> ts=2022-04-20T12:25:11.588470176Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=info msg="Waiting for reply packets"
> ts=2022-04-20T12:25:14.588761946Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=debug msg="Cannot get TTL from the received packet. 
> 'probe_icmp_reply_hop_limit' will be missing."
> ts=2022-04-20T12:25:14.588979317Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=warn msg="Timeout reading from socket" err="read ip 
> 0.0.0.0: raw-read ip4 0.0.0.0: i/o timeout"
> ts=2022-04-20T12:25:14.589247538Z caller=main.go:320 module=icmp 
> target=1.2.3.4 level=error msg="Probe failed" duration_seconds=3.001307309
>
>
>
> Metrics that would have been returned:
> # HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns 
> lookup in seconds
> # TYPE probe_dns_lookup_time_seconds gauge
> probe_dns_lookup_time_seconds 0.000116077
> # HELP probe_duration_seconds Returns how long the probe took to complete 
> in seconds
> # TYPE probe_duration_seconds gauge
> probe_duration_seconds 3.001307309
> # HELP probe_icmp_duration_seconds Duration of icmp request by phase
> # TYPE probe_icmp_duration_seconds gauge
> probe_icmp_duration_seconds{phase="resolve"} 0.000116077
> probe_icmp_duration_seconds{phase="rtt"} 0
> probe_icmp_duration_seconds{phase="setup"} 0.000212886
> # HELP probe_ip_addr_hash Specifies the hash of IP address. It's useful to 
> detect if the IP address changes.
> # TYPE probe_ip_addr_hash gauge
> probe_ip_addr_hash 3.268949123e+09
> # HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
> # TYPE probe_ip_protocol gauge
> probe_ip_protocol 4
> # HELP probe_success Displays whether or not the probe was a success
> # TYPE probe_success gauge
> probe_success 0
>
>
>
> Module configuration:
> prober: icmp
> timeout: 3s
> http:
>     ip_protocol_fallback: true
>     follow_redirects: true
> tcp:
>     ip_protocol_fallback: true
> icmp:
>     ip_protocol_fallback: true
> dns:
>     ip_protocol_fallback: true
>
>
> Look at "probe_success" for the overall result.
>
> You can also use the PromQL browser in the Prometheus web interface: enter 
> "probe_success" as the query and look at the graph tab. You'll see the 
> history of your blackbox exporter probes.
>
> On Wednesday, 20 April 2022 at 12:37:17 UTC+1 [email protected] wrote:
>
>> Hi guys,
>>
>> We are using black box exporter to monitor ssh and ping.
>>
>> For ssh, (we monitor the port 22) if we stop sshd service, actually the 
>> service will be auto-recovered, but black box exporter detect the recover 
>> behavior after about 5mins.
>>
>> For ping, we use icmp module to monitor system ping, we deleted the IP 
>> tables, then Prometheus triggered 2 alerts, one is SSH is failed, the other 
>> is Ping is failed. And after the IP table is recovered, the alert for Ping 
>> can be cleared after about 20mins, but SSH is still there.
>>
>> So is it a good approach to use blackbox exporter to monitor SSH and PING?
>>
>
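The manual check suggested in the quoted reply (request `/probe` with curl and read `probe_success`) can also be scripted. The sketch below uses only the Python standard library. The exporter address, module names and targets are whatever your setup uses, and the helper names `build_probe_url`, `parse_probe_success` and `probe` are hypothetical.

```python
import urllib.parse
import urllib.request


def build_probe_url(exporter, module, target, debug=False):
    """Build a blackbox exporter /probe URL like the curl example in the thread."""
    query = {"module": module, "target": target}
    if debug:
        query["debug"] = "true"
    return f"http://{exporter}/probe?{urllib.parse.urlencode(query)}"


def parse_probe_success(body):
    """Return True/False from the probe_success sample, or None if it is absent."""
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "probe_success":
            return float(parts[1]) == 1.0
    return None


def probe(exporter, module, target, timeout=10.0):
    """Run one probe through the exporter and report whether it succeeded."""
    url = build_probe_url(exporter, module, target)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_probe_success(resp.read().decode("utf-8"))
```

For example, `probe("localhost:9115", "icmp", "1.2.3.4")` would run the same ICMP check as the curl command in the quoted reply, assuming an exporter is listening on that address. Running it right after a service recovers shows whether the probe itself already succeeds while an alert is still open.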

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/35d018ad-19d6-45f4-871c-0c82792d33c2n%40googlegroups.com.
