[prometheus-users] Re: Snmp Exporter - scrape timeout for Huawei Core Router which has more than 3000 interfaces

Brian Candler Sat, 26 Nov 2022 02:05:35 -0800

Most likely this is a bug in your device.  A scrape duration of 564 seconds is 
bad, and getting 13715 SNMP PDUs in a single scrape is bad.  Maybe there is 
some sort of loop in its responses.


Can you do the same walks using snmpbulkwalk?  You can find the OIDs to 
walk in snmp.yml under the "huawei" module.  Then if you can find the 
particular subtree causing the problem, you can disable it.
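
If you save the output of a numeric walk of one subtree to a file, a small 
script can check it for a loop.  The following is only a rough sketch: it 
assumes the output was produced with numeric OIDs (e.g. snmpbulkwalk -On), 
one "OID = value" per line, and the function names are my own invention.  It 
flags any OID that is not strictly greater than the one before it, which is 
what a looping agent would produce.

```python
def parse_oid(line):
    """Return the leading numeric OID of a walk output line as a tuple of ints.

    Returns None for lines that do not start with a numeric OID, such as
    continuation lines of a multi-line value.
    """
    token = line.split(" = ", 1)[0].strip()
    if not token.startswith("."):
        return None
    try:
        return tuple(int(part) for part in token.lstrip(".").split("."))
    except ValueError:
        return None


def find_non_increasing(lines):
    """Yield (line_number, previous_oid, oid) wherever an OID fails to increase.

    OIDs are compared as tuples of integers, so .2.10 sorts after .2.2,
    which a plain string comparison would get wrong.
    """
    previous = None
    for number, line in enumerate(lines, start=1):
        oid = parse_oid(line)
        if oid is None:
            continue
        if previous is not None and oid <= previous:
            yield number, previous, oid
        previous = oid


def format_oid(oid):
    """Turn a tuple of ints back into dotted notation with a leading dot."""
    return "." + ".".join(str(part) for part in oid)
```

For example, with the output saved in a file called walk.txt (a made-up 
name), you could run:

```python
with open("walk.txt") as handle:
    for number, previous, oid in find_non_increasing(handle):
        print(f"line {number}: {format_oid(oid)} does not follow {format_oid(previous)}")
```

Check one subtree at a time, since the output of separate walks joined 
together is not necessarily in increasing order overall.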

If you are not using SNMPv3 with privacy (authPriv), then you can also use 
tcpdump to decode the packets and show you what's going on:
tcpdump -i eth0 -nn -s0 -v host 10.85.12.1 and udp port 161

On Saturday, 26 November 2022 at 08:01:44 UTC [email protected] wrote:

> Hi All,
>
> I've tried different timeout intervals such as 10s, 1m, 5m and 
> max_repetitions values such as 25, 20, 10, but I couldn't solve the 
> problem. What values should I use for the Prometheus scrape interval, the 
> snmp_exporter timeout and max_repetitions?
>
> Debug:
> ts=2022-11-26T07:53:11.587Z caller=scrape.go:1343 level=debug 
> component="scrape manager" scrape_pool=mtx target="
> http://10.86.35.25:30020/snmp?module=huawei&target=10.85.12.1" 
> msg="Scrape failed" err="Get \"
> http://10.86.35.25:30020/snmp?module=huawei&target=10.85.12.1\": context 
> deadline exceeded"
> level=info ts=2022-11-26T07:53:15.747Z caller=collector.go:224 
> module=huawei target=10.85.12.1 msg="Error scraping target" err="scrape 
> canceled (possible timeout) walking target 10.85.12.1"
>
> curl -s -XGET '
> http://10.86.35.25:30020/snmp?target=10.85.12.1&module=huawei'
> snmp_scrape_duration_seconds 564.382868812 
>
> OUTPUT:
> # HELP snmp_scrape_duration_seconds Total SNMP time scrape took (walk and 
> processing).
> # TYPE snmp_scrape_duration_seconds gauge
> snmp_scrape_duration_seconds 564.382868812
> # HELP snmp_scrape_pdus_returned PDUs returned from walk.
> # TYPE snmp_scrape_pdus_returned gauge
> snmp_scrape_pdus_returned 13715
> # HELP snmp_scrape_walk_duration_seconds Time SNMP walk/bulkwalk took.
> # TYPE snmp_scrape_walk_duration_seconds gauge
> snmp_scrape_walk_duration_seconds 564.285584201
> # HELP sysName An administratively-assigned name for this managed node - 
> 1.3.6.1.2.1.1.5
> # TYPE sysName gauge
> sysName{sysName="2886(1)-PTN3221412_MEDYAPARK"} 1
> # HELP sysUpTime The time (in hundredths of a second) since the network 
> management portion of the system was last re-initialized. - 1.3.6.1.2.1.1.3
> # TYPE sysUpTime gauge
> sysUpTime 2.512855182e+09
>
> Thanks
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/c4de6ed7-566e-4eed-89b1-3ca8e3bb17bdn%40googlegroups.com.
