Hi all,

I'm running into a weird issue that I can't quite get my head around. As part of monitoring devices we use nmap to issue ICMP echo requests to a list of target devices, and obviously the lack of a reply is worthy of generating an event/alert/alarm.

I am currently seeing sporadic events from a variety of target devices, which clear on the next polling cycle (i.e. we miss one response but get the following one sixty seconds later). Each cycle is a new nmap process with a list of IP addresses, and the "missed" responses are just one or two hosts in that list.

Note that this is only affecting one monitoring environment of many, so my initial suspicion was that it was a network issue, hence the comparison between the displayed data and a packet capture.

To try and get a simpler test case I used nping to repeatedly poll one device, and that gave some odd results which sort of correlate with the above behaviour:

- Occasionally nping will simply stop seeing responses after some number (as low as 3) of successful requests (i.e. we just don't get any RCVD packets printed, and packet loss is 100% after the first few).
- More often it will start reporting RCVD packets *far* too late, up to minutes after the response has been seen in tcpdump, eventually declaring that it "missed" the last several responses when it finishes a set of requests.
- Seems to be "less reliable" with extended delay times (10s is usually enough to cause the issue).

It's possible that this is an entirely spurious correlation (unless there are plenty of shared libraries), but it's harder to make a simple, reproducible, nmap command.

As an example (I've trimmed some of the repeated data to make this easier to read):

Running "nping 10.192.1.16 -delay 10s -c30" was fine for the first 25 requests, and then...

    SENT (240.2806s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=25] IP [ttl=64]
    RCVD (240.2821s) ICMP [10.192.1.16 > 10.206.16.30 Echo reply seq=25] IP [ttl=253]
    SENT (250.2886s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=26] IP [ttl=64]
    SENT (260.2992s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=27] IP [ttl=64]
    RCVD (260.2997s) ICMP [10.192.1.16 > 10.206.16.30 Echo reply seq=26] IP [ttl=253]
    SENT (270.3099s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=28] IP [ttl=64]
    SENT (280.3207s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=29] IP [ttl=64]
    RCVD (280.3211s) ICMP [10.192.1.16 > 10.206.16.30 Echo reply seq=27] IP [ttl=253]
    SENT (290.3313s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=30] IP [ttl=64]

Note that the RCVD packets are apparently delayed by longer and longer intervals, despite the packet trace showing ~1ms response times for all 30 requests. It looks likely that the printouts of RCVD are in some way tied to the delay, since they all appear at a 10s interval, just not necessarily the correct interval.

But after the 30th packet (which has an immediate response in the packet capture) the summary data shows:

    Max rtt: 1.321ms | Min rtt: 0.343ms | Avg rtt: 1.089ms
    Raw packets sent: 30 (840B) | Rcvd: 27 (756B) | Lost: 3 (10.00%)
    Nping done: 1 IP address pinged in 290.36 seconds

So despite the delay in printing the RCVD packets, nping correctly *measured* sequences 26 and 27, but then shut down before processing 28/29/30.

The collector system is running Ubuntu 20.04, with nmap 7.80 and nping 0.7.80. I appreciate that this isn't quite the latest, but I haven't seen anything obvious in the changelog.

Trying not to upgrade in place, I decided to try a standalone docker image (instrumentisto/nmap), because that's nice and easy to jump between versions for testing... Whilst I continue to get the same issue on the host or in our regular containers, I cannot reproduce it with the standalone docker image using either version 7.80 or 7.94.
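
To put numbers on the lag in the trimmed output above, here is a minimal Python sketch. It is not part of nmap or nping; the names LINE_RE and printout_lag are made up for illustration, and it assumes the output keeps the line format shown in the excerpt. It pairs each SENT line with the RCVD line carrying the same sequence number and reports how far apart the two printed timestamps are.

```python
import re

# Assumed line format, taken from the excerpt above, e.g.:
#   SENT (250.2886s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=26] IP [ttl=64]
LINE_RE = re.compile(r"^(SENT|RCVD) \((\d+\.\d+)s\) ICMP \[.*? seq=(\d+)\]")


def printout_lag(lines):
    """Return ({seq: seconds between SENT and RCVD printouts}, [seqs with no RCVD])."""
    sent, lag = {}, {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip summary lines and anything else that is not SENT/RCVD
        kind, ts, seq = m.group(1), float(m.group(2)), int(m.group(3))
        if kind == "SENT":
            sent[seq] = ts
        elif seq in sent:
            lag[seq] = round(ts - sent[seq], 4)
    missing = sorted(set(sent) - set(lag))
    return lag, missing


# The nine lines from the excerpt above, unchanged.
SAMPLE = """\
SENT (240.2806s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=25] IP [ttl=64]
RCVD (240.2821s) ICMP [10.192.1.16 > 10.206.16.30 Echo reply seq=25] IP [ttl=253]
SENT (250.2886s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=26] IP [ttl=64]
SENT (260.2992s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=27] IP [ttl=64]
RCVD (260.2997s) ICMP [10.192.1.16 > 10.206.16.30 Echo reply seq=26] IP [ttl=253]
SENT (270.3099s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=28] IP [ttl=64]
SENT (280.3207s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=29] IP [ttl=64]
RCVD (280.3211s) ICMP [10.192.1.16 > 10.206.16.30 Echo reply seq=27] IP [ttl=253]
SENT (290.3313s) ICMP [10.206.16.30 > 10.192.1.16 Echo request seq=30] IP [ttl=64]
"""

lag, missing = printout_lag(SAMPLE.splitlines())
for seq, delta in sorted(lag.items()):
    print(f"seq={seq}: RCVD printed {delta:.4f}s after SENT")
print("no RCVD printed for seq:", missing)
```

On the excerpt this gives a lag of about 0.0015s for seq 25, about 10.01s for seq 26 and about 20.02s for seq 27, with nothing printed for 28, 29 and 30. That matches the "one extra delay interval per reply" pattern and the three packets the summary counts as lost.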

I'm now not sure what to look at next...

-- 
John Robson
Sr. Customer Support Engineer, Zenoss
jrob...@zenoss.com