Hi Aaron,
Please see below.
On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
Hi Emanuele,
Thank you again for the detailed responses.
From the interfaces page, I see these stats:
Total Traffic 91.6 GB [103,062,265 Pkts] Dropped Packets 0 Pkts
I don't see any dropped packets on the NIC either:
ethtool -S enp2s0
NIC statistics:
tx_packets: 0
rx_packets: 106581943
tx_errors: 0
rx_errors: 0
rx_missed: 0
align_errors: 0
tx_single_collisions: 0
tx_multi_collisions: 0
unicast: 105432876
broadcast: 350738
multicast: 1149060
tx_aborted: 0
tx_underrun: 0
As of right now, 2 of the hosts we are discussing are still in alert,
at the original Date/Time of 07:25:01, and Duration is now "3 Days,
08:06:59".
Given that my replies vs requests ratio is still configured at 50%,
this means that, at every 5 minute interval for the last 3 Days, 8
hours, said host is receiving < 50% DNS replies, correct? I find this
difficult to believe, and cannot find ANY missing packets in my pcap file.
I captured a 30-minute pcap file with this command:
tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53
This file contains DNS traffic to/from edgemax only.
I can count responses like this:
tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
349
And queries like this:
tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
349
In other words, no missing DNS responses in the 30 minutes spanning
13:00:02 to 13:29:51.
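Equal totals do not by themselves show that every query was answered, so a stricter check is to pair queries and responses by DNS transaction ID. The following is a minimal sketch, not something from ntopng. It assumes the capture was exported as tab-separated lines of transaction ID and response flag (for example with tshark's `-T fields -e dns.id -e dns.flags.response`; the exact flag text may differ between tshark versions, so both "1" and "True" are accepted). The function name is hypothetical, and IDs reused by different clients are not distinguished.

```python
from collections import Counter


def unanswered_queries(lines):
    """Return DNS transaction IDs that have more queries than responses.

    Each line is "<transaction id>\t<response flag>", where the flag is
    "1" or "True" for a response and anything else for a query.
    """
    balance = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        txid, flag = line.split("\t")
        is_response = flag.strip().lower() in ("1", "true")
        # A query adds one to the balance, a response subtracts one.
        balance[txid] += -1 if is_response else 1
    return sorted(txid for txid, count in balance.items() if count > 0)


# Illustrative input only: one answered query and one unanswered query.
sample = ["0x1a2b\t0", "0x1a2b\t1", "0x3c4d\t0"]
print(unanswered_queries(sample))
```

An empty result means every query ID in the export had at least as many responses as queries.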
I would think that the alert should "clear" because the threshold is
not exceeded within that 30 minute pcap file.
In any case, at 13:23, I manually clicked the "Release" button for
that alert. 2 minutes later, at 13:25:00, I received this alert:
Host edgemax has received 62 DNS requests but sent 0 DNS replies [5
Minutes ratio: 0%]
As stated previously, no missing DNS responses in the 30 minutes
spanning 13:00:02 to 13:29:51. Why does ntopng think 62 replies are
missing?
Please report your ntopng.conf. If you look at the active ntopng DNS
flows, can you identify unidirectional flows? You can also try to run
ntopng on the PCAP file (--original-speed -i file.pcap). If you can
reproduce using the PCAP file, please send it to me privately so that I
can troubleshoot the problem.
I exported 10 minutes of PCAP from if_stats.lua. Using the filter
"(ip.dst_host == "10.12.17.1" or ip.src_host == "10.12.17.1") and dns"
I am not able to find any missing DNS responses in wireshark.
Interestingly, if I specify a BPF filter ("port 53"), the downloaded
PCAP file seems to have only one side (i.e. edgemax is only a source,
never a dest). Without a BPF filter, the download is fine.
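One way to confirm that a capture is one-sided is to count how often the host appears as source and as destination. This is a small sketch under the assumption that the packets have already been reduced to (source IP, destination IP) pairs, for instance from a tshark field export. The function names and the sample addresses other than 10.12.17.1 are illustrative only.

```python
def direction_counts(packets, host):
    """Count packets sent by `host` and packets sent to `host`.

    `packets` is an iterable of (source_ip, destination_ip) pairs.
    """
    packets = list(packets)  # allow generators to be scanned twice
    sent = sum(1 for src, _dst in packets if src == host)
    received = sum(1 for _src, dst in packets if dst == host)
    return sent, received


def is_one_sided(packets, host):
    """True when the host appears in exactly one direction."""
    sent, received = direction_counts(packets, host)
    return (sent == 0) != (received == 0)


# Illustrative capture in which the host is only ever a source.
capture = [("10.12.17.1", "10.12.17.3"), ("10.12.17.1", "10.12.17.4")]
print(direction_counts(capture, "10.12.17.1"))
print(is_one_sided(capture, "10.12.17.1"))
```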
This is probably a bug. Please open an issue at
https://github.com/ntop/ntopng.
Regards,
Emanuele
On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda <fara...@ntop.org> wrote:
Hi Aaron,
Please see below:
On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
Thank you for your response. In the screenshot below, can you
please explain the significance of the "Date/Time" and the
"Duration" columns? What do they mean in this context?
Date/Time: the time when the alert was triggered. Ntopng performs
periodic checks in order to trigger alerts. In this particular
case, the check on the requests/reply ratio is performed every 5
minutes. This means that the problem started between 07:20 and 07:25.
Duration: the total time in which the problem was active. Again,
the check is performed every 5 minutes for this alert so 5 minutes
is the granularity.
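The periodic check described above can be sketched as follows. This is only an illustration of the idea, not ntopng's actual implementation: it assumes the check compares cumulative request and reply counters sampled at the start and end of a 5-minute interval, and the function names and the 50% default are hypothetical choices for this example.

```python
def interval_ratio(prev, curr):
    """Return (requests, replies, ratio_percent) for one check interval.

    `prev` and `curr` are (requests, replies) cumulative counters sampled
    at the start and end of the interval. The ratio is None when no
    requests were seen, since a percentage is undefined in that case.
    """
    requests = curr[0] - prev[0]
    replies = curr[1] - prev[1]
    if requests == 0:
        return requests, replies, None
    return requests, replies, 100.0 * replies / requests


def threshold_violated(prev, curr, threshold_percent=50.0):
    """True when the replies/requests ratio falls below the threshold."""
    _requests, _replies, ratio = interval_ratio(prev, curr)
    return ratio is not None and ratio < threshold_percent


# Illustrative counters: 62 new requests and no new replies in the interval.
print(interval_ratio((1000, 990), (1062, 990)))
print(threshold_violated((1000, 990), (1062, 990)))
```

Because only counter differences are compared, the result depends entirely on which packets were counted during the interval, not on matching individual queries to responses.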
Do I understand correctly that all 3 hosts triggered the alert at
07:25:01 (OR 07:30:01) this morning? And that all three alerts
have been active for the past 07:28:53 hours? Does this mean that
no additional DNS Reply/Request issues have been detected?
As explained above, the problem started between 07:20 and 07:25.
The problem was active on all three hosts for 07:28:53 hours
(the requests/reply ratio threshold was exceeded for 07:28:53 hours).
I notice in "Past Alerts" tab, that there are many Reply/Request
Alerts for the same host with very short durations (screen shot
#2). When/how does an alert move from the "Engaged" to "Past" tab?
In this case, the engaged alert becomes a "past" alert when, after
the check performed every 5 minutes, the requests/reply ratio
threshold is no longer exceeded. This can happen as soon as the
next check is performed (5 minutes).
So in the 2nd screenshot, fire-TV had an alert at 06:20:00 for
05:00 minutes where 18 requests received 0 replies. Then another
alert at 06:50:00 for 05:00 minutes. Were the 18 replies from
the first alert ultimately received? And were they received 5
minutes after the alert occurred?
The check is performed on the DNS packet counters. A DNS request
cannot take 5 minutes to be answered. The fact that the alert was
closed after 5/10 minutes could be related to one of these events:
- The host went idle
- The host did not send enough DNS requests
- The new DNS requests made by the host were successfully answered.
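The engaged-to-past transition and the three closing conditions above can be sketched as a small loop over per-interval results. This is an illustration under stated assumptions, not ntopng's code: the `min_requests` parameter stands in for "not enough DNS requests" and its default is hypothetical, timestamps are plain seconds, and the sample numbers are made up to resemble the pattern in the second screenshot.

```python
def alert_history(checks, threshold_percent=50.0, min_requests=1):
    """Turn per-interval counts into a list of alerts.

    `checks` is an ordered list of (timestamp, requests, replies), one
    entry per periodic check. Returns (start, duration_seconds, engaged)
    tuples; `engaged` is True for an alert still open at the last check.
    """
    alerts = []
    start = None
    last_ts = None
    for ts, requests, replies in checks:
        # An idle host or one with too few requests never violates.
        violated = (
            requests > 0
            and requests >= min_requests
            and 100.0 * replies / requests < threshold_percent
        )
        if violated and start is None:
            start = ts  # alert becomes engaged
        elif not violated and start is not None:
            alerts.append((start, ts - start, False))  # moves to past
            start = None
        last_ts = ts
    if start is not None:
        alerts.append((start, last_ts - start, True))
    return alerts


# Illustrative checks, in seconds since midnight (06:20, 06:25, 06:50, 06:55).
checks = [(22800, 18, 0), (23100, 0, 0), (24600, 18, 0), (24900, 20, 20)]
print(alert_history(checks))
```

With these sample values the function reports two past alerts of 300 seconds each, the first closed because the host went idle and the second because its requests were answered.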
Context here is that 99% of the traffic is Internet traffic.
Almost all of the pihole traffic is to forwarders. BTW, the way
pihole works (by default) is it replies 0.0.0.0 for blocked
hosts. It should respond to every query.
I tried the live_pcap_download.html
<https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
lua, but couldn't figure out the bpf_filter:
curl --cookie "user=admin; password=xxxxx" "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port 53\""
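Part of the difficulty with the command above is quoting a filter that contains a space inside a URL. A possible approach is to percent-encode the query string before passing it to curl. The sketch below reuses the parameter names from the command above (`ifid`, `duration`, `bpf_filter`); the helper name is hypothetical, and whether ntopng expects the filter wrapped in quotes is not confirmed in this thread.

```python
from urllib.parse import quote, urlencode


def live_traffic_url(base, ifid, duration, bpf_filter):
    """Build a live_traffic.lua URL with a percent-encoded BPF filter."""
    query = urlencode(
        {"ifid": ifid, "duration": duration, "bpf_filter": bpf_filter},
        quote_via=quote,  # encode spaces as %20 rather than "+"
    )
    return f"{base}/lua/live_traffic.lua?{query}"


print(live_traffic_url("http://10.12.17.25:3000", 0, 600, "port 53"))
```

The resulting URL contains no spaces or nested quotes, so it can be passed to curl inside a single pair of double quotes.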
I also tried the download pcap on the if_stats.lua page. The
downloaded pcap file seems to only contain incoming data (see
wireshark)?
This is consistent with the above alerts. Please ensure that
ntopng is not dropping packets, as this would explain this behavior.
If I just run tshark on the same interface that ntopng is
listening on, I see all of the expected DNS queries & replies. I
am not able to correlate the alerts to any missing packets.
See response above.
Regards,
Emanuele
On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda <fara...@ntop.org> wrote:
Hi Aaron,
The alerts that you are reporting basically tell you that
such hosts receive DNS requests but do not send a reply. In
order to troubleshoot possible problems you should augment
such information with the knowledge of your network.
The first question to answer is: are those hosts expected to
accept DNS requests? If not, are the requests generated from
the internet or from the LAN? In the first case, a firewall to
block such DNS requests may be a good idea. In the latter
case, some hosts in the LAN may be misconfigured. In the case of
the pihole hosts, I expect pihole to block some DNS requests
for advertisement sites, so this could be normal behaviour.
The following ntopng features may also help you:
https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html
https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html
https://www.ntop.org/guides/ntopng/historical_flows.html
Regards,
Emanuele
On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
Hello,
I'm trying to understand how/why I am getting the "Replies /
Requests Ratio" warnings for DNS.
I am suspicious of these alerts, and would like to know how/why
they are being generated. I am suspicious for the
following reasons: 1) If it really is as bad as indicated,
I should notice problems. 2) The "events" occur immediately
after I clear the alerts, and tend to persist for hours.
In any case, I cleared the alerts last night, and this is
what they look like:
06/05/2020 22:15:00  12:31:28  Warning  Replies / Requests Ratio
Host edgemax.example.net
<http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
has received 54 DNS requests but sent 0 DNS replies [5 Minutes ratio: 0%]

06/05/2020 22:15:00  12:31:28  Warning  Replies / Requests Ratio
Host pihole.example.net
<http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
has sent 93 DNS requests but received 3 DNS replies [5 Minutes ratio: 3.2%]

06/05/2020 22:15:00  12:31:28  Warning  Replies / Requests Ratio
Host pihole-2.example.net
<http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
has sent 97 DNS requests but received 1 DNS reply [5 Minutes ratio: 1.0%]
_______________________________________________
Ntop mailing list
Ntop@listgateway.unipi.it
http://listgateway.unipi.it/mailman/listinfo/ntop