Hi Aaron,

Please see below.

On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
Hi Emanuele,

Thank you again for the detailed responses.

From the interfaces page, I see these stats:
Total Traffic   91.6 GB [103,062,265 Pkts]      Dropped Packets         0 Pkts

I don't see any dropped packets on the NIC either:
ethtool -S enp2s0
NIC statistics:
     tx_packets: 0
     rx_packets: 106581943
     tx_errors: 0
     rx_errors: 0
     rx_missed: 0
     align_errors: 0
     tx_single_collisions: 0
     tx_multi_collisions: 0
     unicast: 105432876
     broadcast: 350738
     multicast: 1149060
     tx_aborted: 0
     tx_underrun: 0

As of right now, 2 of the hosts we are discussing are still in alert, at the original Date/Time of 07:25:01, and Duration is now "3 Days, 08:06:59".

Given that my replies vs requests ratio is still configured at 50%, this means that, at every 5 minute interval for the last 3 Days, 8 hours, said host is receiving < 50% DNS replies, correct?  I find this difficult to believe, and cannot find ANY missing packets in my pcap file.

I have captured a 30 minute pcap file with this command:
tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53

This file contains DNS traffic to/from edgemax only.
I can count responses like this:
tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
349
And queries like this:
tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
349

In other words, no missing DNS responses in the 30 minutes spanning 13:00:02 to 13:29:51.
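Since the alert is evaluated per 5 minute interval and per direction (requests received by the host vs. replies sent by it), a per-window count may be more telling than a single 30 minute total. The following is a minimal sketch only, not ntopng's implementation. It assumes the capture has already been exported as (epoch_seconds, src_ip, dst_ip, is_response) tuples, for example from tshark fields such as frame.time_epoch, ip.src, ip.dst and dns.flags.response; the function name and record layout are hypothetical.

```python
from collections import defaultdict


def ratio_per_window(records, host, window=300):
    """Count DNS requests received and replies sent by `host` per time window.

    records: iterable of (epoch_seconds, src_ip, dst_ip, is_response).
    Returns {window_start: (requests, replies, ratio_percent_or_None)}.
    """
    # window_start -> [requests received by host, replies sent by host]
    counts = defaultdict(lambda: [0, 0])
    for ts, src, dst, is_response in records:
        start = int(ts) - int(ts) % window
        if not is_response and dst == host:
            counts[start][0] += 1
        elif is_response and src == host:
            counts[start][1] += 1

    result = {}
    for start in sorted(counts):
        requests, replies = counts[start]
        # The ratio is undefined when the host received no requests.
        ratio = 100.0 * replies / requests if requests else None
        result[start] = (requests, replies, ratio)
    return result
```

Any window whose ratio falls below the configured 50% would be a candidate for comparison with the alert timestamps.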

I would think that the alert should "clear" because the threshold is not exceeded within that 30 minute pcap file.

In any case, at 13:23, I manually clicked on the "Release" button for that alert.  2 minutes later, at 13:25:00, I received this alert: Host edgemax has received 62 DNS requests but sent 0 DNS replies [5 Minutes ratio: 0%]

As stated previously, no missing DNS responses in the 30 minutes spanning 13:00:02 to 13:29:51.  Why does ntopng think 62 replies are missing?

Please report your ntopng.conf. If you look at the active ntopng DNS flows, can you identify unidirectional flows? You can also try to run ntopng on the PCAP file (--original-speed -i file.pcap). If you can reproduce using the PCAP file, please send it to me privately so that I can troubleshoot the problem.


I exported 10 minutes of PCAP from if_stats.lua.  Using the filter "(ip.dst_host == "10.12.17.1" or ip.src_host == "10.12.17.1") and dns" I am not able to find any missing DNS responses in wireshark.  Interestingly, if I specify a BPF Filter ("port 53"), the downloaded PCAP file seems to only have 1 side (i.e. edgemax is only a source, never a dest).  Without a BPF Filter, the download is fine.

This is probably a bug, please open an issue at https://github.com/ntop/ntopng .

Regards,

Emanuele





On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda <fara...@ntop.org <mailto:fara...@ntop.org>> wrote:

    Hi Aaron,

    Please see below:

    On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
    Thank you for your response.  In the screenshot below, can you
    please explain the significance of the "Date/Time" and the
    "Duration" columns?  What do they mean in this context?

    Date/Time: the time when the alert was triggered. Ntopng performs
    periodic checks in order to trigger alerts. In this particular
    case, the check on the requests/reply ratio is performed every 5
    minutes, so the problem started between 07:20 and 07:25.

    Duration: the total time in which the problem was active. Again,
    the check is performed every 5 minutes for this alert so 5 minutes
    is the granularity.


    Do I understand correctly that all 3 hosts triggered the alert at
    07:25:01 (OR 07:30:01) this morning?  And that all three alerts
    have been active for the past 07:28:53 hours?  Does this mean that
    no additional DNS Reply/Request issues have been detected?
    As explained above, the problem started between 07:20 and 07:25 .
    For 07:28:53 hours the problem was active on all the three hosts
    (the requests/reply ratio threshold was exceeded for 07:28:53 hours).

    I notice in "Past Alerts" tab, that there are many Reply/Request
    Alerts for the same host with very short durations (screen shot
    #2).  When/how does an alert move from the "Engaged" to "Past" tab?
    In this case, the engaged alert becomes a "past" alert when, after
    the check performed every 5 minutes, the requests/reply ratio
    threshold is no longer exceeded. This can happen as soon as the
    next check is performed (5 minutes).

    So in the 2nd screenshot, fire-TV had an alert at 06:20:00 for
    05:00 minutes where 18 requests received 0 replies.  Then another
    alert at 06:50:00 for 05:00 minutes.  Were the 18 replies from
    the first alert ultimately received?  And were they received 5
    minutes after the alert occurred?

    The check is performed on the DNS packet counters. A DNS request
    cannot take 5 minutes to be answered. The fact that the alert was
    closed after 5 or 10 minutes could be related to one of these events:

    - The host went idle

    - The host did not send enough DNS requests

    - The new DNS requests made by the host were successfully answered.
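
    To illustrate the three cases above, here is a minimal sketch of a periodic check on cumulative packet counters. It is not ntopng's actual code: the function name, the counter layout and the min_requests parameter are hypothetical, and the 50% default is only the threshold mentioned earlier in this thread.

```python
def check_ratio(prev, curr, threshold_pct=50.0, min_requests=1):
    """Return True if the replies/requests ratio alert should be engaged.

    prev, curr: (requests, replies) cumulative counters sampled at two
    consecutive checks (e.g. 5 minutes apart).
    min_requests: hypothetical minimum of new requests needed to evaluate.
    """
    requests = curr[0] - prev[0]
    replies = curr[1] - prev[1]
    if requests < min_requests:
        # Host went idle or did not send enough requests: nothing to evaluate.
        return False
    return 100.0 * replies / requests < threshold_pct
```

    With the counters from the 13:25 alert (62 requests, 0 replies since the previous check), check_ratio((0, 0), (62, 0)) evaluates to True. If the counters do not move before the next check, the result is False and the alert would be released.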


    Context here is that 99% of the traffic is Internet traffic. 
    Almost all of the pihole traffic is to forwarders.  BTW, the way
    pihole works (by default) is it replies 0.0.0.0 for blocked
    hosts.  It should respond to every query.

    I tried the live_pcap_download.html
    <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
    lua, but couldn't figure out the bpf_filter:
    curl --cookie "user=admin; password=xxxxx" "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port 53\""
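
    One possible cause of trouble with the command above is the unencoded space and the escaped quotes inside the URL. The sketch below builds the same request URL with the filter percent-encoded. It assumes, without confirmation from this thread, that live_traffic.lua accepts a percent-encoded bpf_filter value; the helper name is hypothetical.

```python
from urllib.parse import quote, urlencode


def live_traffic_url(base, ifid, duration, bpf_filter):
    """Build a live_traffic.lua URL with every query value percent-encoded."""
    # quote_via=quote encodes spaces as %20 instead of '+'.
    query = urlencode(
        {"ifid": ifid, "duration": duration, "bpf_filter": bpf_filter},
        quote_via=quote,
    )
    return f"{base}/lua/live_traffic.lua?{query}"


url = live_traffic_url("http://10.12.17.25:3000", 0, 600, "port 53")
print(url)
```

    The resulting URL can then be passed to curl in double quotes, with the same --cookie option as before.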

    I also tried the download pcap on the if_stats.lua page.   The
    downloaded pcap file seems to only contain incoming data (see
    wireshark)?
    This is consistent with the above alerts, please ensure that
    ntopng is not dropping packets as this would explain this behavior.

    If I just do a tshark on the same interface that ntopng is
    listening on, I see all of the expected DNS query & replies.  I
    am not able to correlate the alerts to any missing packets.

    See response above.

    Regards,

    Emanuele




    On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda <fara...@ntop.org
    <mailto:fara...@ntop.org>> wrote:

        Hi Aaron,

        The alerts that you are reporting basically tell you that
        such hosts receive DNS requests but do not send a reply. In
        order to troubleshoot possible problems you should augment
        such information with the knowledge of your network.

        The first question to answer is: are those hosts expected to
        accept DNS requests? If not, are the requests generated from
        the internet or from the LAN? In the first case, a firewall to
        block such DNS requests may be a good idea. In the latter
        case, some hosts in the LAN may be misconfigured. In the case of
        the pihole hosts, I expect pihole to block some DNS requests
        for advertisement sites, so this could be normal behaviour.
        The following ntopng features may also help you:

        https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html

        https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html

        https://www.ntop.org/guides/ntopng/historical_flows.html

        Regards,
        Emanuele

        On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
        Hello,

        I'm trying to understand how/why I am getting the "Replies /
        Requests Ratio" warnings for DNS.

        I am suspicious of these alerts, and would like to know how/why
        they are being generated.  I am suspicious for the
        following reasons:  1) If it really is as bad as indicated,
        I should notice problems.  2) The "events" occur immediately
        after I clear the alerts, and tend to persist for hours.

        In any case, I cleared the alerts last night, and this is
        what they look like:

        06/05/2020 22:15:00     12:31:28        Warning         Replies / Requests Ratio
        Host edgemax.example.net
        <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
        has received 54 DNS requests but sent 0 DNS replies [5 Minutes ratio: 0%]

        06/05/2020 22:15:00     12:31:28        Warning         Replies / Requests Ratio
        Host pihole.example.net
        <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
        has sent 93 DNS requests but received 3 DNS replies [5 Minutes ratio: 3.2%]

        06/05/2020 22:15:00     12:31:28        Warning         Replies / Requests Ratio
        Host pihole-2.example.net
        <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
        has sent 97 DNS requests but received 1 DNS reply [5 Minutes ratio: 1.0%]
                



        _______________________________________________
        Ntop mailing list
        Ntop@listgateway.unipi.it  <mailto:Ntop@listgateway.unipi.it>
        http://listgateway.unipi.it/mailman/listinfo/ntop