Hi Aaron,

The alerts on HTTP traffic should not be linked to the --ignore-vlans option: adding this option should actually improve the requests vs. replies ratio for HTTP as well, so I expect fewer alerts to be generated than before.

Anyway, please monitor the situation. If you still think there is a problem, please send us a PCAP file with the HTTP traffic privately so that we can inspect it.

Regards,

Emanuele

On 5/13/20 4:55 PM, Aaron Scamehorn wrote:
Interesting.  I do recall seeing vlan tags on some but not all of the flows in ntopng.

Looking at the pcaps now, I do see that traffic from the 2 pi-hole hosts has vlan tags whereas traffic from other hosts has no vlan tag.  So, is the switch that the pi-holes are connected to adding vlan tags?

Anyway, I ran the 30 minute pcap file with the --ignore-vlans config, and agree that it does resolve the issue with the pcap file.

Adding that config to the "prod" ntopng apparently introduces new problems.  I am now getting Replies / Requests Ratio alerts for HTTP on various hosts.  I have not seen these alerts before.  These do not have the prolonged duration that the DNS alerts were having; rather, these are all of the 5 minute duration.

Could this be a boundary issue?  Could a client send the requests in one 5 minute window, with the responses arriving in the next 5 minute window?
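
The boundary question above can be illustrated with a small sketch. It is not ntopng code: the `per_window_counts` helper, the event tuples and the timestamps are all hypothetical, and the thread does not confirm that ntopng counts packets this way. It only shows how a request and its reply could land in different 5 minute windows when they straddle the boundary.

```python
from collections import Counter

WINDOW = 300  # seconds; the 5 minute check interval mentioned in this thread


def per_window_counts(events):
    """Count requests and replies per fixed window.

    events: iterable of (timestamp_seconds, kind), kind is 'request' or 'reply'.
    Returns {window_index: Counter({'request': n, 'reply': m})}.
    """
    counts = {}
    for ts, kind in events:
        window = int(ts // WINDOW)
        counts.setdefault(window, Counter())[kind] += 1
    return counts


# Hypothetical exchange: request just before the boundary, reply just after it.
events = [(299.9, "request"), (300.1, "reply")]
counts = per_window_counts(events)
print(counts[0]["request"], counts[0]["reply"])  # 1 0
print(counts[1]["request"], counts[1]["reply"])  # 0 1
```

Only exchanges that straddle a boundary would be affected, so on its own this would skew a window's ratio by a few packets at most.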

Aaron




On Wed, May 13, 2020 at 8:48 AM Emanuele Faranda <fara...@ntop.org> wrote:

    Aaron,

    Writing to you here to continue the public discussion. The problem
    is that the DNS requests have no VLAN tag whereas the DNS replies
    have the VLAN tag 1, so ntopng splits each DNS flow into two
    unidirectional flows. To make ntopng ignore the VLAN tag you can
    use the --ignore-vlans flag. This should fix your problem.
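
To illustrate the explanation above, here is a minimal sketch of how a VLAN tag in the flow key can split one DNS exchange into two one-way flows. It is not ntopng's implementation: the `Packet` tuple, `flow_key`, `count_flows` and the client port 40000 are assumptions made for the example, and VLAN 0 is used here to mean "no tag".

```python
from collections import namedtuple

# Hypothetical packet summary; vlan=0 stands for "untagged" in this sketch.
Packet = namedtuple("Packet", "src dst sport dport proto vlan")


def flow_key(pkt, ignore_vlan=False):
    """Direction-independent flow key; optionally drops the VLAN tag."""
    endpoint_a = (pkt.src, pkt.sport)
    endpoint_b = (pkt.dst, pkt.dport)
    low, high = sorted([endpoint_a, endpoint_b])
    vlan = 0 if ignore_vlan else pkt.vlan
    return (low, high, pkt.proto, vlan)


def count_flows(packets, ignore_vlan=False):
    """Number of distinct flows the packets are grouped into."""
    return len({flow_key(p, ignore_vlan) for p in packets})


# Untagged request, reply tagged with VLAN 1 (the situation described above).
request = Packet("10.12.17.3", "10.12.17.1", 40000, 53, "udp", 0)
reply = Packet("10.12.17.1", "10.12.17.3", 53, 40000, "udp", 1)

print(count_flows([request, reply]))                    # 2
print(count_flows([request, reply], ignore_vlan=True))  # 1
```

With the tag in the key, the request flow never sees a reply and the reply flow never sees a request, which matches the 0% ratios in the alerts.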

    Regards,

    Emanuele

    On 5/13/20 3:06 PM, Emanuele Faranda wrote:

    Hi Aaron,

    Please contact us privately at fara...@ntop.org and
    maina...@ntop.org. Please ensure that the PCAP files only
    contain DNS traffic.

    Regards,

    Emanuele

    On 5/12/20 5:13 PM, Aaron Scamehorn wrote:
    Emanuele,

    Here is ntopng.conf
    -G=/var/run/ntopng.pid
    -i=enp2s0
    -m=10.12.17.0/24
    -S=local

    I do see unidirectional flows in flows_stats.lua for DNS. 
    Incidentally, I do also see alerts w/ non-zero replies (though
    most alerts are 0):
    Host pihole has sent 211 DNS requests but received 7 DNS replies

    I tried 2 different 30 minute PCAP files.  In both cases, right
    at the 10 minute mark, I got alerts.  How can I get these PCAP
    files to you?

    Thanks,
    Aaron



    On Tue, May 12, 2020 at 4:13 AM Emanuele Faranda
    <fara...@ntop.org> wrote:

        Hi Aaron,

        Please see below.

        On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
        Hi Emanuele,

        Thank you again for the detailed responses.

        From the interfaces page, I see these stats:
        Total Traffic:   91.6 GB [103,062,265 Pkts]
        Dropped Packets: 0 Pkts

        I don't see any dropped packets on the NIC either:
        ethtool -S enp2s0
        NIC statistics:
             tx_packets: 0
             rx_packets: 106581943
             tx_errors: 0
             rx_errors: 0
             rx_missed: 0
             align_errors: 0
             tx_single_collisions: 0
             tx_multi_collisions: 0
             unicast: 105432876
             broadcast: 350738
             multicast: 1149060
             tx_aborted: 0
             tx_underrun: 0

        As of right now, 2 of the hosts we are discussing are still
        in alert, at the original Date/Time of 07:25:01, and
        Duration is now "3 Days, 08:06:59".

        Given that my replies vs requests ratio is still configured
        at 50%, this means that, at every 5 minute interval for the
        last 3 Days, 8 hours, said host is receiving < 50% DNS
        replies, correct?  I find this difficult to believe, and
        cannot find ANY missing packets in my pcap file.

        I have captured a 30 minute pcap file with this command:
        tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53

        This file contains DNS traffic to/from edgemax only.
        I can count responses like this:
        tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
        349
        And queries like this:
        tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
        349

        In other words, no missing DNS responses in the 30 minutes
        spanning 13:00:02 to 13:29:51.

        I would think that the alert should "clear" because the
        threshold is not exceeded within that 30 minute pcap file.

        In any case, at 13:23, I manually click on the "Release"
        button for that alert.  2 minutes later, at 13:25:00, I
        receive this alert:
        Host edgemax has received 62 DNS requests but sent 0 DNS
        replies [5 Minutes ratio: 0%]

        As stated previously, no missing DNS responses in the 30
        minutes spanning 13:00:02 to 13:29:51.  Why does ntopng
        think 62 replies are missing?
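
For reference, the arithmetic behind the figures in these alerts can be sketched as follows. This is an illustration only: `replies_ratio` and `below_threshold` are hypothetical helpers, not ntopng functions, and the 50% default is the threshold Aaron reports having configured.

```python
def replies_ratio(requests, replies):
    """Replies as a percentage of requests; None when there were no requests."""
    if requests == 0:
        return None
    return 100.0 * replies / requests


def below_threshold(requests, replies, threshold_pct=50.0):
    """True when the ratio is defined and falls below the threshold."""
    ratio = replies_ratio(requests, replies)
    return ratio is not None and ratio < threshold_pct


# Counts quoted in this thread.
print(round(replies_ratio(349, 349), 1))  # 100.0 (tshark counts from the pcap)
print(round(replies_ratio(62, 0), 1))     # 0.0   (the edgemax alert)
print(round(replies_ratio(93, 3), 1))     # 3.2   (the pihole alert)
print(below_threshold(349, 349))          # False
print(below_threshold(62, 0))             # True
```

The packet capture and the alert disagree on the reply count itself (349 versus 0), so the discrepancy is in what gets counted, not in the ratio calculation.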

        Please report your ntopng.conf. If you look at the active
        ntopng DNS flows, can you identify unidirectional flows? You
        can also try to run ntopng on the PCAP file
        (--original-speed -i file.pcap). If you can reproduce using
        the PCAP file, please send it to me privately so that I can
        troubleshoot the problem.


        I exported 10 minutes of PCAP from if_stats.lua.  Using the
        filter
        (ip.dst_host == "10.12.17.1" or ip.src_host == "10.12.17.1") and dns
        I am not able to find any missing DNS responses in
        wireshark.  Interestingly, if I specify a BPF Filter
        ("port 53"), the downloaded PCAP file seems to only have 1
        side (i.e. edgemax is only a source, never a dest).
        Without a BPF Filter, the download is fine.

        This is probably a bug, please open an issue at
        https://github.com/ntop/ntopng .

        Regards,

        Emanuele





        On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda
        <fara...@ntop.org> wrote:

            Hi Aaron,

            Please see below:

            On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
            Thank you for your response.  In the screenshot below,
            can you please explain the significance of the
            "Date/Time" and the "Duration" columns?  What do they
            mean in this context?

            Date/Time: the time when the alert was triggered.
            Ntopng performs periodic checks in order to trigger
            alerts. In this particular case, the check on the
            requests/replies ratio is performed every 5 minutes, so
            the problem started between 07:20 and 07:25.

            Duration: the total time for which the problem was
            active. Again, the check is performed every 5 minutes
            for this alert, so 5 minutes is the granularity.


            Do I understand correctly that all 3 hosts triggered
            the alert at 07:25:01 (OR 07:30:01) this morning?  And
            that all three alerts have been active for the past
            07:28:53 hours?  Does this mean that no additional DNS
            Reply/Request issues have been detected?

            As explained above, the problem started between 07:20
            and 07:25. The problem has been active on all three
            hosts for 07:28:53 hours (the requests/replies ratio
            threshold has been exceeded for that whole time).

            I notice in the "Past Alerts" tab that there are many
            Reply/Request Alerts for the same host with very short
            durations (screen shot #2). When/how does an alert
            move from the "Engaged" to the "Past" tab?

            In this case, the engaged alert becomes a "past" alert
            when, at one of the checks performed every 5 minutes,
            the requests/replies ratio threshold is no longer
            exceeded. This can happen as soon as the next check is
            performed (5 minutes).
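
The engaged/past behaviour described above can be sketched as a small state machine. This is a simplified model, not ntopng's code: the `RatioAlert` class and its fields are hypothetical, and times are plain seconds.

```python
CHECK_INTERVAL = 300  # seconds; the 5 minute check period described above


class RatioAlert:
    """Tracks one host's alert across periodic checks (illustrative only)."""

    def __init__(self):
        self.engaged_since = None  # time of the check that triggered the alert
        self.past = []             # list of (start_time, duration_seconds)

    def check(self, now, threshold_exceeded):
        if threshold_exceeded:
            if self.engaged_since is None:
                self.engaged_since = now  # alert becomes engaged
        elif self.engaged_since is not None:
            # Threshold no longer exceeded: the engaged alert becomes "past".
            self.past.append((self.engaged_since, now - self.engaged_since))
            self.engaged_since = None

    def duration(self, now):
        """Duration of the engaged alert so far, or None if not engaged."""
        if self.engaged_since is None:
            return None
        return now - self.engaged_since


alert = RatioAlert()
for step, exceeded in enumerate([True, False, True, True]):
    alert.check(step * CHECK_INTERVAL, exceeded)

print(alert.past)           # [(0, 300)]
print(alert.engaged_since)  # 600
print(alert.duration(900))  # 300
```

In this model an alert that clears at the very next check ends up in the past list with a 5 minute duration, while one that keeps failing the check stays engaged and its duration keeps growing.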

            So in the 2nd screenshot, fire-TV had an alert at
            06:20:00 for 05:00 minutes where 18 requests received
            0 replies.  Then another alert at 06:50:00 for 05:00
            minutes.  Were the 18 replies from the first alert
            ultimately received?  And were they received 5 minutes
            after the alert occurred?

            The check is performed on the DNS packet counters. A
            DNS request cannot take 5 minutes to be answered. The
            fact that the alert was closed after 5/10 minutes could
            be related to one of these events:

            - The host went idle

            - The host did not send enough DNS requests

            - The new DNS requests made by the host were
            successfully answered.


            Context here is that 99% of the traffic is Internet
            traffic.  Almost all of the pihole traffic is to
            forwarders.  BTW, the way pihole works (by default) is
            it replies 0.0.0.0 for blocked hosts.  It should
            respond to every query.

            I tried the live_pcap_download.html
            <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
            lua, but couldn't figure out the bpf_filter:
            curl --cookie "user=admin; password=xxxxx" "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port 53\""

            I also tried the download pcap on the if_stats.lua
            page.   The downloaded pcap file seems to only contain
            incoming data (see wireshark)?

            This is consistent with the above alerts. Please ensure
            that ntopng is not dropping packets, as this would
            explain this behavior.

            If I just do a tshark on the same interface that
            ntopng is listening on, I see all of the expected DNS
            query & replies.  I am not able to correlate the
            alerts to any missing packets.

            See response above.

            Regards,

            Emanuele




            On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda
            <fara...@ntop.org> wrote:

                Hi Aaron,

                The alerts that you are reporting basically tell
                you that these hosts receive DNS requests but do
                not send a reply. In order to troubleshoot
                possible problems you should combine this
                information with the knowledge of your network.

                The first question to answer is: are those hosts
                expected to accept DNS requests? If not, are the
                requests generated from the internet or from the
                LAN? In the first case, a firewall to block such
                DNS requests may be a good idea. In the latter
                case, some hosts in the LAN may be misconfigured.
                In the case of the pihole hosts, I expect pihole to
                block some DNS requests for advertisement sites, so
                this could be normal behaviour. The following
                ntopng features may also help you:

                
                https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html

                https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html

                https://www.ntop.org/guides/ntopng/historical_flows.html

                Regards,
                Emanuele

                On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
                Hello,

                I'm trying to understand how/why I am getting the
                "Replies / Requests Ratio" warnings for DNS.

                I am suspicious of these alerts, and would like to
                know how/why they are being generated.  I am
                suspicious for the following reasons: 1) If it
                really is as bad as indicated, I should notice
                problems.  2) The "events" occur immediately
                after I clear the alerts, and tend to persist for
                hours.

                In any case, I cleared the alerts last night, and
                this is what they look like:

                06/05/2020 22:15:00     12:31:28        Warning         Replies / Requests Ratio
                Host edgemax.example.net
                <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
                has received 54 DNS requests but sent 0 DNS replies [5 Minutes ratio: 0%]

                06/05/2020 22:15:00     12:31:28        Warning         Replies / Requests Ratio
                Host pihole.example.net
                <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
                has sent 93 DNS requests but received 3 DNS replies [5 Minutes ratio: 3.2%]

                06/05/2020 22:15:00     12:31:28        Warning         Replies / Requests Ratio
                Host pihole-2.example.net
                <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
                has sent 97 DNS requests but received 1 DNS reply [5 Minutes ratio: 1.0%]



                _______________________________________________
                Ntop mailing list
                Ntop@listgateway.unipi.it  <mailto:Ntop@listgateway.unipi.it>
                http://listgateway.unipi.it/mailman/listinfo/ntop








