Hi Aaron,

ntopng did not account for the Ethernet frame padding, which caused the ACK packets to be parsed as HTTP replies, so the actual HTTP replies in subsequent packets were ignored. This is fixed in https://github.com/ntop/ntopng/commit/cba3ab2ea6258895fa7270330607f491fa942c47 . A new package will be available in one hour. Please confirm that it works on the live traffic.
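
To illustrate the padding issue for readers of the thread: Ethernet frames shorter than 60 bytes (excluding the FCS) are padded, so a bare TCP ACK (14 + 20 + 20 = 54 bytes) carries 6 trailing bytes that are not TCP payload. The following is a minimal sketch, not the code from the commit above. It assumes an untagged Ethernet/IPv4/TCP frame, and all function names are hypothetical.

```python
import struct

ETH_HEADER_LEN = 14  # untagged Ethernet II header (assumption: no VLAN tag)


def _header_lengths(frame: bytes):
    """Return (IPv4 header length, TCP header length) in bytes."""
    ip = frame[ETH_HEADER_LEN:]
    ihl = (ip[0] & 0x0F) * 4                 # IPv4 header length field
    tcp_data_offset = (ip[ihl + 12] >> 4) * 4  # TCP data offset field
    return ihl, tcp_data_offset


def naive_payload_length(frame: bytes) -> int:
    """Payload length derived from the frame size; counts padding as payload."""
    ihl, tcp_len = _header_lengths(frame)
    return len(frame) - ETH_HEADER_LEN - ihl - tcp_len


def tcp_payload_length(frame: bytes) -> int:
    """Payload length derived from the IPv4 total length; padding is excluded."""
    ihl, tcp_len = _header_lengths(frame)
    (ip_total_len,) = struct.unpack("!H", frame[ETH_HEADER_LEN + 2:ETH_HEADER_LEN + 4])
    return ip_total_len - ihl - tcp_len


def build_padded_ack() -> bytes:
    """Build a synthetic bare ACK, padded to the 60-byte Ethernet minimum."""
    eth = bytes(12) + b"\x08\x00"  # zeroed MACs, EtherType IPv4
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0, bytes(4), bytes(4))
    tcp = struct.pack("!HHIIBBHHH", 40000, 80, 0, 0, 0x50, 0x10, 0, 0, 0)
    frame = eth + ip + tcp
    return frame + bytes(60 - len(frame))  # 6 bytes of padding


frame = build_padded_ack()
print(naive_payload_length(frame), tcp_payload_length(frame))
```

A parser that trusts the frame size sees 6 bytes of "payload" in the ACK, while one that trusts the IPv4 total length sees none.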

Regards,

Emanuele

On 5/22/20 7:11 AM, Aaron Scamehorn wrote:
Hi Emanuele,

It's been about 9 days since adding the ignore_vlan option.  The behavior has definitely changed, however, I continue to get Replies / Requests Ratio alerts.

I get far fewer alerts for DNS.  As mentioned earlier, I am now getting alerts for HTTP.  Over 600 in the last 9 days:

    "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
    "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
    "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
    "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
    "msg": "Host edgemax has received 118 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 42.9%] "
    "msg": "Host edgemax has received 100 HTTP requests but sent 34 HTTP replies [5 Minutes ratio: 33.7%] "
    "msg": "Host edgemax has received 78 HTTP requests but sent 34 HTTP replies [5 Minutes ratio: 43.0%] "

The duration is usually 10 minutes or less.

I've sent you a PCAP file to reproduce.

Aaron


On Fri, May 15, 2020 at 4:26 AM Emanuele Faranda <[email protected]> wrote:

    Hi Aaron,

    The alerts on HTTP traffic should not be linked to the
    --ignore-vlan option. Adding that option should actually
    improve the requests vs. replies ratio for HTTP as well, so I
    expect fewer alerts to be generated than before.

    Anyway, please monitor the situation and if you still think that
    there is such a problem please provide a PCAP file privately with
    the HTTP traffic so that we can inspect it.

    Regards,

    Emanuele

    On 5/13/20 4:55 PM, Aaron Scamehorn wrote:
    Interesting.  I do recall seeing vlan tags on some but not all of
    the flows in ntopng.

    Looking at the pcaps now, I do see that traffic from the 2
    pi-hole hosts has vlan tags whereas other hosts have no vlan
    tag.  So, is the switch that the pi-holes are on adding vlan tags?

    Anyway, I ran the 30 minute pcap file with the --ignore-vlan
    config, and agree that does resolve the issue with the pcap file.

    Adding that config to the "prod" ntopng apparently introduces new
    problems.  I am now getting Replies / Requests Ratio alerts for
    HTTP on various hosts.  I have not seen these alerts before. 
    These do not have the prolonged duration that the DNS alerts were
    having; rather, these are all of the 5 minute duration.

    Could this be a boundary issue?  Could the client send the
    requests in one 5 minute window, with the responses arriving
    in the next 5 minute window?

    Aaron




    On Wed, May 13, 2020 at 8:48 AM Emanuele Faranda
    <[email protected]> wrote:

        Aaron,

        Writing to you here to continue the public discussion. The
        problem is that the DNS requests have no VLAN tag whereas the
        DNS replies have the VLAN tag 1, so ntopng splits each DNS
        flow into two unidirectional flows. If you want ntopng to
        ignore the VLAN tag, you can use the --ignore-vlans flag.
        This should fix your problem.
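
As an illustration of why a VLAN mismatch splits a flow, here is a minimal sketch that assumes the VLAN id is part of the flow key. The `flow_key` function and the port number 40000 are hypothetical and are not taken from ntopng.

```python
def flow_key(src_ip, src_port, dst_ip, dst_port, proto, vlan_id, ignore_vlans=False):
    """Direction-independent flow key; the VLAN id is part of the key
    unless ignore_vlans is set (assumption made for this sketch)."""
    endpoints = tuple(sorted([(src_ip, src_port), (dst_ip, dst_port)]))
    vlan = 0 if ignore_vlans else vlan_id
    return endpoints + (proto, vlan)


# DNS request seen without a VLAN tag, reply seen with VLAN tag 1.
request = ("10.12.17.3", 40000, "10.12.17.1", 53, "udp", 0)
reply = ("10.12.17.1", 53, "10.12.17.3", 40000, "udp", 1)

# Two different keys: the request and the reply land in separate flows.
print(flow_key(*request) == flow_key(*reply))
# Same key once the VLAN id is ignored: one bidirectional flow.
print(flow_key(*request, ignore_vlans=True) == flow_key(*reply, ignore_vlans=True))
```

With the VLAN id in the key, each side of the exchange looks like a flow with requests but no replies (or the reverse), which is what the ratio alert reports.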

        Regards,

        Emanuele

        On 5/13/20 3:06 PM, Emanuele Faranda wrote:

        Hi Aaron,

        Please contact us privately at [email protected] and
        [email protected]. Please ensure that the PCAP files only
        contain DNS traffic.

        Regards,

        Emanuele

        On 5/12/20 5:13 PM, Aaron Scamehorn wrote:
        Emanuele,

        Here is ntopng.conf
        -G=/var/run/ntopng.pid
        -i=enp2s0
        -m=10.12.17.0/24
        -S=local

        I do see unidirectional flows in flows_stats.lua for DNS. 
        Incidentally, I do also see alerts w/ non-zero replies
        (though most alerts are 0):
        Host pihole has sent 211 DNS requests but received 7 DNS
        replies

        I tried 2 different 30 minute PCAP files.  In both cases,
        right at the 10 minute mark, I got alerts.  How can I get
        these PCAP files to you?

        Thanks,
        Aaron



        On Tue, May 12, 2020 at 4:13 AM Emanuele Faranda
        <[email protected]> wrote:

            Hi Aaron,

            Please see below.

            On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
            Hi Emanuele,

            Thank you again for the detailed responses.

            From the interfaces page, I see these stats:
            Total Traffic:    91.6 GB [103,062,265 Pkts]
            Dropped Packets:  0 Pkts

            I don't see any dropped packets on the NIC either:
            ethtool -S enp2s0
            NIC statistics:
                 tx_packets: 0
                 rx_packets: 106581943
                 tx_errors: 0
                 rx_errors: 0
                 rx_missed: 0
                 align_errors: 0
                 tx_single_collisions: 0
                 tx_multi_collisions: 0
                 unicast: 105432876
                 broadcast: 350738
                 multicast: 1149060
                 tx_aborted: 0
                 tx_underrun: 0

            As of right now, 2 of the hosts we are discussing are
            still in alert, at the original Date/Time of 07:25:01,
            and Duration is now "3 Days, 08:06:59".

            Given that my replies vs requests ratio is still
            configured at 50%, this means that, at every 5 minute
            interval for the last 3 Days, 8 hours, said host is
            receiving < 50% DNS replies, correct?  I find this
            difficult to believe, and cannot find ANY missing
            packets in my pcap file.

            I have captured a 30 minute pcap file with this command:
            tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53

            This file contains DNS traffic to/from edgemax only.
            I can count responses like this:
            tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
            349
            And queries like this:
            tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
            349

            In other words, no missing DNS responses in the 30
            minutes spanning 13:00:02 to 13:29:51.
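
The same query/response split can be made without matching on tshark's text output, by looking at the QR bit in the DNS header. This is a minimal sketch that assumes the raw DNS messages have already been extracted from the capture; the function names are hypothetical.

```python
import struct


def is_dns_response(dns_message: bytes) -> bool:
    """True if the QR bit (top bit of the 16-bit flags field) is set."""
    if len(dns_message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    (flags,) = struct.unpack("!H", dns_message[2:4])
    return bool(flags & 0x8000)


def count_queries_and_responses(messages):
    """Return (queries, responses) for a list of raw DNS messages."""
    responses = sum(1 for m in messages if is_dns_response(m))
    return len(messages) - responses, responses


# Synthetic headers only: id, flags, qdcount, ancount, nscount, arcount.
query = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
response = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)
print(count_queries_and_responses([query, response]))
```

Equal counts over the capture window, as in the tshark output above, mean every query seen on the wire had a matching response.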

            I would think that the alert should "clear" because
            the threshold is not exceeded within that 30 minute
            pcap file.

            In any case, at 13:23, I manually clicked on the
            "Release" button for that alert.  2 minutes later, at
            13:25:00, I received this alert:
            Host edgemax has received 62 DNS requests but sent 0
            DNS replies [5 Minutes ratio: 0%]

            As stated previously, no missing DNS responses in the
            30 minutes spanning 13:00:02 to 13:29:51. Why does
            ntopng think 62 replies are missing?

            Please report your ntopng.conf. If you look at the
            active ntopng DNS flows, can you identify
            unidirectional flows? You can also try to run ntopng on
            the PCAP file (--original-speed -i file.pcap). If you
            can reproduce using the PCAP file, please send it to me
            privately so that I can troubleshoot the problem.


            I exported 10 minutes of PCAP from if_stats.lua.
            Using the filter "(ip.dst_host == "10.12.17.1" or
            ip.src_host == "10.12.17.1") and dns" I am not able to
            find any missing DNS responses in wireshark.
            Interestingly, if I specify a BPF Filter ("port 53"),
            the downloaded PCAP file seems to only have 1 side
            (i.e. edgemax is only a source, never a dest).  Without
            a BPF Filter, the download is fine.

            This is probably a bug, please open an issue at
            https://github.com/ntop/ntopng .

            Regards,

            Emanuele





            On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda
            <[email protected]> wrote:

                Hi Aaron,

                Please see below:

                On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
                Thank you for your response.  In the screenshot
                below, can you please explain the significance of
                the "Date/Time" and the "Duration" columns?  What
                do they mean in this context?

                Date/Time: the time when the alert was triggered.
                ntopng performs periodic checks in order to
                trigger alerts. In this particular case, the check
                on the requests/reply ratio is performed every 5
                minutes, so the problem started between 07:20 and
                07:25.

                Duration: the total time in which the problem was
                active. Again, the check is performed every 5
                minutes for this alert so 5 minutes is the
                granularity.
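
A minimal sketch of such a periodic ratio check follows. It is not ntopng's implementation. The 50% threshold is the value configured in this thread; the `+ 1` in the denominator is an assumption inferred only from the percentages quoted in the alert messages in this thread (for example 51 replies to 117 requests reported as 43.2%), not confirmed from the ntopng source.

```python
def replies_ratio_percent(requests: int, replies: int) -> float:
    """Replies as a percentage of requests for one check interval.
    The '+ 1' is an assumption inferred from the figures in this thread."""
    return 100.0 * replies / (requests + 1)


def threshold_exceeded(requests: int, replies: int, threshold_percent: float = 50.0) -> bool:
    """True when the ratio falls below the configured threshold."""
    return replies_ratio_percent(requests, replies) < threshold_percent


# Counters taken from alert messages quoted in this thread.
for requests, replies in [(117, 51), (118, 51), (100, 34), (78, 34)]:
    ratio = replies_ratio_percent(requests, replies)
    print(requests, replies, round(ratio, 1), threshold_exceeded(requests, replies))
```

Because the check only compares counters accumulated over the interval, anything that stops replies from being counted against their requests will trip it even when every request was answered on the wire.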


                Do I understand correctly that all 3 hosts
                triggered the alert at 07:25:01 (OR 07:30:01)
                this morning?  And that all three alerts have been
                active for the past 07:28:53 hours?  Does this
                mean that no additional DNS Reply/Request issues
                have been detected?
                As explained above, the problem started between
                07:20 and 07:25. For 07:28:53 hours the problem
                was active on all the three hosts (the
                requests/reply ratio threshold was exceeded for
                07:28:53 hours).

                I notice in "Past Alerts" tab, that there are
                many Reply/Request Alerts for the same host with
                very short durations (screen shot #2). When/how
                does an alert move from the "Engaged" to "Past" tab?
                In this case, the engaged alert becomes a "past"
                alert when, after the check performed every 5
                minutes, the requests/reply ratio threshold is no
                longer exceeded. This can happen as soon as the
                next check is performed (5 minutes).

                So in the 2nd screenshot, fire-TV had an alert at
                06:20:00 for 05:00 minutes where 18 requests
                received 0 replies.  Then another alert at
                06:50:00 for 05:00 minutes.  Were the 18 replies
                from the first alert ultimately received?  And
                were they received 5 minutes after the alert occurred?

                The check is performed on the DNS packet counters.
                A DNS request cannot take 5 minutes to be answered.
                The fact that the alert was closed after 5/10
                minutes could be related to one of these events:

                - The host went idle

                - The host did not send enough DNS requests

                - The new DNS requests made by the host were
                successfully answered.


                Context here is that 99% of the traffic is
                Internet traffic.  Almost all of the pihole
                traffic is to forwarders.  BTW, the way pihole
                works (by default) is it replies 0.0.0.0 for
                blocked hosts.  It should respond to every query.

                I tried the live_pcap_download.html
                <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
                lua, but couldn't figure out the bpf_filter:
                curl --cookie "user=admin; password=xxxxx" "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port 53\""
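
One way to avoid the shell quoting problem is to percent-encode the filter before it goes into the URL. The following is a minimal sketch using the endpoint and parameter names from the curl command above. Whether ntopng expects the filter with or without surrounding quotes is not stated in this thread; the sketch passes it without quotes as an assumption, and `live_traffic_url` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode


def live_traffic_url(base: str, ifid: int, duration: int, bpf_filter: str) -> str:
    """Build the live_traffic.lua URL with a percent-encoded BPF filter."""
    query = urlencode(
        {"ifid": ifid, "duration": duration, "bpf_filter": bpf_filter},
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{base}/lua/live_traffic.lua?{query}"


url = live_traffic_url("http://10.12.17.25:3000", 0, 600, "port 53")
print(url)
```

The resulting URL contains no spaces or embedded quotes, so it can be passed to curl inside a single pair of double quotes.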

                I also tried the download pcap on the
                if_stats.lua page.   The downloaded pcap file
                seems to only contain incoming data (see wireshark)?
                This is consistent with the above alerts, please
                ensure that ntopng is not dropping packets as this
                would explain this behavior.

                If I just do a tshark on the same interface that
                ntopng is listening on, I see all of the expected
                DNS query & replies.  I am not able to correlate
                the alerts to any missing packets.

                See response above.

                Regards,

                Emanuele




                On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda
                <[email protected]> wrote:

                    Hi Aaron,

                    The alerts that you are reporting basically
                    tell you that such hosts receive DNS requests
                    but do not send a reply. In order to
                    troubleshoot possible problems you should
                    augment such information with the knowledge
                    of your network.

                    The first question to answer is: are those
                    hosts expected to accept DNS requests? If
                    not, are the requests generated from the
                    internet or from the LAN? In the first case a
                    firewall to block such DNS requests may be a
                    good idea. In the latter case some hosts in
                    the LAN may be misconfigured. In case of the
                    pihole hosts, I expect pihole to block some
                    DNS requests for advertisement sites so this
                    could be a normal behaviour. The following
                    ntopng features may also help you:

                    
                    https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html

                    https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html

                    https://www.ntop.org/guides/ntopng/historical_flows.html

                    Regards,
                    Emanuele

                    On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
                    Hello,

                    I'm trying to understand how/why I am
                    getting the "Replies / Requests Ratio"
                    warnings for DNS.

                    I am suspicious of these alerts, and would like
                    to know how/why they are being generated.  I
                    am suspicious for the following reasons:
                    1) If it really is as bad as indicated, I
                    should notice problems.  2) The "events"
                    occur immediately after I clear the alerts,
                    and tend to persist for hours.

                    In any case, I cleared the alerts last
                    night, and this is what they look like:

                    06/05/2020 22:15:00   12:31:28   Warning   Replies / Requests Ratio
                    Host edgemax.example.net has received 54 DNS requests but sent 0 DNS replies [5 Minutes ratio: 0%]

                    06/05/2020 22:15:00   12:31:28   Warning   Replies / Requests Ratio
                    Host pihole.example.net has sent 93 DNS requests but received 3 DNS replies [5 Minutes ratio: 3.2%]

                    06/05/2020 22:15:00   12:31:28   Warning   Replies / Requests Ratio
                    Host pihole-2.example.net has sent 97 DNS requests but received 1 DNS reply [5 Minutes ratio: 1.0%]
                        



                    _______________________________________________
                    Ntop mailing list
                    [email protected]
                    http://listgateway.unipi.it/mailman/listinfo/ntop