On 6/19/15, 5:07 AM, "bind-users-boun...@lists.isc.org on behalf of Matus UHLAR - fantomas" <bind-users-boun...@lists.isc.org on behalf of uh...@fantomas.sk> wrote:
>>On 6/18/15, 7:09 PM, "Stuart Browne" <stuart.bro...@bomboratech.com.au> wrote:
>>>Just wondering. You mention you're using RHEL6; are you also getting
>>>messages in 'dmesg' about connection tracking tables being full? You may
>>>need some 'NOTRACK' rules in your iptables.
>
>On 18.06.15 23:11, Mike Hoskins (michoski) wrote:
>>Just following along, for the record... On our side, iptables is
>>completely disabled. We do that sort of thing upstream on dedicated
>>firewalls. Just now getting time to reply to Cathy...more detail on that
>>there.
>
>aren't those firewalls overloaded?

Originally we found an older set that was, and replaced those... but
currently no mix of metrics, logs, packet traces, etc. suggests that is
the case for the network infra components I have access to. I'm being
completely transparent here because it's something everyone should
carefully consider... but it's certainly not always the culprit.

More than overloading, the larger issue I've worked through (repeatedly)
over the years is the various "protocol fixups", "ALGs" and the like
which try to "secure" you but actually break standard things like EDNS.
After some back and forth with our network team I've reached a state of
nirvana where all of that is disabled and external tests such as OARC's
are happy. (The first sketch at the end of this mail shows the sort of
quick EDNS spot-check I mean.)

I suppose the only way to avoid any "intermediate" firewalls entirely
would be to place everything you run on a LAN segment hanging directly
off your router/Internet drop and rely on host-based firewalls. I've
used iptables, pf, etc. a lot over the years, but I've always considered
host-based firewalls an add-on (layers of security) rather than a
replacement for other types of filtering... and even if I placed the
caches in such a segment, I'd still have clients talking through various
firewalls (quite a few of them), so it's not easy to avoid in any large
org -- particularly one with various business units acquired and bolted
on over time.

The original post asked whether this was some sort of limit on BIND's
capability... almost certainly not, and the way to validate that is lab
testing. I've done that using resperf and Nominum's query file. It would
be great to have two query files, one with known-responsive and one with
known-aberrant zones (a sketch of how such files might be generated is
appended below). That would be difficult to maintain, of course... but
from what I've seen with the default query file (a mix of good and bad,
as far as I verified), you can push BIND much further than the qps
reported earlier in this thread or seen in our production environments.

In the real world, versus the lab, there are obviously a lot more
variables. Some of these we can eliminate (like the overloaded firewall
or broken fixups), others we can tune (our own named.conf), but some we
must live with... I'm just trying to gain more confidence that what we
observe really is the last case. :-) I'm most likely being too OCD here,
because after all the tuning we've got SERVFAILs down to a fraction of a
percent over any given time interval. I've been distracted with other
things recently, but I need to dig into the logs and see whether those
are really just unresponsive or broken upstream servers (the last sketch
below is the kind of tally I have in mind).
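
Since EDNS handling came up: here's the sort of minimal spot-check I had
in mind, assuming dnspython is available. The resolver address and query
name are placeholders, not ours. It just sends an EDNS0 query advertising
a 4096-byte UDP buffer and reports whether the answer still carries an
OPT record or came back truncated. It's a quick sanity check from one
vantage point, not a substitute for the OARC tests.

#!/usr/bin/env python
# EDNS0 spot-check: send a query advertising a 4096-byte UDP buffer and
# report whether the response still carries EDNS and whether it was
# truncated.  Requires dnspython; RESOLVER and QNAME are placeholders.
import dns.exception
import dns.flags
import dns.message
import dns.query
import dns.rcode

RESOLVER = "192.0.2.1"     # replace with the cache (or forwarder) under test
QNAME = "example.com"      # any name you expect to resolve

query = dns.message.make_query(QNAME, "A", use_edns=0, payload=4096)
try:
    response = dns.query.udp(query, RESOLVER, timeout=3)
except dns.exception.Timeout:
    print("timed out -- possibly dropped by a middlebox")
else:
    print("rcode:     ", dns.rcode.to_text(response.rcode()))
    print("EDNS:      ", "present" if response.edns >= 0 else "stripped")
    print("truncated: ", bool(response.flags & dns.flags.TC))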
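
On the two-query-file idea, a rough sketch of how the split could be
generated, writing the usual one-query-per-line "name type" format that
resperf/dnsperf read. It assumes dnspython >= 2.0; candidates.txt and the
output file names are made up for illustration, and this is just the
idea, not something we actually maintain.

#!/usr/bin/env python
# Split a list of candidate names into "responsive" and "aberrant" query
# files in the "name type" format that resperf/dnsperf read.
# Assumes dnspython >= 2.0; candidates.txt and output names are made up.
import dns.exception
import dns.resolver

resolver = dns.resolver.Resolver()
resolver.lifetime = 3      # seconds allowed per lookup

with open("candidates.txt") as src, \
     open("queryfile-good", "w") as good, \
     open("queryfile-bad", "w") as bad:
    for line in src:
        name = line.strip()
        if not name:
            continue
        try:
            resolver.resolve(name, "A")
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            # The zone's servers answered, just not with an A record;
            # still counts as responsive for this purpose.
            good.write("%s A\n" % name)
        except (dns.resolver.NoNameservers, dns.exception.Timeout):
            # Every server failed (e.g. SERVFAIL) or nothing answered.
            bad.write("%s A\n" % name)
        else:
            good.write("%s A\n" % name)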
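
And for digging through the logs, something along these lines is what I
mean by a tally. It assumes the query-errors category is being logged and
that failing lines contain "(SERVFAIL) for <name>/IN/<type>"; the exact
wording varies between BIND versions and logging configs, so the pattern
(and the log path) may need adjusting.

#!/usr/bin/env python
# Tally SERVFAIL log lines by query name to see whether the residual
# failures cluster around a few upstream zones.  Assumes BIND's
# query-errors category is logged and that failing lines look roughly like
#   ... query failed (SERVFAIL) for example.com/IN/A at ...
# Adjust the regex to match your version's format.  Usage:
#   python servfail_tally.py /var/log/named/query-errors.log
import re
import sys
from collections import Counter

PATTERN = re.compile(r"\(SERVFAIL\) for (\S+)/IN/\S+")

counts = Counter()
with open(sys.argv[1]) as log:
    for line in log:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1).lower()] += 1

for name, count in counts.most_common(20):
    print("%8d  %s" % (count, name))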