I may be misstating this.

Specifically, under high-burst floods (whether routed or being dropped by pf), we 
would see the system go unresponsive to user-level applications, SSH for 
example.

The system would still function, but it was inaccessible. To clarify, this 
happened with any number of flood or attack types against any ports; the 
behavior remained the same. (These were not the SSH ports being hit.)

We did a lot of sysctl resource tuning, which corrected this for some floods, 
but high packet rates would still trigger the behavior. Other times the system 
would simply drop all traffic, as if a buffer had filled or a connection limit 
had been hit, but neither turned out to be the case.
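For reference, the kind of sysctl tuning I mean looked roughly like this. These are illustrative flood-related tunables, not our actual production values:

```shell
# /etc/sysctl.conf -- illustrative examples only, not our exact settings
kern.ipc.somaxconn=4096             # deeper listen queue backlog
kern.ipc.nmbclusters=262144         # larger mbuf cluster pool
net.inet.ip.intr_queue_maxlen=4096  # longer IP input queue
net.inet.icmp.icmplim=200           # rate-limit ICMP/RST responses
net.inet.tcp.blackhole=2            # silently drop packets to closed TCP ports
net.inet.udp.blackhole=1            # same for closed UDP ports
```

Tuning like this raises the point at which queues and pools exhaust, but as noted above it only moved the threshold; it did not eliminate the livelock under the highest rates.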

The attacks were also well within the bandwidth capacity of the pipe and the 
network gear.

All of these issues stopped once we enabled polling, or at least the threshold 
at which they appeared was raised tremendously.
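For anyone following along, enabling polling on FreeBSD goes roughly like this, assuming a driver with polling support such as em(4). The interface name and value here are just examples:

```shell
# Kernel config: rebuild with polling support and a higher tick rate
#   options DEVICE_POLLING
#   options HZ=1000

# Enable polling per interface (em0 is just an example)
ifconfig em0 polling

# Cap how much of each tick polling may consume (illustrative value)
sysctl kern.polling.user_frac=50
```

The win is that under flood the NIC stops generating an interrupt per burst and the kernel instead drains the ring on each tick, leaving CPU for userland.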

Yet polling has some downsides, not necessarily due to FreeBSD itself but to 
application issues. HAProxy is one example: with polling enabled we saw 
handshakes fail and connections terminated prematurely. Those issues were not 
present with polling disabled.

So that is my reasoning for saying that it was perfect for some things and not 
for others.

In the end, we spent years tinkering and it was always satisfactory but never 
perfect. Finally we grew to the point of replacing the edge with MX80s and 
left BSD to load balancing and the like. That finally resolved all of the 
issues for us.

Granted, we were a DDoS mitigation company handling high PPS and a lot of 
bursting. BSD was beautiful until we ended up needing 10Gbps+ on the edge and 
it was time to go Juniper.

I still say BSD took us from nothing to a $30M company. So despite some things 
requiring tinkering, I think it is still worth the effort to put in the 
testing to find what is best for your gear and environment.

I got off track, but we did find one other thing: ipfw seemed to reduce 
interrupt load (likely because we could not do nearly the scrubbing with it 
that we could with pf). At any rate, less filtering may also fix the issue for 
the OP.
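By scrubbing I mean pf's packet normalization, which is per-packet work that ipfw does not do in the same way. A line like the following (illustrative, not our exact ruleset) is what we would drop when testing with less filtering:

```shell
# /etc/pf.conf -- packet normalization under pf (illustrative)
# Reassembles fragments and clamps MSS on every inbound packet on em0,
# which costs CPU per packet at high PPS.
scrub in on em0 all fragment reassemble max-mss 1440
```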

On your forwarding question: we found that forwarding via a simple pf rule and 
a GRE tunnel to an app server, or by running a tool like HAProxy on the router 
itself, eliminated a large majority of our original stability issues (versus 
pure firewall-based packet forwarding).
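A minimal sketch of the GRE-plus-pf approach; every address and interface name here is hypothetical:

```shell
# Bring up a GRE tunnel from the router to the app server
ifconfig gre0 create
ifconfig gre0 tunnel 203.0.113.1 203.0.113.2   # outer (public) endpoints
ifconfig gre0 inet 10.10.10.1 10.10.10.2       # inner point-to-point addresses

# /etc/pf.conf -- redirect inbound web traffic down the tunnel
#   rdr on em0 proto tcp from any to 203.0.113.1 port 80 -> 10.10.10.2
```

The router then only has to match one rdr rule per connection instead of fully filtering and forwarding every packet, which is where the stability gain seemed to come from for us.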

*I also agree because, as I mentioned in a previous email... (to me) our 
overall PPS seemed to decrease from FreeBSD 7 to 9. No idea why, but polling 
seemed to give us less benefit than it had on 7.4.

Not to say this wasn't due to error on our part or some issue with the 
Juniper switches, but we seemed to run into more performance issues with newer 
releases on Intel 1Gbps NICs. This eventually caused us to move more app 
servers to Linux, because we never could get to the bottom of some of those 
things. We do intend to revisit BSD with our new CDN company to see if we can 
re-standardize on it for high-volume traffic servers.

Best,
Kevin 



On Nov 20, 2012, at 7:19 PM, "Adrian Chadd" <adr...@freebsd.org> wrote:

> Ok, so since people are talking about it, and i've been knee deep in
> at least the older intel gige interrupt moderation - at maximum pps,
> how exactly is the interrupt moderation giving you a livelock
> scenario?
> 
> The biggest benefit I found when doing some forwarding work a few
> years ago was to write a little daemon that actually sat there and
> watched the interrupt rates and packet drop rates per-interface - and
> then tuned the interrupt moderation parameters to suit. So at the
> highest pps rates I wasn't swamped with interrupts.
> 
> I think polling here is hiding some poor choices in driver design and
> network stack design..
> 
> 
> 
> adrian
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
