On 21.11.2012 03:16, khatfi...@socllc.net wrote:
I may be misstating.

Specifically, under high-burst floods, whether routed or dropped by pf, we would see the system go unresponsive to user-level applications, SSH for example.

The system would still function, but it was inaccessible. To clarify, this happened with any number of floods or attacks against any ports; the behavior remained the same. (These were not SSH ports being hit.)

I'm working on a hybrid interrupt/polling scheme with live-lock prevention
in my svn branch.  It works by disabling interrupts in interrupt context
and then having an ithread loop over the RX DMA queue until it catches up
with the hardware and is done.  Only then are interrupts re-enabled.  On a
busy system it may never go back to interrupts.  To prevent live-lock, the
ithread gives up the CPU after a normal quantum to let other threads and
processes run as well.  After that it gets immediately re-scheduled again,
at a priority high enough not to be starved out by userspace.
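
In rough pseudo-C the per-queue loop looks like this (a simplified sketch
with placeholder helpers, not the actual code in my branch):

/*
 * Sketch only: the rxq_*(), hw_*() and yield_quantum() helpers stand in
 * for the driver- and scheduler-specific bits; the filter/ithread split
 * is the usual FILTER_SCHEDULE_THREAD pattern.
 */

/* Interrupt (filter) context: mask further RX interrupts and hand off. */
static int
nic_intr_filter(void *arg)
{
	struct nic_softc *sc = arg;

	hw_disable_rx_intr(sc);
	return (FILTER_SCHEDULE_THREAD);
}

/* ithread context: loop over the RX DMA ring until we catch the hardware. */
static void
nic_intr_thread(void *arg)
{
	struct nic_softc *sc = arg;
	int budget;

	for (;;) {
		/* Process at most one quantum worth of packets. */
		for (budget = RX_QUANTUM; budget > 0 && rxq_has_work(sc); budget--)
			rxq_process_one(sc);

		if (!rxq_has_work(sc)) {
			/* Caught up with the hardware: back to interrupt mode. */
			hw_enable_rx_intr(sc);
			return;
		}

		/*
		 * More packets already queued: give up the CPU so other
		 * threads and userland can run, then continue.  The thread
		 * keeps a priority high enough that userspace cannot starve
		 * it, but it can no longer live-lock the system either.
		 */
		yield_quantum(sc);
	}
}

The important part is that the yield happens on the quantum boundary, not
per packet, so the drain loop stays cheap while still being preemptible.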

With multiple RX queues and MSI-X interrupts, as many ithreads as there are
available cores can run, and none of them will live-lock.  I'm also looking
at using the CoDel algorithm on totally maxed-out systems to prevent long
FIFO packet-drop chains in the NIC.  Think of it as RED queue management,
but for the input queue.  That way we can use distributed single-packet
loss as a signalling mechanism for the senders to slow down.  For a
misbehaving sender blasting away this obviously doesn't help much, but it
improves the chance of good packets making it through.
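
For reference, the per-packet CoDel decision is roughly the following (a
simplified, textbook-style sketch; the 5 ms/100 ms constants and the state
handling are the published defaults, not necessarily what ends up in the
tree):

#include <math.h>
#include <stdbool.h>
#include <stdint.h>

#define CODEL_TARGET_US		5000ULL		/* acceptable sojourn time: 5 ms */
#define CODEL_INTERVAL_US	100000ULL	/* sliding window: 100 ms */

struct codel_state {
	uint64_t first_above_time;	/* when delay first exceeded target */
	uint64_t drop_next;		/* when to drop the next packet */
	uint32_t count;			/* drops in the current episode */
	bool	 dropping;
};

/*
 * Decide whether the packet dequeued at time 'now', which spent
 * 'sojourn_us' in the queue, should be dropped.  Returns true to drop.
 */
bool
codel_should_drop(struct codel_state *st, uint64_t now, uint64_t sojourn_us)
{
	if (sojourn_us < CODEL_TARGET_US) {
		/* Delay is fine again: leave the dropping state. */
		st->first_above_time = 0;
		st->dropping = false;
		return (false);
	}

	if (st->first_above_time == 0) {
		/* Start the clock; only act if this persists for an interval. */
		st->first_above_time = now + CODEL_INTERVAL_US;
		return (false);
	}
	if (now < st->first_above_time)
		return (false);

	if (!st->dropping) {
		/* Enter dropping state: drop one packet now. */
		st->dropping = true;
		st->count = 1;
		st->drop_next = now +
		    (uint64_t)(CODEL_INTERVAL_US / sqrt(st->count));
		return (true);
	}

	if (now >= st->drop_next) {
		/* Control law: drop again, and the next drop comes sooner. */
		st->count++;
		st->drop_next += (uint64_t)(CODEL_INTERVAL_US / sqrt(st->count));
		return (true);
	}
	return (false);
}

Applied at the input queue, a "drop" simply means freeing the packet early
instead of letting the NIC FIFO overflow in long back-to-back bursts.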

While live-lock prevention is good, you still won't be able to log in
via ssh through an overloaded interface.  Any other interface will keep
working without packet loss, though.

So far I've fully converted fxp(4) to this new scheme because it is one
of the simpler drivers with sufficient documentation.  And 100Mbit is
easy to saturate.

The bge(4) driver is mostly converted but not tested due to lack of
hardware, which should arrive later this week though.

The em(4) family, and with it the closely related igb(4) and ixgbe(4),
is in the works as well.  Again, hardware is on the way for testing.

When this work has stabilized I'm looking for testers to put it through
its paces.  If you're interested and have a suitable test bed, drop me
an email to get notified.

--
Andre

Now, we did a lot of sysctl resource tuning, which corrected this for some floods, but high rates would still trigger the behavior. Other times the system would simply drop all traffic, as if a buffer had filled or a connection limit had been hit, but neither was actually the case.

The attacks were also well within the bandwidth capabilities of the pipe and the network gear.

All of these issues stopped once we added polling, or at least the threshold at which they appeared was raised tremendously.

Yet polling has some downsides, not necessarily due to FreeBSD but to application issues. Haproxy is one example: with polling enabled we saw handshake failures and prematurely terminated connections. Those issues were not present with polling disabled.

So that is my reasoning for saying that it was perfect for some things and not 
for others.

In the end, we spent years tinkering and it was always satisfactory but never perfect. Finally we grew to the point of replacing the edge with MX80s, leaving BSD to handle load balancing and the like. That finally resolved all issues for us.

Admittedly, we were a DDoS mitigation company running high PPS and lots of bursting. BSD was beautiful until we ended up needing 10Gbps+ on the edge, and it was time to go Juniper.

I still say BSD took us from nothing to a $30M company. So despite some things requiring tinkering, I think it is still worth the effort to put in the testing and find what is best for your gear and environment.

I got off-track, but we did find one other thing: ipfw did seem to reduce the interrupt load (likely because we couldn't do nearly as much scrubbing with it as with pf). At any rate, less filtering may also fix the issue for the OP.

Regarding your forwarding: we found that forwarding via a simple pf rule and a GRE tunnel to an app server, or by running a tool like haproxy on the router itself, eliminated a large majority of our original stability issues (versus pure firewall-based packet forwarding).

I also agree because, as I mentioned in a previous email, to me our overall PPS seemed to decrease from FreeBSD 7 to 9. No idea why, but we seemed to get less benefit from polling than we did on 7.4.

Not to say that this wasn't due to error on our part or some issue with the Juniper switches, but we seemed to run into more performance issues with newer releases when it came to Intel 1Gbps NICs. This later caused us to move more app servers to Linux, because we never could get to the bottom of some of those things. We do intend to revisit BSD with our new CDN company to see if we can re-standardize on it for high-volume traffic servers.

Best, Kevin



On Nov 20, 2012, at 7:19 PM, "Adrian Chadd" <adr...@freebsd.org> wrote:

Ok, so since people are talking about it, and I've been knee-deep in at least the older Intel gige interrupt moderation: at maximum pps, how exactly is interrupt moderation giving you a livelock scenario?

The biggest benefit I found when doing some forwarding work a few years ago came from writing a little daemon that sat there and watched the interrupt rates and packet drop rates per interface, and then tuned the interrupt moderation parameters to suit, so that at the highest pps rates I wasn't swamped with interrupts.
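
Something along these lines (a toy sketch; the moderation sysctl name below
is just a placeholder, the real knob depends on the driver and in some
versions is only a loader tunable):

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sysctl.h>
#include <net/if.h>
#include <ifaddrs.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Current input-queue drop counter for 'ifname', or 0 if not found. */
static uint64_t
iqdrops(const char *ifname)
{
	struct ifaddrs *ifap, *ifa;
	uint64_t drops = 0;

	if (getifaddrs(&ifap) != 0)
		return (0);
	for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
		if (ifa->ifa_addr != NULL && ifa->ifa_data != NULL &&
		    ifa->ifa_addr->sa_family == AF_LINK &&
		    strcmp(ifa->ifa_name, ifname) == 0) {
			drops = ((struct if_data *)ifa->ifa_data)->ifi_iqdrops;
			break;
		}
	}
	freeifaddrs(ifap);
	return (drops);
}

int
main(void)
{
	const char *ifname = "em0";
	/* Placeholder OID: substitute the moderation knob your driver exposes. */
	const char *oid = "dev.em.0.itr";
	uint64_t prev, cur;
	int rate = 8000;	/* max interrupts/sec, assumed starting point */

	prev = iqdrops(ifname);
	for (;;) {
		sleep(1);
		cur = iqdrops(ifname);
		if (cur > prev && rate > 1000)
			rate /= 2;	/* drops rising: batch harder, fewer interrupts */
		else if (cur == prev && rate < 16000)
			rate += 500;	/* quiet again: relax moderation for latency */
		prev = cur;
		if (sysctlbyname(oid, NULL, NULL, &rate, sizeof(rate)) != 0)
			perror("sysctlbyname");	/* expected with the placeholder OID */
		printf("%s: iqdrops=%ju rate=%d\n", ifname, (uintmax_t)cur, rate);
	}
}

Nothing clever, but it beat picking one static moderation value for every
load level.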

I think polling here is hiding some poor choices in driver design and network stack design.



adrian

_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
