So,

My latest update:

Theo mentioned that the single-CPU kernel doesn't make use of the APIC
interrupt controller, just the legacy ISA one. I booted my single-P4
systems into the bsd.mp kernel, and behold, there's a major difference
in speed!

Now the systems no longer show 95%+ CPU spent in interrupts; they claim
to be 100% idle most of the time, bouncing up to 1-6% sys CPU every few
seconds and holding at 0% interrupt CPU. Traffic went from lossy at 120
megabits to maxed out at 150 megabits, ~70k pps per interface.

At that point traffic very obviously flatlined, but it did not dip or
fail. I saw no visible CPU load, and interrupts were around 7.8k/sec per
active NIC. It looked almost like I had set an altq limit of 150
megabits. Any ideas on how to profile where my packets are spending
most of their time? I'm not so great at this level of troubleshooting,
but I would love to get better at it.
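So far the only things I know to look at are the interrupt and CPU
counters; a minimal sketch of what I mean (the net.inet.ip.ifq counters
are a guess on my part, based on the new tunable - they may not all
exist on 3.7):

  # per-device interrupt counts and rates
  vmstat -i
  # live CPU split (int/sys/idle) and per-device interrupt rates
  systat vmstat
  # per-interface input/output packet and error counters
  netstat -i
  # IP input queue length, limit and drops, where the kernel exposes them
  sysctl net.inet.ip.ifq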

Right now I have two machines in a semi-CARP cluster: a 3.7-stable box
and a -current box from October 15th. 3.7 doesn't have the tunable
Henning mentioned, but 3.8 and -current do. I set
net.inet.ip.ifq.maxlen=250 on the -current box, and traffic went up to
160 megabits and flatlined again.
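For reference, a minimal sketch of what I mean by setting that tunable
(assuming the usual sysctl.conf handling):

  # one-off at runtime
  sysctl net.inet.ip.ifq.maxlen=250
  # and in /etc/sysctl.conf so it survives a reboot
  net.inet.ip.ifq.maxlen=250

(At ~70k pps and ~7.8k interrupts/sec that averages out to roughly 9
packets per interrupt per interface.)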

The next thing I'm trying tomorrow morning is switching the internal
interface to one of the bge NICs. The systems have two bge NICs built
in, plus one PCI-X 133MHz Intel dual-port 1000MT server NIC. Right now
the int/ext interfaces are on the Intel card and the pfsync interface is
on bge1.

-Dormando

On 10/19/05, Henning Brauer <[EMAIL PROTECTED]> wrote:
> eh, this is really only good for benching, because otherwise we stop
> traversing the pf ruleset for very short amounts of time if we are
> about to exhaust the CPU. This allows already-established connections to
> live on, and the OP to log in to the box via console and take
> countermeasures. If you already had an ssh session to the box, it has a
> good chance of surviving, and you can even take countermeasures over that.
>
> what you really want to do for high-speed routers is increase
>   net.inet.ip.ifq.maxlen
> I currently use 250 on some routers, which seems good, but I need to do
> more tests before I can make qualified statements about good values.
>
> This is the maximum length of a queue in the input path, and the default
> of 50 packets is too small for high-speed routers with modern GigE cards,
> which can put about that many into the queue with one single interrupt, or even more.
>
> In the end I think we need a better default based on factors like whether
> IP forwarding is enabled, the summarized link speed, and the RAM in the box,
> or somesuch. Ryan and I discussed that on the ferry earlier this year and
> have some good ideas; now we just need some time to work on it ;(
>
> * Schvberle Daniel <[EMAIL PROTECTED]> [2005-10-18 18:36]:
> > Hi,
> >
> > I was trying to bench routing pps with pf enabled, and Henning gave me
> > some advice which I think might help you too. For my benching purposes
> > it helped break the 200k pps barrier with -current, but there are no
> > guarantees that it'll do you any good or that it won't hurt you.
> >
> > <quote>
> > The high drop rates
> > are an anti-DDoS measure - yeah, that pretty much makes benching
> > impossible...
> > You could change IF_INPUT_ENQUEUE in sys/net/if.h so that it looks like this:
> >
> > #define IF_INPUT_ENQUEUE(ifq, m) {                      \
> >         if (IF_QFULL(ifq)) {                            \
> >                 IF_DROP(ifq);                           \
> >                 m_freem(m);                             \
> >         } else                                          \
> >                 IF_ENQUEUE(ifq, m);                     \
> > }
> >
> > i.e., remove these two lines:
> >                 if (!(ifq)->ifq_congestion)             \
> >                         if_congestion(ifq);             \
> >
> > That means the congestion flag will never be set.
> > Or you can add a return; as the first statement of if_congestion() in if.c.
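> > In code, that second option would look roughly like this (a sketch only,
> > assuming the function signature matches the if_congestion(ifq) call in
> > the macro above; the rest of the function body stays untouched below the
> > early return):
> >
> > void
> > if_congestion(struct ifqueue *ifq)
> > {
> >         return;         /* bail out before the congestion flag is ever set */
> >
> >         /* ... existing body unchanged ... */
> > }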
> >
> > <endquote>
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> > > On Behalf Of dormando
> > > Sent: Monday, October 17, 2005 8:29 PM
> > > To: misc@openbsd.org
> > > Subject: Very high interrupts on a supermicro machine.
> > >
> > > Hey all,
> > >
> > > Attached is a dmesg from one of a pair of Supermicro-based firewalls
> > > I recently bought. I had set them up as a CARP/pfsync redundant pair
> > > of frontend firewalls for our network. However, after they reached
> > > 15,000 interrupts per second (~110 megabits of our site traffic), they
> > > passed 90% CPU usage in interrupts and stopped being useful.
> > >
> > > The machines have two built-in bge NICs. I put an Intel PRO/1000MT
> > > Dual Port Server NIC into a PCI-X 133MHz slot, but it made absolutely
> > > no difference in the interrupt load. The current firewalls in place
> > > are FreeBSD machines running on Supermicro hardware with two em-based
> > > built-in NICs, and they run past 40k interrupts without passing 50%
> > > CPU load on interrupts. The only error I can see in the dmesg was
> > > this:
> > >
> > > pcibios0: no compatible PCI ICU found: ICU vendor 0x8086
> > > product 0x2640
> > > pcibios0: Warning, unable to fix up PCI interrupt routing
> > > pcibios0: PCI bus #5 is the last bus
> > >
> > > ... which, as far as I can tell, is "harmless", but could it be
> > > causing the higher interrupt load?
> > >
> > > Any hints as to where I should look next would be great. I'm about to
> > > install the latest -current snapshot on the machine to see if there's
> > > a recent fix.
> > >
> > > I'm about 95% sure this is the motherboard we're using:
> > > http://www.supermicro.com/products/motherboard/P4/E7221/P8SCT.cfm
> > > I'll check with the order guy and confirm the PO.
> > >
> > > There's a 3.4GHz P4 CPU in it, the two built-in NICs, and a single
> > > PCI-X 133MHz slot which I used for the dual-port server NIC from
> > > Intel. SATA hard drive, for what it's worth. It's running OpenBSD 3.7
> > > as a PF firewall. I've tried changing a bunch of BIOS options,
> > > disabling interrupts, etc. I haven't compiled my own kernel or built
> > > the OS or anything.
> > >
> > > Thanks,
> > > -Dormando
> >
>
> --
> BS Web Services, http://www.bsws.de/
> OpenBSD-based Webhosting, Mail Services, Managed Servers, ...
> Unix is very simple, but it takes a genius to understand the simplicity.
> (Dennis Ritchie)
