I have an older AlphaStation 600 5/266 running -current (cvsupped
last week) which is set up as a router between two 100Mb networks.  When
the machine is pushed fairly hard (e.g. running netperf -tUDP_STREAM
-- -m 100 across the router, about 10-20k 100-byte packets/sec), the
alpha falls over almost instantly.  I have not enabled any NAT or
firewall functionality, just IP forwarding.

It generally crashes in MCLGET down in the ethernet driver's receiver
interrupt handler.  The driver doesn't seem to matter -- I've tried
Intel Etherexpress Pro 100Bs and 3Com 3c905C-TX Fast Etherlink XLs.  A
typical stack trace looks like this:

fatal kernel trap:

    trap entry = 0x2 (memory management fault)
    a0         = 0x826417b78f222
    a1         = 0x1
    a2         = 0x0
    pc         = 0xfffffc00004b31bc
    ra         = 0xfffffc00004b315c
    curproc    = 0

ddbprinttrap from 0xfffffc00004b31bc
ddbprinttrap(0x826417b78f222, 0x1, 0x0, 0x2)
panic: trap
panic
Stopped at      Debugger+0x2c:  ldq     ra,0(sp) <0xfffffe0005ab57d0>   <ra=0xfffffc00005042e0,sp=0xfffffe0005ab57d0>
db> tr
Debugger() at Debugger+0x2c
panic() at panic+0xf4
trap() at trap+0x5cc
xl_newbuf() at xl_newbuf+0x15c
(null)() at 0x4
db> c

This maps to pci/if_xl.c:1654.  But the if_xl driver is probably not
at fault, as I can crash just as easily in fxp_add_rfabuf() when using
Intel NICs.

Before trying the 3Com cards, I had been working under the assumption
that it was a problem with the fxp driver.  I instrumented the mbuf
routines somewhat (I hate debugging macros) and it seems the bad
access is due to mclfree getting trashed and replaced by a "random" bad
value (0x826417b78f222 in this panic).
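
To make the failure mode concrete, here is a minimal user-space sketch of a
cluster free list whose head gets sanity-checked before it is dereferenced.
This is not the real MCLGET or mbuf code; the names cl_pool, cl_free, cl_init,
cl_valid and cl_get are hypothetical, and cl_free merely plays the role of
mclfree.  It only illustrates why a trashed head pointer faults on the next
allocation and how instrumentation could catch it first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cluster free list; NOT the real mbuf code. */
union cluster {
    union cluster *next;          /* link, valid only while the cluster is free */
    char           data[2048];
};

#define NCLUSTERS 8

union cluster  cl_pool[NCLUSTERS];
union cluster *cl_free;           /* plays the role of mclfree */

/* Thread every cluster in the pool onto the free list. */
void cl_init(void)
{
    int i;

    cl_free = NULL;
    for (i = NCLUSTERS - 1; i >= 0; i--) {
        cl_pool[i].next = cl_free;
        cl_free = &cl_pool[i];
    }
}

/* Return 1 if p is the address of some cluster in the pool, else 0. */
int cl_valid(const union cluster *p)
{
    uintptr_t a  = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)&cl_pool[0];
    uintptr_t hi = (uintptr_t)&cl_pool[NCLUSTERS];

    return a >= lo && a < hi && (a - lo) % sizeof(union cluster) == 0;
}

/*
 * Pop one cluster.  Returns 0 on success, -1 if the list is empty,
 * -2 if the head (or the link stored in it) is not a pool address.
 * Without the checks, a trashed head would be dereferenced directly.
 */
int cl_get(union cluster **out)
{
    union cluster *p = cl_free;

    assert(out != NULL);
    if (p == NULL)
        return -1;
    if (!cl_valid(p))
        return -2;
    if (p->next != NULL && !cl_valid(p->next))
        return -2;
    cl_free = p->next;
    *out = p;
    return 0;
}
```

In the real kernel the equivalent check would have to run at the same
interrupt priority as the allocation itself, otherwise the check and the
pop could be separated by the very interrupt that corrupts the list.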

This might be a red herring, but I've found that if I run the entire
ip_input path under splnet() (I added splnet() around the call to
ip_input() in ipintr()), things get a hell of a lot more stable.
Rather than crashing in a few seconds, it sometimes takes minutes.
And rather than an illegal access, I tend to run out of kernel stack
space (either a panic("possible stack overflow\n"); in
alpha/alpha/interrupt.c, or I end up in the SRM console after calling
halt from a PC which isn't in the kernel, which smells like an overrun
stack to me).  I'm not sure if this is related, or if it is a separate
problem entirely.
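
For reference, the shape of the splnet() experiment is the usual
raise/call/restore pattern.  The sketch below is a user-space model of that
pattern only, not the kernel change itself: model_ipl, spl_raise, spl_restore,
model_ip_input and model_ipintr are hypothetical names, and MODEL_IPL_NET is a
placeholder value, not the level splnet() actually maps to on alpha.

```c
#include <assert.h>

#define MODEL_IPL_NET 1     /* placeholder level, not taken from the sources */

int model_ipl;              /* simulated current interrupt priority level */
int packets_handled;        /* how many packets the model input path saw */
int ipl_seen_by_input = -1; /* level in effect while "ip_input" ran */

/* Raise the level to at least `level`; never lowers.  Returns the old level. */
int spl_raise(int level)
{
    int old = model_ipl;

    if (level > model_ipl)
        model_ipl = level;
    return old;
}

/* Restore a level previously returned by spl_raise(). */
void spl_restore(int old)
{
    model_ipl = old;
}

/* Stand-in for ip_input(): just records the level it ran at. */
void model_ip_input(void)
{
    ipl_seen_by_input = model_ipl;
    packets_handled++;
}

/* Stand-in for the ipintr() loop with the protection wrapped around each call. */
void model_ipintr(int queued)
{
    assert(queued >= 0);
    while (queued-- > 0) {
        int s = spl_raise(MODEL_IPL_NET);

        model_ip_input();
        spl_restore(s);
    }
}
```

The point of saving and restoring the old level, rather than lowering to a
fixed one, is that the wrapper stays correct even if the caller already runs
at a higher level.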

Since an x86 (PII@300MHz, 440LX motherboard, kernel built from same
sources) is rock solid under the same workload, I suspect there's
something wrong that is alpha-specific, but I'll be damned if I can
figure it out.

My best guess is that it has something to do with the different
interrupt structure on i386 & alpha.  As I understand it, the i386 can
mask off particular interrupt sources, but the alpha simply raises &
lowers the ipl with the following levels available
(from alpha/include/alpha_cpu.h):

#define ALPHA_PSL_IPL_0         0x0000          /* all interrupts enabled */
#define ALPHA_PSL_IPL_SOFT      0x0001          /* software ints disabled */
#define ALPHA_PSL_IPL_IO        0x0004          /* I/O dev ints disabled */
#define ALPHA_PSL_IPL_CLOCK     0x0005          /* clock ints disabled */
#define ALPHA_PSL_IPL_HIGH      0x0006          /* all but mchecks disabled */
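
Reading those comments literally, the levels are strictly nested: raising the
ipl to hold off one class of interrupt also holds off every class below it.
A small sketch of that reading, assuming the header comments tell the whole
story (the enum intr_class, block_level and intr_deliverable names are
hypothetical, made up for this illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from alpha/include/alpha_cpu.h as quoted above. */
#define ALPHA_PSL_IPL_0         0x0000          /* all interrupts enabled */
#define ALPHA_PSL_IPL_SOFT      0x0001          /* software ints disabled */
#define ALPHA_PSL_IPL_IO        0x0004          /* I/O dev ints disabled */
#define ALPHA_PSL_IPL_CLOCK     0x0005          /* clock ints disabled */
#define ALPHA_PSL_IPL_HIGH      0x0006          /* all but mchecks disabled */

/* Hypothetical interrupt classes, one per header comment. */
enum intr_class { INTR_SOFT, INTR_IO, INTR_CLOCK, INTR_MCHECK };

/* Lowest ipl at which a class is held off, per the header comments. */
static int block_level(enum intr_class c)
{
    switch (c) {
    case INTR_SOFT:
        return ALPHA_PSL_IPL_SOFT;
    case INTR_IO:
        return ALPHA_PSL_IPL_IO;
    case INTR_CLOCK:
        return ALPHA_PSL_IPL_CLOCK;
    default:
        /* Machine checks are not held off by any level listed above. */
        return ALPHA_PSL_IPL_HIGH + 1;
    }
}

/* True if an interrupt of class c would be taken at the given ipl. */
bool intr_deliverable(int ipl, enum intr_class c)
{
    assert(ipl >= ALPHA_PSL_IPL_0 && ipl <= ALPHA_PSL_IPL_HIGH);
    return ipl < block_level(c);
}
```

Under this model there is no way to block just one NIC: at
ALPHA_PSL_IPL_IO every I/O device interrupt and every software interrupt is
held off together, while clock interrupts still get through.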

Can anybody hazard a guess as to what's going on?  I've appended dmesg
output & my config file for completeness.

BTW, as long as the load is light, IP forwarding seems to work.  I
can't seem to trigger the crash using two 100Mb tulips in this box.
(They must copy on the input path due to DMA alignment problems, which
slows things down quite a bit given the low memory bandwidth of this
machine.)

Thanks,

Drew
------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer  http://www.cs.duke.edu/~gallatin
Duke University                         Email: [EMAIL PROTECTED]
Department of Computer Science          Phone: (919) 660-6590


Copyright (c) 1992-1999 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 4.0-CURRENT #4: Wed Oct 27 11:35:25 EDT 1999
    [EMAIL PROTECTED]:/usr/project/ari_scratch2/gallatin/src/sys/compile/ALPHA
AlphaStation 500 or 600 (KN20AA)
Digital AlphaStation 600 5/266, 266MHz
8192 byte page size, 1 processor.
CPU: EV5 (21164) major=5 minor=0
OSF PAL rev: 0x1000000020116
real memory  = 131940352 (128848K bytes)
avail memory = 122200064 (119336K bytes)
Preloaded elf kernel "kernel" at 0xfffffc0000674000.
cia0: ALCOR/ALCOR2, pass 2
pcib0: <2117x PCI host bus adapter> on cia0
pci0: <PCI bus> on pcib0
xl0: <3Com 3c905C-TX Fast Etherlink XL> irq 8 at device 7.0 on pci0
xl0: interrupting at CIA irq 8
xl0: Ethernet address: 00:50:da:09:3e:41
miibus0: <MII bus> on xl0
xlphy0: <3c905C 10/100 internal PHY> on miibus0
xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pcib1: <DEC 21050 PCI-PCI bridge> at device 8.0 on pci0
pci1: <PCI bus> on pcib1
de0: <Digital 21040 Ethernet> irq 16 at device 0.0 on pci1
de0: interrupting at CIA irq 16
de0: DEC 21040 [10Mb/s] pass 2.3
de0: address 08:00:2b:e7:e6:d6
isp0: <Qlogic ISP 1020/1040 PCI SCSI Adapter> irq 17 at device 1.0 on pci1
isp0: interrupting at CIA irq 17
isp0: invalid NVRAM header (aa,aa,aa,aa)
isp0: isp_mboxcmd sees mailbox int with 0x0 in mbox0
isp0: isp_mboxcmd sees mailbox int with 0x0 in mbox0
<..>
isp1: <Qlogic ISP 1020/1040 PCI SCSI Adapter> irq 18 at device 2.0 on pci1
isp1: interrupting at CIA irq 18
isp1: isp_mboxcmd sees mailbox int with 0x0 in mbox0
isp1: invalid NVRAM header (55,55,55,55)
isp1: isp_mboxcmd sees mailbox int with 0x0 in mbox0
isp1: isp_mboxcmd sees mailbox int with 0x0 in mbox0
de1: <Digital 21140 Fast Ethernet> irq 12 at device 9.0 on pci0
de1: interrupting at CIA irq 12
de1: DEC DE500-XA 21140 [10-100Mb/s] pass 1.1
de1: address 00:00:f8:00:99:ba
de1: enabling Full Duplex 100baseTX port
isab0: <Intel 82375EB PCI-EISA bridge> at device 10.0 on pci0
isa0: <ISA bus> on isab0
xl1: <3Com 3c905C-TX Fast Etherlink XL> irq 0 at device 11.0 on pci0
xl1: interrupting at CIA irq 0
xl1: Ethernet address: 00:50:da:09:42:41
miibus1: <MII bus> on xl1
xlphy1: <3c905C 10/100 internal PHY> on miibus1
xlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl2: <3Com 3c905C-TX Fast Etherlink XL> irq 4 at device 12.0 on pci0
xl2: interrupting at CIA irq 4
xl2: Ethernet address: 00:50:da:09:3f:e8
miibus2: <MII bus> on xl2
xlphy2: <3c905C 10/100 internal PHY> on miibus2
xlphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
mcclock0: <MC146818A real time clock> at port 0x70-0x71 on isa0
sio0 at port 0x3f8-0x3ff irq 4 on isa0
sio0: type 16550A, console
sio0: interrupting at ISA irq 4
sio1 at port 0x2f8-0x2ff irq 3 flags 0x80 on isa0
sio1: type 16550A
sio1: interrupting at ISA irq 3
fdc0: interrupting at ISA irq 6
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
atkbd0: interrupting at ISA irq 1
struct nfssvc_sock bloated (> 256bytes)
Try reducing NFS_UIDHASHSIZ
struct nfsuid bloated (> 128bytes)
Try unionizing the nu_nickname and nu_flag fields
Timecounter "alpha"  frequency 266671691 Hz
Waiting 3 seconds for SCSI devices to settle
isp0: driver initiated bus reset of bus 0
isp1: driver initiated bus reset of bus 0
de0: autosense failed: cable problem?
Creating DISK da0
Creating DISK da1
Creating DISK cd0
da0 at isp0 bus 0 target 0 lun 0
da0: <SEAGATE ST15150W 0023> Fixed Direct Access SCSI-2 device 
da0: 20.000MB/s transfers (10.000MHz, offset 12, 16bit), Tagged Queueing Enabled
da0: 4095MB (8388315 512 byte sectors: 255H 63S/T 522C)
da1 at isp0 bus 0 target 1 lun 0
da1: <SEAGATE ST32171W 0484> Fixed Direct Access SCSI-2 device 
da1: 20.000MB/s transfers (10.000MHz, offset 12, 16bit), Tagged Queueing Enabled
da1: 2062MB (4223444 512 byte sectors: 255H 63S/T 262C)
cd0 at isp0 bus 0 target 5 lun 0
cd0: <DEC RRD45   (C) DEC 1645> Removable CD-ROM SCSI-2 device 
cd0: 4.032MB/s transfers (4.032MHz, offset 12)
cd0: Attempt to query device size failed: NOT READY, Medium not present

#
machine         alpha
cpu             EV4
cpu             EV5
ident           ALPHA
maxusers        32

# Platforms supported
options         DEC_AXPPCI_33           # UDB, Multia, AXPpci33, Noname
options         DEC_EB164               # EB164, PC164, PC164LX, PC164SX
options         DEC_EB64PLUS            # EB64+, Aspen Alpine, etc
options         DEC_2100_A50            # AlphaStation 200, 250, 255, 400
options         DEC_KN20AA              # AlphaStation 500, 600
options         DEC_ST550               # Personal Workstation 433, 500, 600
options         DEC_ST6600              # xp1000, dp264, ds20, ds10, family
#options                DEC_3000_300            # DEC3000/300* Pelic* family
#options                DEC_3000_500            # DEC3000/[4-9]00 Flamingo/Sandpiper family

options         INET                    #InterNETworking
options         FFS                     #Berkeley Fast Filesystem
options         NFS                     #Network Filesystem
options         MFS                     #Memory Filesystem
options         MFS_ROOT                #Memory Filesystem as rootfs
options         MSDOSFS                 #MSDOS Filesystem
options         CD9660                  #ISO 9660 Filesystem
options         CD9660_ROOT             #CD-ROM usable as root device
options         FFS_ROOT                #FFS usable as root device [keep this!]
options         NFS_ROOT                #NFS usable as root device
options         PROCFS                  #Process filesystem
options         COMPAT_43               #Compatible with BSD 4.3 [KEEP THIS!]
options         SCSI_DELAY=3000 #Be pessimistic about Joe SCSI device
options         UCONSOLE                #Allow users to grab the console
options         SOFTUPDATES

# Standard busses
controller      pci0
controller      isa0

# A single entry for any of these controllers (ncr, ahb, ahc, amd) is
# sufficient for any number of installed devices.
controller      ncr0
controller      isp0
controller      ahc0
#controller     esp0

controller      scbus0

device          da0
device          sa0
device          pass0
device          cd0

#
# ATA and ATAPI devices
# This is work in progress, use at your own risk.
# It currently reuses the majors of wd.c and friends.
# It cannot co-exist with the old system in one kernel.
# You only need one "controller ata0" for it to find all
# PCI devices on modern machines.
controller      ata0
device          atadisk0        # ATA disk drives
device          atapicd0        # ATAPI CDROM drives
device          atapifd0        # ATAPI floppy drives
device          atapist0        # ATAPI tape drives

# real time clock
device          mcclock0 at isa0 port 0x70

controller      fdc0    at isa? port IO_FD1 irq 6 drq 2
disk            fd0     at fdc0 drive 0

controller      atkbdc0 at isa? port IO_KBD
device          atkbd0  at atkbdc? irq 1
device          psm0    at atkbdc? irq 12

device          vga0    at isa? port ? conflicts

# splash screen/screen saver
pseudo-device   splash

# syscons is the default console driver, resembling an SCO console
device          sc0     at isa?

device          sio0    at isa0 port IO_COM1 irq 4
device          sio1    at isa0 port IO_COM2 irq 3 flags 0x80

# MII bus support, required for some 10/100 NICs.
controller miibus0

# Operational PCI Ethernet drivers.
device al0
device ax0
device de0
device dm0
device fxp0
device le0
device mx0
device pn0
device rl0
device sf0
device sis0
device ste0
device tl0
device vr0
device wb0
device xl0

pseudo-device   loop
pseudo-device   ether
pseudo-device   sl      1
pseudo-device   ppp     1
pseudo-device   tun
pseudo-device   pty
pseudo-device   bpf     4

# KTRACE enables the system-call tracing facility ktrace(2).
# This adds 4 KB bloat to your kernel, and slightly increases
# the costs of each syscall.
options         KTRACE          #kernel tracing

# This provides support for System V shared memory and message queues.
#
options         SYSVSHM
options         SYSVMSG
options         SYSVSEM

#
# everything above is essentially GENERIC.  customizations below.
#

options         DDB
options         BREAK_TO_DEBUGGER


