I have an older AlphaStation 600 5/266 running -current (cvsupped
last week) which is set up as a router between two 100Mb networks. When
the machine is pushed fairly hard (like running a netperf -tUDP_STREAM
-- -m 100 across the router, which works out to about 10-20k 100-byte
packets/sec), the alpha falls over almost instantly. I have not enabled
any NAT or firewall functionality, just IP forwarding.
It generally crashes in MCLGET down in the ethernet driver's receiver
interrupt handler. The driver doesn't seem to matter -- I've tried
Intel Etherexpress Pro 100Bs and 3Com 3c905C-TX Fast Etherlink XLs. A
typical stack trace looks like this:
fatal kernel trap:
trap entry = 0x2 (memory management fault)
a0 = 0x826417b78f222
a1 = 0x1
a2 = 0x0
pc = 0xfffffc00004b31bc
ra = 0xfffffc00004b315c
curproc = 0
ddbprinttrap from 0xfffffc00004b31bc
ddbprinttrap(0x826417b78f222, 0x1, 0x0, 0x2)
panic: trap
panic
Stopped at Debugger+0x2c: ldq ra,0(sp) <0xfffffe0005ab57d0> <ra=0xfffffc00005042e0,sp=0xfffffe0005ab57d0>
db> tr
Debugger() at Debugger+0x2c
panic() at panic+0xf4
trap() at trap+0x5cc
xl_newbuf() at xl_newbuf+0x15c
(null)() at 0x4
db> c
This maps to pci/if_xl.c:1654. But the if_xl driver is probably not
at fault, as I can crash just as easily in fxp_add_rfabuf() when using
Intel NICs.
Before trying the 3Com cards, I had been working under the assumption
that it was a problem with the fxp driver. I instrumented the mbuf
routines somewhat (I hate debugging macros) and it seems the bad
access is due to mclfree getting trashed & replaced by a "random" bad
value (0x826417b78f222 in this panic).
This might be a red herring, but I've found that if I run the entire
ip_input path under splnet() (I added splnet() around the call to
ip_input() in ipintr()), things get a hell of a lot more stable.
Rather than crashing in a few seconds, it sometimes takes minutes.
And rather than an illegal access, I tend to run out of kernel stack
space: either I hit a panic("possible stack overflow\n"); in
alpha/alpha/interrupt.c, or I end up in the SRM console after calling
halt from a PC which isn't in the kernel, which smells like an overrun
stack to me. I'm not sure if this is related, or if it is a separate
problem entirely.
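For anyone who wants to see the shape of that experiment, here is a
user-space mock of it. It is only a sketch, not the kernel source: the
spl*() functions, the queue, struct mbuf and ip_input() below are
stand-in stubs, the MOCK_IPL_* values are invented, and the layout of
the ipintr() loop is my assumption about what the handler looks like.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel mbuf; only the packet-queue link is modelled. */
struct mbuf {
    struct mbuf *m_nextpkt;
};

/* Mock priority levels. The numeric values are invented for this sketch. */
enum { MOCK_IPL_NONE = 0, MOCK_IPL_NET = 1, MOCK_IPL_IMP = 2 };

static int cur_ipl = MOCK_IPL_NONE;   /* current mock priority level */
static int ipl_in_ip_input = -1;      /* level observed inside ip_input() */
static int packets_handled = 0;

/* Raise the level (never lower it) and return the previous one. */
static int spl_raise(int level)
{
    int old = cur_ipl;

    if (level > cur_ipl)
        cur_ipl = level;
    return old;
}

static int splnet(void) { return spl_raise(MOCK_IPL_NET); }
static int splimp(void) { return spl_raise(MOCK_IPL_IMP); }

/* Restore a level previously returned by splnet()/splimp(). */
static void splx(int s)
{
    assert(s >= MOCK_IPL_NONE && s <= MOCK_IPL_IMP);
    cur_ipl = s;
}

/* Mock input queue: a singly linked FIFO of packets. */
static struct mbuf *q_head, *q_tail;

static void enqueue(struct mbuf *m)
{
    m->m_nextpkt = NULL;
    if (q_tail != NULL)
        q_tail->m_nextpkt = m;
    else
        q_head = m;
    q_tail = m;
}

static struct mbuf *dequeue(void)
{
    struct mbuf *m = q_head;

    if (m != NULL) {
        q_head = m->m_nextpkt;
        if (q_head == NULL)
            q_tail = NULL;
        m->m_nextpkt = NULL;
    }
    return m;
}

/* Stub: records the level it ran at instead of forwarding anything. */
static void ip_input(struct mbuf *m)
{
    (void)m;
    ipl_in_ip_input = cur_ipl;
    packets_handled++;
}

/* Drain the queue, running each ip_input() call at the raised level. */
static void ipintr(void)
{
    struct mbuf *m;
    int s;

    for (;;) {
        s = splimp();          /* protect the queue while dequeuing */
        m = dequeue();
        splx(s);
        if (m == NULL)
            return;

        s = splnet();          /* the experimental change */
        ip_input(m);
        splx(s);
    }
}
```

The only line that differs from the unmodified loop, as I understand it,
is the splnet()/splx() pair around ip_input(); everything else is
scaffolding so the mock compiles on its own.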
Since an x86 (PII@300MHz, 440lx motherboard, kernel built from same
sources) is rock solid under the same workload, I suspect there's
something wrong that is alpha specific, but I'll be damned if I can
figure it out.
My best guess is that it has something to do with the different
interrupt structure on i386 & alpha. As I understand it, the i386 can
mask off particular interrupt sources, but the alpha simply raises &
lowers the ipl with the following levels available
(from alpha/include/alpha_cpu.h):
#define ALPHA_PSL_IPL_0 0x0000 /* all interrupts enabled */
#define ALPHA_PSL_IPL_SOFT 0x0001 /* software ints disabled */
#define ALPHA_PSL_IPL_IO 0x0004 /* I/O dev ints disabled */
#define ALPHA_PSL_IPL_CLOCK 0x0005 /* clock ints disabled */
#define ALPHA_PSL_IPL_HIGH 0x0006 /* all but mchecks disabled */
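To make the contrast concrete, here is a toy model of the two masking
schemes as I understand them. It is a sketch under assumptions: the
helper names ipl_blocks() and mask_blocks() are hypothetical, each
source is simply assigned the level at which the header comments say it
becomes disabled, and the one-bit-per-irq mask is only an illustration
of "masking a particular source", not the real i386 implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the alpha_cpu.h excerpt quoted above. */
#define ALPHA_PSL_IPL_0     0x0000  /* all interrupts enabled */
#define ALPHA_PSL_IPL_SOFT  0x0001  /* software ints disabled */
#define ALPHA_PSL_IPL_IO    0x0004  /* I/O dev ints disabled */
#define ALPHA_PSL_IPL_CLOCK 0x0005  /* clock ints disabled */
#define ALPHA_PSL_IPL_HIGH  0x0006  /* all but mchecks disabled */

/*
 * Level-based scheme: a source is held off whenever the current IPL
 * is at or above the level assigned to that source.
 */
static bool ipl_blocks(unsigned cur_ipl, unsigned src_level)
{
    assert(cur_ipl <= ALPHA_PSL_IPL_HIGH);
    return src_level <= cur_ipl;
}

/*
 * Mask-based scheme (hypothetical): every irq line has its own bit,
 * so one source can be blocked without touching the others.
 */
static bool mask_blocks(uint32_t masked, unsigned irq)
{
    assert(irq < 32);
    return ((masked >> irq) & 1u) != 0;
}
```

In this model, raising to ALPHA_PSL_IPL_IO holds off every I/O device
at once (and software interrupts with them), while the clock still gets
through. With the mask, blocking bit 8 leaves bit 0 deliverable.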
Can anybody hazard a guess as to what's going on? I've appended dmesg
output & my config file for completeness.
BTW, as long as the load is light, IP forwarding seems to work. I
can't seem to make this happen using two 100Mb tulips in this box. They
must copy on the input path due to DMA alignment problems, which slows
things down quite a bit given the low memory bandwidth of this
machine.
Thanks,
Drew
------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer http://www.cs.duke.edu/~gallatin
Duke University Email: [EMAIL PROTECTED]
Department of Computer Science Phone: (919) 660-6590
Copyright (c) 1992-1999 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 4.0-CURRENT #4: Wed Oct 27 11:35:25 EDT 1999
[EMAIL PROTECTED]:/usr/project/ari_scratch2/gallatin/src/sys/compile/ALPHA
AlphaStation 500 or 600 (KN20AA)
Digital AlphaStation 600 5/266, 266MHz
8192 byte page size, 1 processor.
CPU: EV5 (21164) major=5 minor=0
OSF PAL rev: 0x1000000020116
real memory = 131940352 (128848K bytes)
avail memory = 122200064 (119336K bytes)
Preloaded elf kernel "kernel" at 0xfffffc0000674000.
cia0: ALCOR/ALCOR2, pass 2
pcib0: <2117x PCI host bus adapter> on cia0
pci0: <PCI bus> on pcib0
xl0: <3Com 3c905C-TX Fast Etherlink XL> irq 8 at device 7.0 on pci0
xl0: interrupting at CIA irq 8
xl0: Ethernet address: 00:50:da:09:3e:41
miibus0: <MII bus> on xl0
xlphy0: <3c905C 10/100 internal PHY> on miibus0
xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pcib1: <DEC 21050 PCI-PCI bridge> at device 8.0 on pci0
pci1: <PCI bus> on pcib1
de0: <Digital 21040 Ethernet> irq 16 at device 0.0 on pci1
de0: interrupting at CIA irq 16
de0: DEC 21040 [10Mb/s] pass 2.3
de0: address 08:00:2b:e7:e6:d6
isp0: <Qlogic ISP 1020/1040 PCI SCSI Adapter> irq 17 at device 1.0 on pci1
isp0: interrupting at CIA irq 17
isp0: invalid NVRAM header (aa,aa,aa,aa)
isp0: isp_mboxcmd sees mailbox int with 0x0 in mbox0
isp0: isp_mboxcmd sees mailbox int with 0x0 in mbox0
<..>
isp1: <Qlogic ISP 1020/1040 PCI SCSI Adapter> irq 18 at device 2.0 on pci1
isp1: interrupting at CIA irq 18
isp1: isp_mboxcmd sees mailbox int with 0x0 in mbox0
isp1: invalid NVRAM header (55,55,55,55)
isp1: isp_mboxcmd sees mailbox int with 0x0 in mbox0
isp1: isp_mboxcmd sees mailbox int with 0x0 in mbox0
de1: <Digital 21140 Fast Ethernet> irq 12 at device 9.0 on pci0
de1: interrupting at CIA irq 12
de1: DEC DE500-XA 21140 [10-100Mb/s] pass 1.1
de1: address 00:00:f8:00:99:ba
de1: enabling Full Duplex 100baseTX port
isab0: <Intel 82375EB PCI-EISA bridge> at device 10.0 on pci0
isa0: <ISA bus> on isab0
xl1: <3Com 3c905C-TX Fast Etherlink XL> irq 0 at device 11.0 on pci0
xl1: interrupting at CIA irq 0
xl1: Ethernet address: 00:50:da:09:42:41
miibus1: <MII bus> on xl1
xlphy1: <3c905C 10/100 internal PHY> on miibus1
xlphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl2: <3Com 3c905C-TX Fast Etherlink XL> irq 4 at device 12.0 on pci0
xl2: interrupting at CIA irq 4
xl2: Ethernet address: 00:50:da:09:3f:e8
miibus2: <MII bus> on xl2
xlphy2: <3c905C 10/100 internal PHY> on miibus2
xlphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
mcclock0: <MC146818A real time clock> at port 0x70-0x71 on isa0
sio0 at port 0x3f8-0x3ff irq 4 on isa0
sio0: type 16550A, console
sio0: interrupting at ISA irq 4
sio1 at port 0x2f8-0x2ff irq 3 flags 0x80 on isa0
sio1: type 16550A
sio1: interrupting at ISA irq 3
fdc0: interrupting at ISA irq 6
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
atkbd0: interrupting at ISA irq 1
struct nfssvc_sock bloated (> 256bytes)
Try reducing NFS_UIDHASHSIZ
struct nfsuid bloated (> 128bytes)
Try unionizing the nu_nickname and nu_flag fields
Timecounter "alpha" frequency 266671691 Hz
Waiting 3 seconds for SCSI devices to settle
isp0: driver initiated bus reset of bus 0
isp1: driver initiated bus reset of bus 0
de0: autosense failed: cable problem?
Creating DISK da0
Creating DISK da1
Creating DISK cd0
da0 at isp0 bus 0 target 0 lun 0
da0: <SEAGATE ST15150W 0023> Fixed Direct Access SCSI-2 device
da0: 20.000MB/s transfers (10.000MHz, offset 12, 16bit), Tagged Queueing Enabled
da0: 4095MB (8388315 512 byte sectors: 255H 63S/T 522C)
da1 at isp0 bus 0 target 1 lun 0
da1: <SEAGATE ST32171W 0484> Fixed Direct Access SCSI-2 device
da1: 20.000MB/s transfers (10.000MHz, offset 12, 16bit), Tagged Queueing Enabled
da1: 2062MB (4223444 512 byte sectors: 255H 63S/T 262C)
cd0 at isp0 bus 0 target 5 lun 0
cd0: <DEC RRD45 (C) DEC 1645> Removable CD-ROM SCSI-2 device
cd0: 4.032MB/s transfers (4.032MHz, offset 12)
cd0: Attempt to query device size failed: NOT READY, Medium not present
#
machine alpha
cpu EV4
cpu EV5
ident ALPHA
maxusers 32
# Platforms supported
options DEC_AXPPCI_33 # UDB, Multia, AXPpci33, Noname
options DEC_EB164 # EB164, PC164, PC164LX, PC164SX
options DEC_EB64PLUS # EB64+, Aspen Alpine, etc
options DEC_2100_A50 # AlphaStation 200, 250, 255, 400
options DEC_KN20AA # AlphaStation 500, 600
options DEC_ST550 # Personal Workstation 433, 500, 600
options DEC_ST6600 # xp1000, dp264, ds20, ds10, family
#options DEC_3000_300 # DEC3000/300* Pelic* family
#options DEC_3000_500 # DEC3000/[4-9]00 Flamingo/Sandpiper family
options INET #InterNETworking
options FFS #Berkeley Fast Filesystem
options NFS #Network Filesystem
options MFS #Memory Filesystem
options MFS_ROOT #Memory Filesystem as rootfs
options MSDOSFS #MSDOS Filesystem
options CD9660 #ISO 9660 Filesystem
options CD9660_ROOT #CD-ROM usable as root device
options FFS_ROOT #FFS usable as root device [keep this!]
options NFS_ROOT #NFS usable as root device
options PROCFS #Process filesystem
options COMPAT_43 #Compatible with BSD 4.3 [KEEP THIS!]
options SCSI_DELAY=3000 #Be pessimistic about Joe SCSI device
options UCONSOLE #Allow users to grab the console
options SOFTUPDATES
# Standard busses
controller pci0
controller isa0
# A single entry for any of these controllers (ncr, ahb, ahc, amd) is
# sufficient for any number of installed devices.
controller ncr0
controller isp0
controller ahc0
#controller esp0
controller scbus0
device da0
device sa0
device pass0
device cd0
#
# ATA and ATAPI devices
# This is work in progress, use at your own risk.
# It currently reuses the majors of wd.c and friends.
# It cannot co-exist with the old system in one kernel.
# You only need one "controller ata0" for it to find all
# PCI devices on modern machines.
controller ata0
device atadisk0 # ATA disk drives
device atapicd0 # ATAPI CDROM drives
device atapifd0 # ATAPI floppy drives
device atapist0 # ATAPI tape drives
# real time clock
device mcclock0 at isa0 port 0x70
controller fdc0 at isa? port IO_FD1 irq 6 drq 2
disk fd0 at fdc0 drive 0
controller atkbdc0 at isa? port IO_KBD
device atkbd0 at atkbdc? irq 1
device psm0 at atkbdc? irq 12
device vga0 at isa? port ? conflicts
# splash screen/screen saver
pseudo-device splash
# syscons is the default console driver, resembling an SCO console
device sc0 at isa?
device sio0 at isa0 port IO_COM1 irq 4
device sio1 at isa0 port IO_COM2 irq 3 flags 0x80
# MII bus support, required for some 10/100 NICs.
controller miibus0
# Operational PCI Ethernet drivers.
device al0
device ax0
device de0
device dm0
device fxp0
device le0
device mx0
device pn0
device rl0
device sf0
device sis0
device ste0
device tl0
device vr0
device wb0
device xl0
pseudo-device loop
pseudo-device ether
pseudo-device sl 1
pseudo-device ppp 1
pseudo-device tun
pseudo-device pty
pseudo-device bpf 4
# KTRACE enables the system-call tracing facility ktrace(2).
# This adds 4 KB bloat to your kernel, and slightly increases
# the costs of each syscall.
options KTRACE #kernel tracing
# This provides support for System V shared memory and message queues.
#
options SYSVSHM
options SYSVMSG
options SYSVSEM
#
# everything above is essentially GENERIC. customizations below.
#
options DDB
options BREAK_TO_DEBUGGER