Recently one of the FreeBSD machines where I work has been crashing on a
semi-regular basis, once or twice a day. The dmesg for the machine is at
the bottom of this post. These crashes started very recently, less than
a week ago. Before that, the machine had been very reliable (several
uptimes of 100 days or more).
The machine used to be running FreeBSD 3.1-STABLE as of mid-April 1999.
Since I know many NFS bugs have been fixed since then, the box was
upgraded on Tuesday to 3.4-STABLE (a completely fresh installation). This,
however, did not fix the panics. I believe the problem to be related to
one of these two PRs:
[1998/06/23] kern/7028 http://www.freebsd.org/cgi/query-pr.cgi?pr=7028
panic in vinvalbuf when appending/looking at tail of NFS file
[2000/03/08] misc/17272 http://www.freebsd.org/cgi/query-pr.cgi?pr=17272
deleting a file that a program has open causes vinvalbuf: flush failed
The panic message is:
panic: vinvalbuf: flush failed
It appears to be triggered by a 'tail -f' on a growing, very large log
file over NFS. The NFS host on the other end is running Solaris 2.6 on a
sparc. The actual mount is kind of weird; it is reached indirectly,
through a symlink on a different NFS mount off a NetApp (the NetApp-mounted
FS is basically a symlink farm with a few real directories). The layout is:
netapp:/home on /home
sun:/logs on /sun/logs
/home/logs@ -> /sun/logs
and we are doing 'tail -f /home/logs/largelogfile'
(there are good historical reasons for this setup)
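For anyone trying to reproduce this, the access pattern described above can be approximated with the sketch below. It is only a rough Python model of what 'tail -f' does (keep the file open, poll its size, read whatever was appended), not the actual tail source. The names read_appended and follow, the one-second poll interval, and starting at the current end of the file are assumptions made for illustration.

```python
import os
import time


def read_appended(fd, offset):
    """Return (data, new_offset) for bytes appended to fd since offset.

    If the file is now shorter than offset (it was truncated in place),
    reading restarts from the beginning of the file.
    """
    size = os.fstat(fd).st_size
    if size < offset:
        offset = 0  # file shrank: start over from the top
    if size == offset:
        return b"", offset  # nothing new yet
    os.lseek(fd, offset, os.SEEK_SET)
    data = os.read(fd, size - offset)  # may return fewer bytes than asked
    return data, offset + len(data)


def follow(path, interval=1.0):
    """Yield chunks appended to path, polling every interval seconds.

    The descriptor stays open for the whole loop, as with 'tail -f'.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = os.fstat(fd).st_size  # begin at the current end of file
        while True:
            data, offset = read_appended(fd, offset)
            if data:
                yield data
            else:
                time.sleep(interval)
    finally:
        os.close(fd)
```

Running `for chunk in follow("/home/logs/largelogfile"): ...` on the BSD box while the Sun keeps appending to the log would approximate the workload. Because the descriptor stays open, a rotation that renames or deletes the file on the server leaves this loop holding the old file, which is the situation described in misc/17272.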
We have made no significant changes to the other machines in this setup,
although the logfile in question has been growing in size over time. We
rotate the logfile on the Sun daily as well. No executable files for the
BSD machine are stored on the Sun.
I have compiled a debug kernel and will provide a traceback and/or dump
to anyone who is interested once it happens again. If I find a way to
reliably reproduce it, I will post that too. In the meantime, are there
any quick patches or other solutions I could use?
Thanks in advance for your time and advice,
Doug
Below is dmesg:
Copyright (c) 1992-1999 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 3.4-STABLE #2: Tue Mar 14 23:21:39 CST 2000
doug@xxx:/usr/src/sys/compile/XXX
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 347664663 Hz
CPU: Pentium II/Xeon/Celeron (347.66-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x652 Stepping = 2
Features=0x183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
real memory = 536870912 (524288K bytes)
avail memory = 519360512 (507188K bytes)
Preloaded elf kernel "kernel" at 0xc0309000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc030909c.
Pentium Pro MTRR support enabled
Probing for devices on PCI bus 0:
chip0: <Intel 82443BX host to PCI bridge> rev 0x03 on pci0.0.0
chip1: <Intel 82443BX host to AGP bridge> rev 0x03 on pci0.1.0
chip2: <PCI to PCI bridge (vendor=1011 device=0024)> rev 0x03 on pci0.2.0
chip3: <Intel 82371AB PCI to ISA bridge> rev 0x02 on pci0.7.0
chip4: <Intel 82371AB Power management controller> rev 0x02 on pci0.7.3
fxp0: <Intel EtherExpress Pro 10/100B Ethernet> rev 0x05 int a irq 14 on pci0.8.0
fxp0: Ethernet address 00:90:27:45:ee:ae
Probing for devices on PCI bus 1:
vga0: <ATI model 4744 graphics accelerator> rev 0x5c on pci1.0.0
Probing for devices on PCI bus 2:
ahc0: <Adaptec aic7890/91 Ultra2 SCSI adapter> rev 0x00 int a irq 11 on pci2.4.0
ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc1: <Adaptec aic7860 SCSI adapter> rev 0x03 int a irq 11 on pci2.6.0
ahc1: aic7860 Single Channel A, SCSI Id=7, 3/255 SCBs
Probing for devices on the ISA bus:
sc0 on isa
sc0: VGA color <16 virtual consoles, flags=0x0>
atkbdc0 at 0x60-0x6f on motherboard
atkbd0 irq 1 on isa
psm0 not found
sio0 at 0x3f8-0x3ff irq 4 flags 0x30 on isa
sio0: type 16550A, console
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
ppc0 at 0x378 irq 7 on isa
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
lpt0: <generic printer> on ppbus 0
lpt0: Interrupt-driven port
ppi0: <generic parallel i/o> on ppbus 0
lppps0: <Pulse per second Timing Interface> on ppbus 0
plip0: <PLIP network interface> on ppbus 0
vga0 at 0x3b0-0x3df maddr 0xa0000 msize 131072 on isa
npx0 on motherboard
npx0: INT 16 interface
Waiting 8 seconds for SCSI devices to settle
chcd0 at ahc1 bus 0 target 5 lun 0
cd0: <NEC CD-ROM DRIVE:465 1.03> Removable CD-ROM SCSI-2 device
cd0: 20.000MB/s transfers (20.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present
da1 at ahc0 bus 0 target 1 lun 0
da1: <IBM DDRS-39130D DC1B> Fixed Direct Access SCSI-2 device
da1: 40.000MB/s transfers (20.000MHz, offset 15, 16bit), Tagged Queueing Enabled
da1: 8715MB (17850000 512 byte sectors: 255H 63S/T 1111C)
da0 at ahc0 bus 0 target 0 lun 0
da0: <IBM DDRS-39130D DC1B> Fixed Direct Access SCSI-2 device
da0: 40.000MB/s transfers (20.000MHz, offset 15, 16bit), Tagged Queueing Enabled
da0: 8715MB (17850000 512 byte sectors: 255H 63S/T 1111C)
changing root device to da0s1a