7.2-p4: panic: ufsdirhash_lookup: bad offset in hash array

2010-02-26 Thread Charles Sprickman
I have a box that has panicked two nights in a row with this error.  I have 
a corefile from last night, but tonight's dump failed:


Uptime: 23h55m22s
Physical memory: 6130 MB
Dumping 759 MB: 744 728 712 696 680 664 648
** DUMP FAILED (ERROR 16) **

Here's some info from the core I do have:

#0  doadump () at pcpu.h:195
195 __asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) where
#0  doadump () at pcpu.h:195
#1  0x0004 in ?? ()
#2  0x8034c799 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:418
#3  0x8034cba2 in panic (fmt=0x104 )
at /usr/src/sys/kern/kern_shutdown.c:574
#4  0x8052545f in ufsdirhash_lookup (ip=0xff0012530398,
name=0xff012788b000 
"1266473205.M123372P75411V005BI08EE5F75_0.xena.bway.net,S=43650:2,S", 
namelen=70, offp=0x285f474c, bpp=0x285f4738, prevoffp=0x0)
at /usr/src/sys/ufs/ufs/ufs_dirhash.c:599
#5  0x805278a0 in ufs_lookup (ap=0x285f4790)
at /usr/src/sys/ufs/ufs/ufs_lookup.c:224
#6  0x803be024 in vfs_cache_lookup (ap=Variable "ap" is not available.
) at vnode_if.h:83
#7  0x805a08bf in VOP_LOOKUP_APV (vop=0x807945c0, a=0x285f4850)
at vnode_if.c:99
#8  0x803c4a4f in lookup (ndp=0x285f4960) at vnode_if.h:57
#9  0x803c58ba in namei (ndp=0x285f4960)
at /usr/src/sys/kern/vfs_lookup.c:215
#10 0x803d2c94 in kern_lstat (td=0xff007693aa50, path=Variable "path" is not available.
) at /usr/src/sys/kern/vfs_syscalls.c:2184
#11 0x803d2f07 in lstat (td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_syscalls.c:2167
#12 0x80574e77 in syscall (frame=0x285f4c80)
at /usr/src/sys/amd64/amd64/trap.c:900
#13 0x805598ab in Xfast_syscall ()
at /usr/src/sys/amd64/amd64/exception.S:330
#14 0x00080071063c in ?? ()
Previous frame inner to this frame (corrupt stack?)

Previous to this, I had one panic while this box was being stress-tested 
before it went into production.  It's a new Dell box with a Dell/LSI RAID 
card (mfi driver).  It's a mail server, and the ufs dirhash sysctl is 
pushed up to "vfs.ufs.dirhash_maxmem=33554432".


My earlier post about the previous panic, back in November, is here:
http://marc.info/?l=freebsd-stable&m=125901173424554&w=2

Before last night's crash, it was up for 93 days.  Nothing has changed in 
the past few days as far as software or overall load.  The crash did 
happen during or shortly after the daily periodic run.


Any interest in this one?  Is it something to file a PR on?

dmesg is below...

Thanks,

Charles
___
Charles Sprickman
NetEng/SysAdmin
Bway.net - New York's Best Internet - www.bway.net
sp...@bway.net - 212.655.9344

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.2-RELEASE-p4 #2: Mon Nov  2 21:55:12 EST 2009
sp...@bigmail.bway.net:/usr/obj/usr/src/sys/BWAY7-64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Quad-Core AMD Opteron(tm) Processor 2372 HE (2094.76-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x100f42  Stepping = 2
  
Features=0x178bfbff
  Features2=0x802009>
  AMD 
Features=0xee500800
  AMD 
Features2=0x37ff,,,Prefetch>
  TSC: P-state invariant
  Cores per package: 4
usable memory = 6427951104 (6130 MB)
avail memory  = 6198829056 (5911 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
ioapic0: Changing APIC ID to 4
ioapic1: Changing APIC ID to 5
ioapic2: Changing APIC ID to 6
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0  irqs 0-15 on motherboard
ioapic1  irqs 32-47 on motherboard
ioapic2  irqs 64-79 on motherboard
kbd0 at kbdmux0
acpi0:  on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
ipmi0: KCS mode found at io 0xca8 on acpi
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_hpet0:  iomem 0xfed0-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  at device 1.0 on pci0
pci8:  on pcib1
pcib2:  at device 13.0 on pci8
pci9:  on pcib2
atapci0:  port 
0xdcb0-0xdcb7,0xdca0-0xdca3,0xdcb8-0xdcbf,0xdca4-0xdca7,0xdce0-0xdcef mem 
0xee2fe000-0xee2f irq 11 at device 14.0 on pci8
atapci0: [ITHREAD]
ata2:  on atapci0
ata2: [ITHREAD]
ata3:  on atapci0
ata3: [ITHREAD]
ata4:  on atapci0
ata4: [ITHREAD]
ata5:  on atapci0
ata5: [ITHREAD]
isab0:  at device 2.2 on pci0
isa0:  on isab0
ohci0:  port 0xc000-0xc0ff mem 
0xee0ed000-0xee0edfff irq 11 at device 3.0 on pci0
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respo

Re: em0 freezes on ZFS server

2010-02-26 Thread Willem Jan Withagen

On 25-2-2010 23:59, Jack Vogel wrote:

The failure to "setup receive structures" means it did not have sufficient
mbufs
to setup the RX ring and buffer structs. Not sure why this results in a
lockup,
but try and increase kern.ipc.nmbclusters.

Let me know what happens,


I've doubled the value 25600 => 51200.

This is what netstat -m told me when it refused to revive em0:

24980/2087/27067 mbufs in use (current/cache/total)
24530/1070/25600/25600 mbuf clusters in use (current/cache/total/max)
22217/741 mbuf+clusters out of packet secondary zone in use (current/cache)
0/35/35/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
55305K/2801K/58106K bytes allocated to network (current/cache/total)
0/5970/2983 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
1011716 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Now I've seen some discussion on the list suggesting that the mbuf exhaustion 
could also be a consequence of the device being down, with the queue building 
up rather rapidly in the mbufs.
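
For anyone wanting to watch for this automatically: the line that shows the actual allocation failures in the output above is the "requests for mbufs denied" one. Below is a minimal Python sketch, assuming the line format shown above. The function name parse_denied is made up for illustration and is not part of any FreeBSD tool.

```python
import re

# Sample line copied from the netstat -m output quoted above.
SAMPLE = "0/5970/2983 requests for mbufs denied (mbufs/clusters/mbuf+clusters)"


def parse_denied(line):
    """Return the three denied-request counters from a netstat -m line.

    Hypothetical helper: it only understands the exact wording shown in
    this thread and raises ValueError for anything else.
    """
    match = re.match(r"\s*(\d+)/(\d+)/(\d+) requests for mbufs denied", line)
    if match is None:
        raise ValueError("not a 'requests for mbufs denied' line")
    mbufs, clusters, both = (int(group) for group in match.groups())
    return {"mbufs": mbufs, "clusters": clusters, "mbuf+clusters": both}


denied = parse_denied(SAMPLE)
# Any non-zero counter means at least one allocation request was refused.
starved = any(count > 0 for count in denied.values())
print(denied, "-> allocation failures seen" if starved else "-> no denials")
```

A non-zero count here only says that requests were refused at some point since boot. It does not say whether the exhaustion caused the em0 hang or followed from it.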


Probably the reason why this happened yesterday is that I started doing 
major software builds (over ZFS/NFS/TCP/v3) against data stored on this box.


--WjW
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: em0 freezes on ZFS server

2010-02-26 Thread Willem Jan Withagen

On 26-2-2010 10:58, Gerrit Kühn wrote:

On Fri, 26 Feb 2010 10:34:41 +0100 Willem Jan Withagen
wrote about Re: em0 freezes on ZFS server:

WJW>  Probably the reason why this happened yesterday is that I started
WJW>  doing major software builds (over ZFS/NFS/TCP/v3) against data stored
WJW>  on this box.

I saw a similar problem this morning and suppose it started when some
automatic backup jobs started last night. An unstable em device is a rather
bad thing, I hope increasing the buffer (mine is at 64000 now) prevents
this from happening again.


In my case it did indeed start when I raised the volume of traffic. I probably 
tripled it. Thanks for confirming similar behaviour.


I have no proof that it used to work, since this is my first ZFS box and my 
first Areca controller. So going back in time to see whether it is just a 
regression somewhere is rather hard.


And yes, an unstable em device is a pain, since up till now I considered the 
Intel chips/driver a constant, steady-state factor in my networking life.

(would only buy/recommend Intel)

I'm not sure it is the chipset/driver combo; it could be that something in the 
innards of the kernel has severely changed and just stresses it a lot more.


--WjW



Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 10:34:41 +0100 Willem Jan Withagen 
wrote about Re: em0 freezes on ZFS server:

WJW> Probably the reason why this happened yesterday is that I started
WJW> doing major software builds (over ZFS/NFS/TCP/v3) against data stored
WJW> on this box.

I saw a similar problem this morning and suppose it started when some
automatic backup jobs started last night. An unstable em device is a rather
bad thing, I hope increasing the buffer (mine is at 64000 now) prevents
this from happening again.


cu
  Gerrit


Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn
On Thu, 25 Feb 2010 14:59:28 -0800 Jack Vogel  wrote
about Re: em0 freezes on ZFS server:

JV> The failure to "setup receive structures" means it did not have
JV> sufficient mbufs
JV> to setup the RX ring and buffer structs. 

I don't know if this is related, but I updated an amd64 zfs machine with
several em cards from 7.2 to 8-stable yesterday. First it worked fine after
booting, but this morning, at least three of the five em interfaces did
not do much anymore. You could revive them for some seconds with ifconfig
down/up, but they always ceased functioning soon after that (within
seconds).
During debugging (up/down, load/unload if_em etc.) I saw the same error
message as above at some point. I finally gave up and rebooted the
machine. For now, everything appears to be back to normal (but for how
long?).

JV> Not sure why this results in a lockup, but try and increase
JV> kern.ipc.nmbclusters.

I just did that, just to make sure.


cu
  Gerrit


Many many many thanks to all that develop FreeBSD.

2010-02-26 Thread Willem Jan Withagen

Hi,

When everything in life is just flowing by smoothly, and all is hunky-dory, 
some things don't get the credit they deserve.


So here we go ;)

Standing at the coffee machine this morning I realized that FreeBSD has been 
part of my professional life for way, way too long already. Before 1993 I 
was tinkering with CPM, SCO, sys3/5, i386, Apollo Domain, Linux <0.98 and the like.


When friends and I decided to start an ISP (then very novel, now really part 
of our lives), one of the friends suggested using FreeBSD instead of 
Linux. As Linux at that time was still very much in flux and really just a 
moving target, we were more than up for it, because it gave us the kind of 
system that we were all too familiar with. And it really did work right out of 
the box. (Well, almost: we needed to hack on the serial driver for the 16-port 
card we had.) The first commercial server that we deployed was running FreeBSD 
1.1. I still fondly keep that CD.


One of those friends (Guido van Rooij) has been part of the Core team for a 
while. And I'm sure that a lot of the code he hacked up to keep the ISP 
going ended up somewhere in the tree. The release strategy has always been a 
real nice service to our business. Lots of systems were kept at their 
major versions until the hardware became too old, only then to leapfrog 
to the most stable version at that moment. And I don't ever recall that we 
were disappointed in the way FreeBSD developed.


The ISP was sold in 2000, and I started doing other things. But in 2000 just 
about everything the ISP was delivering ran on FreeBSD. And I remember 
boxes being up for over 2 years (especially those not exposed to the 
public :)). The only reason for some Windows boxes was that, business-wise, 
you can't do without.


At home FreeBSD has been my friend from that same moment forward. I trashed 
my Linux toys, migrated to FreeBSD and never looked back. My home is now 
running on two FreeBSD boxes and lots of Java-embedded devices. Again, I have 
never been disappointed in what FreeBSD delivered.


Recently I started a new company again, but much to my disappointment (but 
understandably so) the chipset supplier (NXP/Philips) delivers support only 
for Linux (or WinCE), and the new shop is mainly Linux oriented, although 
some of the dev team are still FreeBSD at heart. And parts of the system are 
tested against FreeBSD to avoid too many Linux-isms. :)


So when I ran into some trouble yesterday, I realized how much FreeBSD has 
contributed to "painless computing"(tm). And not only because of the 
quality of the software, but also because of the support that is offered.


And for that I would like to thank, compliment, free-beer-license all the 
people that made my life trouble-free.


Many many many thanks to all those that make FreeBSD.

--WjW


Re: panic - sleeping thread on FreeBSD 8.0-stable / amd64

2010-02-26 Thread Torfinn Ingolfsen
On Sat, 20 Feb 2010 15:35:46 -0800
Jeremy Chadwick  wrote:

After five days - a new crash. From /var/log/messages:
Feb 26 00:57:39 kg-f2 ntpd[55453]: kernel time sync status change 6001
Feb 26 01:39:40 kg-f2 kernel: ata5: port is not ready (timeout 1ms) tfd = 007f
Feb 26 01:39:40 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 10:44:54 kg-f2 syslogd: kernel boot file is /boot/kernel/kernel

> Let's backtrack a bit.  I've gone back and read through all of your
> previous posts on this matter, and so far all the problems are happening
> on ata5 and ata6.  No timeouts or anomalies have appeared on any other
> ports -- just those two. 

It seems you are right.

> The kernel error messages indicate that
> commands submitted to the controller took longer than 10 seconds to get a
> response, so the OS does a force-reset of the ports in an attempt to get
> things working again.
> 
> We can safely rule out the Silicon Image controller (otherwise "ataX"
> wouldn't be involved), which leaves the AMD SB700 SATA controller and
> the AMD SB700 PATA controller.

And there is nothing connected to the pata controller.

> What exact disks (e.g. adX) are attached to ata5 and ata6?

r...@kg-f2# dmesg | grep ata5
ata5:  on atapci0
ata5: [ITHREAD]
ad10: 953869MB  at ata5-master UDMA100 SATA 3Gb/s
r...@kg-f2# dmesg | grep ata6
ata6:  on atapci0
ata6: [ITHREAD]
ad12: 953869MB  at ata6-master UDMA100 SATA 3Gb/s

> You haven't provided dmesg output in any of your posts,

No, I didn't. I did state in one of my first posts that full dmesgs and more 
info were available on the FreeBSD web page[1] for the machine.

> and atacontrol/pciconf is
> not sufficient (I should really improve atacontrol by printing this
> information.  I'll work on that in a few minutes).

Cool, I would really like that feature.

> Some Linux users have reported AHCI-related issues with the SB600
> southbridge, but the core of the problem turned out to be MSI on certain
> AMD northbridges (specifically RS480, RS400, and RS200).  By disabling
> MSI entirely they were able to achieve stability.  The FreeBSD
> equivalent would be to set the following in loader.conf and reboot:
> 
> hw.pci.enable_msix="0"
> hw.pci.enable_msi="0"

I will try that now. It might take five days or more to get an answer.

> The Linux quirk fix for this:
> 
> http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob_plain;f=queue-2.6.21/pci-quirks-disable-msi-on-rs400-200-and-rs480.patch;hb=05ab505f2909acf3a614d3e6a32271c4c1f8a69d
> 
> Your board has an AMD 740G northbridge, but it might be worth trying the
> MSI disable trick anyway.  If it doesn't fix the problem then definitely
> re-enable MSI.  Isn't hardware fun?  ;-)

Always. ;^)


References:
1) http://sites.google.com/site/tingox/ga-ma74gm-s2h_freebsd

-- 
Torfinn



Re: panic - sleeping thread on FreeBSD 8.0-stable / amd64

2010-02-26 Thread Jeremy Chadwick
On Fri, Feb 26, 2010 at 11:03:37AM +0100, Torfinn Ingolfsen wrote:
> > What exact disks (e.g. adX) are attached to ata5 and ata6?
> 
> r...@kg-f2# dmesg | grep ata5
> ata5:  on atapci0
> ata5: [ITHREAD]
> ad10: 953869MB  at ata5-master UDMA100 SATA 3Gb/s
> r...@kg-f2# dmesg | grep ata6
> ata6:  on atapci0
> ata6: [ITHREAD]
> ad12: 953869MB  at ata6-master UDMA100 SATA 3Gb/s
> ...snip...
> No, I didn't. I did state that full dmesg's and more info was available on 
> the freebsd web page[1] for the machine 
> in one of my first posts.

Okay, so the breakdown for those following is:

http://sites.google.com/site/tingox/f2-dmesg-8.0-stable-20100131.txt?attredirects=0

atapci0:  port 
0xff00-0xff07,0xfe00-0xfe03,0xfd00-0xfd07,0xfc00-0xfc03,0xfb00-0xfb0f mem 
0xfe02f000-0xfe02f3ff irq 22 at device 17.0 on pci0
atapci0: [ITHREAD]
atapci0: AHCI v1.10 controller with 6 3Gbps ports, PM supported
ata2:  on atapci0
ata3:  on atapci0
ata4:  on atapci0
ata5:  on atapci0
ata6:  on atapci0
ata7:  on atapci0

ad6: 238475MB  at ata3-master UDMA100 SATA 3Gb/s
ad8: 953869MB  at ata4-master UDMA100 SATA 3Gb/s
ad10: 953869MB  at ata5-master UDMA100 SATA 3Gb/s
ad12: 953869MB  at ata6-master UDMA100 SATA 3Gb/s
ad14: 953869MB  at ata7-master UDMA100 SATA 3Gb/s

But the only ports which are having issues are ata5 and ata6, which
host disks ad10 and ad12 respectively.

SMART stats for ad10 and ad12 look fantastic, aside from slightly long
spin-up times (claiming over 8 seconds), but that wouldn't cause what's
seen here.  Both disks have been in use for nearly 1700 hours.  No SMART
error log entries exist on either disk, which means the timeouts are very
likely occurring when talking to the controller itself (and not when
waiting for the controller to submit a request to the disk and that piece
stalling).

I'm out of ideas aside from the following:

1) Disabling MSI/MSIX, which at this point I'm doubting will fix
anything (but you never know), since I'd expect it to affect the
entire controller and not just specific ports on the controller.

2) Replacing the SATA cables used between ata5<-->ad10 and ata6<-->ad12.

3) Getting mav@ to talk to AMD to find out if there are any AHCI quirks in
the IXP700 or IXP800 SATA controllers, as FreeBSD may need a driver
workaround for some weird bug/quirk.

Mainly for mav@: verbose boot messages for this system are here, in case
any SATA register details are of help:

http://sites.google.com/site/tingox/f2-dmesg-8.0-stable-20100131_verb1.txt?attredirects=0
http://sites.google.com/site/tingox/f2-dmesg-8.0-stable-20100131_verb2.txt?attredirects=0

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: panic - sleeping thread on FreeBSD 8.0-stable / amd64

2010-02-26 Thread Torfinn Ingolfsen
On Fri, 26 Feb 2010 11:03:37 +0100
Torfinn Ingolfsen  wrote:

> I will try that now. It might take five days or more to get an answer.

Or not. Another panic. Output from /var/log/messages:
Feb 26 11:10:33 kg-f2 ntpd[942]: kernel time sync status change 2001
Feb 26 11:44:19 kg-f2 kernel: ata5: port is not ready (timeout 1ms) tfd = 0080
Feb 26 11:44:19 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata6: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata6: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata5: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata6: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata6: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata5: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata6: port is not ready (timeout 1ms) tfd = 0080
Feb 26 11:47:05 kg-f2 kernel: ata6: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata5: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata6: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata6: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata5: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata6: port is not ready (timeout 1ms) tfd = 0080
Feb 26 11:47:05 kg-f2 kernel: ata6: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ad4: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
Feb 26 11:47:05 kg-f2 kernel: ata5: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata6: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata6: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata5: port is not ready (timeout 1ms) tfd = 0080
Feb 26 11:47:05 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata6: port is not ready (timeout 1ms) tfd = 0080
Feb 26 11:47:05 kg-f2 kernel: ata6: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata5: port is not ready (timeout 1ms) tfd = 0080
Feb 26 11:47:05 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata6: port is not ready (timeout 1ms) tfd = 007f
Feb 26 11:47:05 kg-f2 kernel: ata6: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ad6: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
Feb 26 11:47:05 kg-f2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=31471070
Feb 26 12:23:38 kg-f2 syslogd: kernel boot file is /boot/kernel/kernel
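
To see at a glance which ports are involved, the "hardware reset timeout" lines can be tallied per port. A rough Python sketch follows, assuming each syslog entry sits on one line. The helper name count_reset_timeouts is hypothetical, and the sample is just a handful of lines taken from the log above, not the full file.

```python
import re
from collections import Counter

# A few entries copied from the /var/log/messages excerpt above.
LOG = """\
Feb 26 11:44:19 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata6: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ata5: hardware reset timeout
Feb 26 11:47:05 kg-f2 kernel: ad4: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
"""


def count_reset_timeouts(text):
    """Count 'hardware reset timeout' kernel messages per ataN port."""
    pattern = re.compile(r"kernel: (ata\d+): hardware reset timeout")
    return Counter(match.group(1) for match in pattern.finditer(text))


for port, hits in sorted(count_reset_timeouts(LOG).items()):
    print(port, hits)
```

Disk-level messages such as the ad4 TIMEOUT line are deliberately not counted here, since they name a disk rather than a port.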


-- 
Torfinn



Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn
On Thu, 25 Feb 2010 14:59:28 -0800 Jack Vogel  wrote
about Re: em0 freezes on ZFS server:

JV> The failure to "setup receive structures" means it did not have
JV> sufficient mbufs
JV> to setup the RX ring and buffer structs.

I'm monitoring mbufs since I rebooted my server. Right now (after 2.5 hours
or so of operation) the number of total clusters has already increased to
15k. Is this a normal behaviour for a relatively idle server or will it
inevitably go through the roof in some more hours?


Every 1s: netstat -m                              Fri Feb 26 13:14:54 2010

15001/2279/17280 mbufs in use (current/cache/total)
13970/1212/15182/64000 mbuf clusters in use (current/cache/total/max)
13970/750 mbuf+clusters out of packet secondary zone in use (current/cache)
0/119/119/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
31690K/3469K/35160K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
3 requests for I/O initiated by sendfile
0 calls to protocol drain routines



cu
  Gerrit


Re: em0 freezes on ZFS server

2010-02-26 Thread Jeremy Chadwick
On Fri, Feb 26, 2010 at 10:34:41AM +0100, Willem Jan Withagen wrote:
> This is what netstat -m told me when it refused to revive em0:

Below are the netstat -m counters/lines of concern:

> 24980/2087/27067 mbufs in use (current/cache/total)
> 24530/1070/25600/25600 mbuf clusters in use (current/cache/total/max)
> 55305K/2801K/58106K bytes allocated to network (current/cache/total)

Note how close the "current" value is to that of "total".  I'm not too
surprised you're seeing what you are as a result of this.  What on earth
is this machine doing at all times?
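
To put a number on "how close", the cluster line can be turned into percentages of the configured maximum. This is only a sketch in Python, assuming the four-field current/cache/total/max layout shown above. The function names are invented for the example.

```python
import re


def cluster_usage(line):
    """Parse an 'mbuf clusters in use' line into (current, cache, total, max)."""
    match = re.match(r"\s*(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", line)
    if match is None:
        raise ValueError("not an 'mbuf clusters in use' line")
    return tuple(int(group) for group in match.groups())


def percent_of_max(line):
    """Return (current %, total %) relative to the max, to one decimal."""
    current, _cache, total, maximum = cluster_usage(line)
    return (round(100.0 * current / maximum, 1),
            round(100.0 * total / maximum, 1))


# The struggling box versus one of the quiet servers from this thread.
busy = "24530/1070/25600/25600 mbuf clusters in use (current/cache/total/max)"
quiet = "512/540/1052/25600 mbuf clusters in use (current/cache/total/max)"
print(percent_of_max(busy))   # (95.8, 100.0)
print(percent_of_max(quiet))  # (2.0, 4.1)
```

With "total" at 100% of "max" there is no headroom left for new allocations, which fits the denied cluster requests in the same netstat -m output.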

Comparatively, here's some of our servers' netstat -m stats.  All these
boxes do nightly backups to a centralised box on a private gigE network.
All boxes use em(4).

RELENG_7 amd64 2010/01/09 -- primary HTTP, pri DNS, SSH server + ZFS

514/1931/2445 mbufs in use (current/cache/total)
512/540/1052/25600 mbuf clusters in use (current/cache/total/max)
1152K/6394K/7547K bytes allocated to network (current/cache/total)

RELENG_7 amd64 2010/01/11 -- secondary DNS, MySQL, dev box + ZFS

514/1151/1665 mbufs in use (current/cache/total)
512/504/1016/25600 mbuf clusters in use (current/cache/total/max)
1152K/2203K/3356K bytes allocated to network (current/cache/total)

RELENG_7 i386 2008/04/19 -- secondary HTTP, SSH server, heavy memory I/O

531/624/1155 mbufs in use (current/cache/total)
512/552/1064/25600 mbuf clusters in use (current/cache/total/max)
1156K/2408K/3564K bytes allocated to network (current/cache/total)

RELENG_8 amd64 2010/02/02 -- central backups + NFS+ZFS-based filer

1572/3423/4995 mbufs in use (current/cache/total)
1563/3065/4628/25600 mbuf clusters in use (current/cache/total/max)
3519K/7401K/10920K bytes allocated to network (current/cache/total)

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: em0 freezes on ZFS server

2010-02-26 Thread Willem Jan Withagen

On 26-2-2010 13:03, Jeremy Chadwick wrote:

On Fri, Feb 26, 2010 at 10:34:41AM +0100, Willem Jan Withagen wrote:

This is what netstat -m told me when it refused to revive em0:


Below are the netstat -m counters/lines of concern:


24980/2087/27067 mbufs in use (current/cache/total)
24530/1070/25600/25600 mbuf clusters in use (current/cache/total/max)
55305K/2801K/58106K bytes allocated to network (current/cache/total)


Note how close the "current" value is to that of "total".  I'm not too
surprised you're seeing what you are as a result of this.  What on earth
is this machine doing at all times?

Comparatively, here's some of our servers' netstat -m stats.  All these
boxes do nightly backups to a centralised box on a private gigE network.
All boxes use em(4).

RELENG_7 amd64 2010/01/09 -- primary HTTP, pri DNS, SSH server + ZFS

514/1931/2445 mbufs in use (current/cache/total)
512/540/1052/25600 mbuf clusters in use (current/cache/total/max)
1152K/6394K/7547K bytes allocated to network (current/cache/total)


That's why I wrote that I assumed that the mbuf stats were a result of the 
pipe overflowing when the device went down.

(I've seen this discussed in another thread as well)

At the moment of the em0 freeze, I was running 4 bitbake compile jobs of our 
software tree. That's a lot of small file access, but like you say nothing 
really major.


--WjW



Re: em0 freezes on ZFS server

2010-02-26 Thread Willem Jan Withagen

On 26-2-2010 13:16, Gerrit Kühn wrote:

On Thu, 25 Feb 2010 14:59:28 -0800 Jack Vogel  wrote
about Re: em0 freezes on ZFS server:

JV>  The failure to "setup receive structures" means it did not have
JV>  sufficient mbufs
JV>  to setup the RX ring and buffer structs.

I'm monitoring mbufs since I rebooted my server. Right now (after 2.5 hours
or so of operation) the number of total clusters has already increased to
15k. Is this a normal behaviour for a relatively idle server or will it
inevitably go through the roof in some more hours?


Every 1s: netstat -m                              Fri Feb 26 13:14:54 2010

15001/2279/17280 mbufs in use (current/cache/total)
13970/1212/15182/64000 mbuf clusters in use (current/cache/total/max)
13970/750 mbuf+clusters out of packet secondary zone in use (current/cache)
0/119/119/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
31690K/3469K/35160K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
3 requests for I/O initiated by sendfile
0 calls to protocol drain routines


Well it could be coincidence, but mine are around 15k as well:

15921/3669/19590 mbufs in use (current/cache/total)
15308/2678/17986/51200 mbuf clusters in use (current/cache/total/max)
14754/862 mbuf+clusters out of packet secondary zone in use (current/cache)

System is up since last night, and has taken a few Gb in rsync-backup and 
several compile jobs.

Hasn't given me trouble (yet).

--WjW



Re: Panic on 8-STABLE in mpt(4) on a DELL PowerEdge R300

2010-02-26 Thread Alexander Motin
John J. Rushford wrote:
> I'm running into the same problem, mpt(4) panic on FreeBSD 8-STABLE.
> 
> I'm running FreeBSD 8.0-STABLE, the current kernel was cvsup'd and built
> @ January 14th, 2010.  I cvsup'd tonight, 2/25/2010, and built a new
> kernel.  Attached is the panic when I tried to boot into single user
> mode, I was able to boot up on the old kernel built on January 14th.
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address= 0x10
> fault code= supervisor read data, page not present
> instruction pointer= 0x20:0x8019c4bd
> stack pointer= 0x28:0xff80e81d5ba0
> frame pointer= 0x28:0xff80e81d5bd0
> code segment= base 0x0, limit 0xf, type 0x1b
>= DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process= 6 (mpt_raid0)
> trap number= 12
> panic: page fault

Attached patch should fix the problem.

-- 
Alexander Motin
--- mpt_raid.c.prev 2010-02-05 21:52:04.0 +0200
+++ mpt_raid.c  2010-02-26 14:14:30.0 +0200
@@ -690,7 +690,6 @@ mpt_raid_thread(void *arg)
 
if (mpt->raid_rescan != 0) {
union ccb *ccb;
-   struct cam_path *path;
int error;
 
mpt->raid_rescan = 0;
@@ -699,7 +698,7 @@ mpt_raid_thread(void *arg)
ccb = xpt_alloc_ccb();
 
MPT_LOCK(mpt);
-   error = xpt_create_path(&path, xpt_periph,
+   error = xpt_create_path(&ccb->ccb_h.path, xpt_periph,
cam_sim_path(mpt->phydisk_sim),
CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD);
if (error != CAM_REQ_CMP) {

Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 04:03:39 -0800 Jeremy Chadwick
 wrote about Re: em0 freezes on ZFS server:

JC> Note how close the "current" value is to that of "total".  I'm not too
JC> surprised you're seeing what you are as a result of this.  What on
JC> earth is this machine doing at all times?

Well, speaking for my machine: serving some nfs dirs from zfs, doing some
file transfers via rsync/scp, serving some web pages (gitweb, redmine).
Really nothing spectacular. I just updated from 7.2 to 8-stable yesterday
and did not have that problem before. From my last email to now (about 15
minutes) mbuf clusters have increased from 15k to 18k. All my other
machines (even another one with 8-stable, but without nfs-services and
without em nics) have only a few k of buffers in use.
Is there any way I could find out what is actually using these buffers?


cu
  Gerrit



Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 13:31:38 +0100 Gerrit Kühn
 wrote about Re: em0 freezes on ZFS server:

GK> JC> Note how close the "current" value is to that of "total".  I'm not
GK> JC> too surprised you're seeing what you are as a result of this.
GK> JC> What on earth is this machine doing at all times?

GK> Is there any way I could find out what is actually using these buffers?

Sorry for replying to my own email:
At least in my case I found out what is eating the buffers: nfsd does!
The buffers stop increasing as soon as I stop nfsd. However, they start
increasing as soon as I start nfsd again.
Are there any ideas how to fix this? Downgrading back to 7-stable is not
really an easy task as far as I know, and I need the server to run without
having to reboot it once or twice a day...


cu
  Gerrit


Re: em0 freezes on ZFS server

2010-02-26 Thread Willem Jan Withagen

On 26-2-2010 13:44, Gerrit Kühn wrote:

> On Fri, 26 Feb 2010 13:31:38 +0100 Gerrit Kühn
> wrote about Re: em0 freezes on ZFS server:
>
> GK> JC> Note how close the "current" value is to that of "total".  I'm not
> GK> JC> too surprised you're seeing what you are as a result of this.
> GK> JC> What on earth is this machine doing at all times?
>
> GK> Is there any way I could find out what is actually using these buffers?
>
> Sorry for replying to my own email:
> At least in my case I found out what is eating the buffers: nfsd does!
> The buffers stop increasing as soon as I stop nfsd. However, they start
> increasing as soon as I start nfsd again.
> Are there any ideas how to fix this? Downgrading back to 7-stable is not
> really an easy task as far as I know, and I need the server to run without
> having to reboot it once or twice a day...


Mine went up something like 200 mbufs, so that is not that significant.

Let alone that I prefer not to do so, since the whole box is on ZFS.

--WjW



Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 15:04:37 +0200 Daniel Braniss 
wrote about Re: em0 freezes on ZFS server :

DB> > At least in my case I found out what is eating the buffers: nfsd
DB> > does! The buffers stop increasing as soon as I stop nfsd. However,
DB> > they start increasing as soon as I start nfsd again.
DB> > Are there any ideas how to fix this? Downgrading back to 7-stable is
DB> > not really an easy task as far as I know, and I need the server to
DB> > run without having to reboot it once or twice a day...

DB> I want to add some spices to this stew: :-)

You're welcome. :-)

DB> A few days later it hung, and it's now hanging every few days.
DB> Most of the hangs are because there is no network, but the NIC is bce
DB> not em! I doubled kern.ipc.nmbclusters and let's see what happens ...

Do you have nfsd running and serving clients? If so, we should maybe
change the topic to something like "possible nfs mbuf leakage"...

DB> 23066/6634/29700 mbufs in use (current/cache/total)

My server is at 22k now, and the buffer number is still increasing every
few seconds...
Can you monitor your mbuf usage and report if it grows?


cu
  Gerrit


Re: em0 freezes on ZFS server

2010-02-26 Thread Daniel Braniss
> On Fri, 26 Feb 2010 13:31:38 +0100 Gerrit Kühn
>  wrote about Re: em0 freezes on ZFS server:
> 
> GK> JC> Note how close the "current" value is to that of "total".  I'm not
> GK> JC> too surprised you're seeing what you are as a result of this.
> GK> JC> What on earth is this machine doing at all times?
> 
> GK> Is there any way I could find out what is actually using these buffers?
> 
> Sorry for replying to my own email:
> At least in my case I found out what is eating the buffers: nfsd does!
> The buffers stop increasing as soon as I stop nfsd. However, they start
> increasing as soon as I start nfsd again.
> Are there any ideas how to fix this? Downgrading back to 7-stable is not
> really an easy task as far as I know, and I need the server to run without
> having to reboot it once or twice a day...

I want to add some spices to this stew: :-)
I have this big server (> 10 TB) which was running pretty much without major
problems, till one morning it started panicking because of some
'ZFS * credential *'.
Since this server is used by many and uptime is a priority,
I upgraded it to 8-stable; the panic went away, one problem solved.

A few days later it hung, and it's now hanging every few days.
Most of the hangs are because there is no network, but the NIC is bce, not em!
I doubled kern.ipc.nmbclusters; let's see what happens ...

netstat -m:
23066/6634/29700 mbufs in use (current/cache/total)
22072/5942/28014/51200 mbuf clusters in use (current/cache/total/max)
22021/2939 mbuf+clusters out of packet secondary zone in use (current/cache)
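
For anyone who wants to track these counters over time, here is a minimal
sketch that parses lines of the form shown above. It assumes the
"values description (field/names)" layout seen in this message; the function
names and the regular expression are illustrative and not part of any FreeBSD
tool.

```python
import re

# Sample text copied from the netstat -m output quoted in this message.
SAMPLE = """\
23066/6634/29700 mbufs in use (current/cache/total)
22072/5942/28014/51200 mbuf clusters in use (current/cache/total/max)
22021/2939 mbuf+clusters out of packet secondary zone in use (current/cache)
"""

# Assumed layout: slash-separated integers, a description, then the
# slash-separated field names in parentheses.
LINE_RE = re.compile(r"^([\d/]+) (.+?) \(([\w/]+)\)$")


def parse_netstat_m(text):
    """Map each counter description to a {field_name: value} dict."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore lines that use a different layout
        values = [int(v) for v in match.group(1).split("/")]
        names = match.group(3).split("/")
        if len(values) != len(names):
            continue  # field names do not line up with the values
        result[match.group(2)] = dict(zip(names, values))
    return result


def cluster_usage(stats):
    """Fraction of the mbuf cluster limit that is currently allocated."""
    clusters = stats["mbuf clusters in use"]
    return clusters["total"] / clusters["max"]


stats = parse_netstat_m(SAMPLE)
```

Feeding the real output of netstat -m into parse_netstat_m() at regular
intervals gives a series that can be logged or plotted.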

hope this helps in finding a cure,
danny




Re: Panic on 8-STABLE in mpt(4) on a DELL PowerEdge R300

2010-02-26 Thread Lorenzo Perone

COOL! THANKS a LOT Alexander!

Can't believe it. You post a panic at 11pm and get a patch at 1pm next 
day...? You must be crazy! ;)


Works for me. I patched against 8/stable. I'll be testing the machine a 
bit more. But for now, no panics!


I guess the patch should be committed soon, also because the panic 
happens quite early in boot; without a RAC/iLO you're locked out pretty 
fast, and mpt is used on many DELL/HP setups.


while true ; do echo Thank You ; done

Lorenzo

On 26.02.10 13:25, Alexander Motin wrote:

> John J. Rushford wrote:
>
> > I'm running into the same problem, mpt(4) panic on FreeBSD 8-STABLE.
> >
> > I'm running FreeBSD 8.0-STABLE, the current kernel was cvsup'd and built
> > @ January 14th, 2010.  I cvsup'd tonight, 2/25/2010, and built a new
> > kernel.  Attached is the panic when I tried to boot into single user
> > mode, I was able to boot up on the old kernel built on January 14th.
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address= 0x10
> > fault code= supervisor read data, page not present
> > instruction pointer= 0x20:0x8019c4bd
> > stack pointer= 0x28:0xff80e81d5ba0
> > frame pointer= 0x28:0xff80e81d5bd0
> > code segment= base 0x0, limit 0xf, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags= interrupt enabled, resume, IOPL = 0
> > current process= 6 (mpt_raid0)
> > trap number= 12
> > panic: page fault
>
> Attached patch should fix the problem.






Sysinstall does not define SATA

2010-02-26 Thread oleg

Have a nice time!

Got some trouble. The Sysinstall program of FreeBSD 8.0 release does 
not detect SATA hard drives. Can't create slices.

But that machine works correctly with Windows.
See attached dmesg file.

Please let me know, what must I do?
Many thanks for the help.


dmesg
Description: Binary data

Re: em0 freezes on ZFS server

2010-02-26 Thread Daniel Braniss
> On Fri, 26 Feb 2010 15:04:37 +0200 Daniel Braniss 
> wrote about Re: em0 freezes on ZFS server :
> 
> DB> > At least in my case I found out what is eating the buffers: nfsd
> DB> > does! The buffers stop increasing as soon as I stop nfsd. However,
> DB> > they start increasing as soon as I start nfsd again.
> DB> > Are there any ideas how to fix this? Downgrading back to 7-stable is
> DB> > not really an easy task as far as I know, and I need the server to
> DB> > run without having to reboot it once or twice a day...
> 
> DB> I want to add some spices to this stew: :-)
> 
> You're welcome. :-)
> 
> DB> A few days later it hung, and it's now hanging every few days.
> DB> Most of the hangs are because there is no network, but the NIC is bce
> DB> not em! I doubled kern.ipc.nmbclusters and let's see what happens ...
> 
> Do you have nfsd running and serving clients? If so, we should maybe
> change the topic to something like "possible nfs mbuf leakage"...
> 
its only purpose in life is to be an nfs server.
but I wouldn't exclude zfs from the equation yet.
I have other nfs servers, not doing zfs, and I don't see this.

> DB> 23066/6634/29700 mbufs in use (current/cache/total)
> 
> My server is at 22k now, and the buffer number is still increasing every
> few seconds...
> Can you monitor your mbuf usage and report if it grows?
> 
I am, and in the last 2 hours it grew by about 300. It does oscillate, i.e. it
grows some, then it goes down, but it seems that the low always increases.

When I have enough data I'll plot it.
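
One way to check the "low always increases" observation numerically is to
compare the minimum of successive sampling windows. This is only a sketch:
the sample values and the window size below are made up for illustration,
and requiring a strict increase in every window is a deliberately simple
criterion.

```python
def window_minima(samples, window):
    """Return the minimum of each consecutive, non-overlapping window."""
    return [min(samples[i:i + window])
            for i in range(0, len(samples) - window + 1, window)]


def low_is_rising(samples, window):
    """True if every window's minimum is higher than the previous one."""
    minima = window_minima(samples, window)
    return len(minima) >= 2 and all(b > a for a, b in zip(minima, minima[1:]))


# Hypothetical mbuf counts: oscillating, but with a creeping floor.
leaky = [22000, 22300, 22100, 22150, 22400, 22200, 22250, 22500, 22300]
# Hypothetical counts that oscillate around a stable floor.
steady = [100, 300, 100, 300, 100, 300]
```

A rising floor across many windows points towards a leak, while plain
oscillation with a flat floor looks like normal cache behaviour.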

Cheers,
danny




Re: Sysinstall does not define SATA

2010-02-26 Thread Jeremy Chadwick
On Fri, Feb 26, 2010 at 05:21:52PM +0300, oleg wrote:
> Got some trouble. The Sysinstall program of FreeBSD 8.0 release does
> not detect SATA hard drives. Can't create slices.
> But that machine works correctly with Windows.
> See attached dmesg file.

The hard disks are seen by the system as classic PATA disks, operating
in PIO4 mode, which probably indicates your BIOS is set to run the
controller in "Emulation" mode.  sysinstall should, I would think, see
these disks since the kernel does.

atapci1:  port 
0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 
0xf9f76000-0xf9f77fff irq 9 at device 9.0 on pci0
ata2:  on atapci1
ata3:  on atapci1
ad4: 305245MB  at ata2-master PIO4
ad6: 476940MB  at ata3-master PIO4

I've never seen the vendor string "GENERIC ATA controller" before.  What
exact motherboard or SATA controller card is this?  Can you provide a
link to it?

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: Sysinstall does not define SATA

2010-02-26 Thread Thomas Ronner

Hello,

On 26 Feb 2010, at 15:21, oleg wrote:


> Got some trouble. The Sysinstall program of FreeBSD 8.0 release does
> not detect SATA hard drives. Can't create slices.
>
> But that machine works correctly with Windows.
> See attached dmesg file.


The dmesg you attached is from FreeBSD 6.1, not 8.0. It probably lacks  
the driver for your SATA chipset.




Regards,
Thomas


Re: Sysinstall does not define SATA

2010-02-26 Thread Jeremy Chadwick
On Fri, Feb 26, 2010 at 07:19:25AM -0800, Jeremy Chadwick wrote:
> On Fri, Feb 26, 2010 at 05:21:52PM +0300, oleg wrote:
> > Got some trouble. The Sysinstall program of FreeBSD 8.0 release does
> > not detect SATA hard drives. Can't create slices.
> > But that machine works correctly with Windows.
> > See attached dmesg file.
> 
> The hard disks are seen by the system as classic PATA disks, operating
> in PIO4 mode, which probably indicates your BIOS is set to run the
> controller in "Emulation" mode.  sysinstall should, I would think, see
> these disks since the kernel does.
> 
> atapci1:  port 
> 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 
> 0xf9f76000-0xf9f77fff irq 9 at device 9.0 on pci0
> ata2:  on atapci1
> ata3:  on atapci1
> ad4: 305245MB  at ata2-master PIO4
> ad6: 476940MB  at ata3-master PIO4
> 
> I've never seen the vendor string "GENERIC ATA controller" before.  What
> exact motherboard or SATA controller card is this?  Can you provide a
> link to it?

Note to others on the list: mail to the OP results in an autoresponder
(not an SMTP-level bounce) stating the following:

> From: oleg 
> To: Jeremy Chadwick 
> Date: Fri, 26 Feb 2010 18:19:29 +0300
> Subject: Re: Sysinstall does not define SATA
> 
> This mail box closed

So I'm not sure the OP is reachable or will see our responses.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: em0 freezes on ZFS server

2010-02-26 Thread Willem Jan Withagen

On 26-2-2010 16:07, Daniel Braniss wrote:

> > On Fri, 26 Feb 2010 15:04:37 +0200 Daniel Braniss
> > wrote about Re: em0 freezes on ZFS server :
> >
> > DB> > At least in my case I found out what is eating the buffers: nfsd
> > DB> > does! The buffers stop increasing as soon as I stop nfsd. However,
> > DB> > they start increasing as soon as I start nfsd again.
> > DB> > Are there any ideas how to fix this? Downgrading back to 7-stable is
> > DB> > not really an easy task as far as I know, and I need the server to
> > DB> > run without having to reboot it once or twice a day...
> >
> > DB> I want to add some spices to this stew: :-)
> >
> > You're welcome. :-)
> >
> > DB> A few days later it hung, and it's now hanging every few days.
> > DB> Most of the hangs are because there is no network, but the NIC is bce
> > DB> not em! I doubled kern.ipc.nmbclusters and let's see what happens ...
> >
> > Do you have nfsd running and serving clients? If so, we should maybe
> > change the topic to something like "possible nfs mbuf leakage"...
>
> its only purpose in life is to be an nfs server.
> but I wouldn't exclude zfs from the equation yet.
> I have other nfs servers, not doing zfs, and I don't see this.
>
> > DB> 23066/6634/29700 mbufs in use (current/cache/total)
> >
> > My server is at 22k now, and the buffer number is still increasing every
> > few seconds...
> > Can you monitor your mbuf usage and report if it grows?
>
> I am, and in the last 2 hours it grew by about 300. It does oscillate, i.e. it
> grows some, then it goes down, but it seems that the low always increases.
>
> when I have enough data I'll plot it.


Same here, from the odd samples I make.

If I don't run my compiles, but just let the server sit,
it just oscillates +- 500 mbufs.

Once I start compiling, mbufs start growing. I'm now at 22K.

Main network (NFS) traffic is then
postfix writing Maildir
dovecot - Maildir work

Next to that it is also a Samba server, but that doesn't seem to matter that
much.


--WjW


Re: em0 freezes on ZFS server

2010-02-26 Thread Daniel Braniss

> when I have enough data i'll plot it.
> 
check:
ftp://ftp.cs.huji.ac.il/users/danny/freebsd/plot.ps
x is seconds, y is mbufs current.

> Cheers,
>   danny
> 
> 
> 




Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 17:07:13 +0200 Daniel Braniss 
wrote about Re: em0 freezes on ZFS server :

DB> its only purpose in life is to be an nfs server.

I thought so, but you did not mention it explicitly.

DB> but I wouldn't exclude zfs from the equation yet.
DB> I have other nfs servers, not doing zfs, and I don't see this.

My machine has zfs, too. I do not have 8-stable with nfs on ufs, so I
cannot crosscheck that.

DB> > My server is at 22k now, and the buffer number is still increasing
DB> > every few seconds...
DB> > Can you monitor your mbuf usage and report if it grows?

DB> I am, and in the last 2 hours it grew by about 300. It does oscillate,
DB> i.e. it grows some, then it goes down, but it seems that the low
DB> always increases.

Mine is at 36k now:

36797/3403/40200 mbufs in use (current/cache/total)
35772/1202/36974/65000 mbuf clusters in use (current/cache/total/max)
35772/836 mbuf+clusters out of packet secondary zone in use (current/cache)

DB> when I have enough data i'll plot it.

I think I'll reboot my machine now and hope that it lives as long as
possible into the weekend. Although at the present rate it will not
survive 24h. :-(
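
As a rough back-of-the-envelope check of that estimate, here is a small
sketch. It assumes linear growth and borrows the rate from an earlier message
in this thread (mbuf clusters going from about 15k to 18k in about 15
minutes); the observed rate clearly varies, so the result is only an order of
magnitude. The function name is illustrative.

```python
def minutes_until_exhausted(current, maximum, rate_per_minute):
    """Minutes until `current` reaches `maximum` at a constant growth rate.

    Returns None when the counter is not growing.
    """
    if rate_per_minute <= 0:
        return None
    return (maximum - current) / rate_per_minute


# Assumed rate: roughly 15k -> 18k clusters in roughly 15 minutes.
rate = (18000 - 15000) / 15

# Current and max mbuf clusters taken from the netstat -m output above.
remaining = minutes_until_exhausted(35772, 65000, rate)
```

With these inputs the limit would be reached in a few hours rather than a
day, which fits the expectation above that the machine will not survive 24h.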


cu
  Gerrit


mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS server)

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 17:41:02 +0200 Daniel Braniss 
wrote about Re: em0 freezes on ZFS server :

DB> check:
DB> ftp://ftp.cs.huji.ac.il/users/danny/freebsd/plot.ps
DB> x is seconds, y is mbufs current.

That does not look as bad as mine. I had 37k when I rebooted the machine some
minutes ago (and it's basically idle, just serving a few nfs clients that
don't do much).
But from the values Jeremy has posted and from my own comparisons here I
would think that something like 5k of mbuf clusters would be normal for my
machine (and probably also for yours).

Some more info from my side:
In the meantime I also tried a different network interface. The
nfe-interface that is onboard causes the same problems, so it is probably
not an em-specific issue.
Furthermore I found this via Google:
.
I patched and recompiled my kernel with this, just to try it out. Right
now I have

2264/1321/3585 mbufs in use (current/cache/total)
1239/1017/2256/65000 mbuf clusters in use (current/cache/total/max)
1239/809 mbuf+clusters out of packet secondary zone in use (current/cache)

but the uptime is only 12min so far. In some hours I'll know for certain
if this patch has anything to do with the problem.


cu
  Gerrit


Re: Sysinstall does not define SATA

2010-02-26 Thread oleg

On Fri, 26 Feb 2010 07:19:25 -0800
 Jeremy Chadwick  wrote:

> On Fri, Feb 26, 2010 at 05:21:52PM +0300, oleg wrote:
> > Got some trouble. The Sysinstall program of FreeBSD 8.0 release does
> > not detect SATA hard drives. Can't create slices.
> > But that machine works correctly with Windows.
> > See attached dmesg file.
>
> The hard disks are seen by the system as classic PATA disks, operating
> in PIO4 mode, which probably indicates your BIOS is set to run the
> controller in "Emulation" mode.  sysinstall should, I would think, see
> these disks since the kernel does.
>
> atapci1:  port
> 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem
> 0xf9f76000-0xf9f77fff irq 9 at device 9.0 on pci0
> ata2:  on atapci1
> ata3:  on atapci1
> ad4: 305245MB  at ata2-master PIO4
> ad6: 476940MB  at ata3-master PIO4
>
> I've never seen the vendor string "GENERIC ATA controller" before.  What
> exact motherboard or SATA controller card is this?  Can you provide a
> link to it?
>
> --
> | Jeremy Chadwick   j...@parodius.com |
> | Parodius Networking   http://www.parodius.com/ |
> | UNIX Systems Administrator  Mountain View, CA, USA |
> | Making life hard for others since 1977.  PGP: 4BD6C0CB |





**

Dear Jeremy Chadwick, many thanks for the answer.


My hardware is:

AM2+ Elitegroup GF8100vm-m5 - sata, II-RAID
AMD 64 X 2
DDR2 - 2Gb 800MHz
PCI Ex GeForce 9600 DDR2
SATA II 320 Samsung 321KJ 7200 16mb
SATAII 500 Seagate ST3500418as Barracuda 7200
DVD-RW Sony 5240S SATA

The link to the motherboard is 
http://www.eclipsecomputers.com/product.aspx?code=MBE-GF8100VMM5


I can offer the output of pciconf -lv (see attached), if you are 
interested in helping me. But that listing was obtained with the 
Frenzy sysadmin toolkit, which is based on FreeBSD 6.

My gratitude to you. Yours, Oleg from Russia.


pciconf
Description: Binary data

Panic on 8-STABLE in pfctl with options VIMAGE on a DELL PowerEdge R300 (bge)

2010-02-26 Thread Lorenzo Perone


Hello,

Just encountered a panic when starting pf (/etc/rc.d/pf start) on a 
FreeBSD benjamin 8.0-STABLE


uname -a

FreeBSD 8.0-STABLE #0: Fri Feb 26 18:33:44 UTC 2010 
r...@benjamin:/usr/obj/usr/src/sys/BYTESATWORK_R8_INTEL_DEBUG  amd64


the system is a Dell PowerEdge R300 with bge interfaces, 16 GB RAM 
(dmesg attached).


Panic and trace remote console screenshots:

http://lorenzo.yellowspace.net/R300_pfctl_panic.gif
http://lorenzo.yellowspace.net/R300_pfctl_panic_trace.gif

Excerpt transcript:

panic:

Fatal trap 12: page fault while in kernel mode
current process = 1302
Stopped at pfil_head_get+0x41 movq 0x28(%rcx),%rdx

trace:

pfil_head_get() at pfil_head_get+0x41
pfioctl() at pfioctl+0x3351
devfs_ioctl_f() at devfs_ioctl_f+0x71
kern_ioctl() at kern_ioctl+0xe4
ioctl() at ioctl+0xed
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xe1

While I was just planning to experiment with VIMAGE, and it is not 
required for production (I'm aware of the message of it being 
experimental...), I thought it might be useful to report it. Please send 
me a note if I should file a pr.


The panic does not occur with the same kernel compiled without options 
VIMAGE.


Note that the dmesg is from the system booted with the kernel without 
VIMAGE, that's why it doesn't contain the warning.


big Regards to all the team,

Lorenzo

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #0: Fri Feb 26 18:33:44 UTC 2010
r...@benjamin:/usr/obj/usr/src/sys/BYTESATWORK_R8_INTEL_NO_VIMAGE_DEBUG 
amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU   X3363  @ 2.83GHz (2833.34-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x1067a  Stepping = 10
  Features=0xbfebfbff
  Features2=0x40ce3bd
  AMD Features=0x20100800
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 17179869184 (16384 MB)
avail memory = 16542048256 (15775 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
ioapic0: Changing APIC ID to 4
ioapic0  irqs 0-23 on motherboard
kbd1 at kbdmux0
cryptosoft0:  on motherboard
acpi0:  on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_hpet0:  iomem 0xfed0-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  at device 2.0 on pci0
pci3:  on pcib1
pcib2:  at device 3.0 on pci0
pci4:  on pcib2
pcib3:  at device 4.0 on pci0
pci5:  on pcib3
mpt0:  port 0xec00-0xecff mem 
0xdfcec000-0xdfce,0xdfcf-0xdfcf irq 16 at device 0.0 on pci5
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.18.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 1 Active Volume (2 Max)
mpt0: 2 Hidden Drive Members (14 Max)
pcib4:  at device 5.0 on pci0
pci6:  on pcib4
pcib5:  at device 6.0 on pci0
pci7:  on pcib5
pcib6:  at device 7.0 on pci0
pci8:  on pcib6
pcib7:  irq 16 at device 28.0 on pci0
pci9:  on pcib7
pcib8:  irq 16 at device 28.4 on pci0
pci1:  on pcib8
bge0:  mem 
0xdfdf-0xdfdf irq 16 at device 0.0 on pci1
miibus0:  on bge0
brgphy0:  PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
bge0: Ethernet address: 00:26:b9:50:03:3e
bge0: [FILTER]
pcib9:  irq 17 at device 28.5 on pci0
pci2:  on pcib9
bge1:  mem 
0xdfef-0xdfef irq 17 at device 0.0 on pci2
miibus1:  on bge1
brgphy1:  PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
bge1: Ethernet address: 00:26:b9:50:03:3f
bge1: [FILTER]
uhci0:  port 0xcc80-0xcc9f irq 21 at device 
29.0 on pci0
uhci0: [ITHREAD]
usbus0:  on uhci0
uhci1:  port 0xcca0-0xccbf irq 20 at device 
29.1 on pci0
uhci1: [ITHREAD]
usbus1:  on uhci1
uhci2:  port 0xccc0-0xccdf irq 21 at device 
29.2 on pci0
uhci2: [ITHREAD]
usbus2:  on uhci2
ehci0:  mem 0xdfaffc00-0xdfaf irq 
21 at device 29.7 on pci0
ehci0: [ITHREAD]
usbus3: EHCI version 1.0
usbus3:  on ehci0
pcib10:  at device 30.0 on pci0
pci10:  on pcib10
vgapci0:  port 0xdc00-0xdcff mem 
0xd000-0xd7ff,0xdfff-0xdfff irq 19 at device 7.0 on pci10
isab0:  at device 31.0 on pci0
isa0:  on isab0
atapci0:  port 
0xcc20-0xcc27,0xcc10-0xcc13,0xcc28-0xcc2f,0xcc14-0xcc17,0xcc40-0xcc4f,0xcc50-0xcc5f
 irq 23 at device 31.2 on pci0
atapci0: [ITHREAD]
ata2:  on atapci0
ata2: [ITHREAD]
ata3:  on atapci0
ata3: [ITHREAD]
atapci1:  port 
0xcc30-0xcc37,0xcc18-0xcc1b,0xcc38-0xcc3f,0xcc1c-0xcc1f,0xcc60-0xcc6f,0xcc70-0xcc7f
 irq 22 at device 31.5 on pci0
atapci1: [ITHREAD]
ata4:  on atapci1
ata4: [ITH

Re: Panic on 8-STABLE in pfctl with options VIMAGE on a DELL PowerEdge R300 (bge)

2010-02-26 Thread Bjoern A. Zeeb

On Fri, 26 Feb 2010, Lorenzo Perone wrote:

Hi,

While I was just planning to experiment with VIMAGE, and it is not required 
for production (I'm aware of the message of it being experimental...), I 
thought it might be useful to report it. Please send me a note if I should 
file a pr.


The panic does not occur with the same kernel compiled without options 
VIMAGE.


FAQ from virtualization@ ; pf support for VIMAGE only basically exists
here:  http://svn.freebsd.org/base/user/eri/pf45/head/
but is not fully ready either.

/bz

--
Bjoern A. Zeeb It will not break if you know what you are doing.


ipfw & natd with recent MFC of firewall_coscripts functionality

2010-02-26 Thread Bob Willcox
I just updated my gateway machine to 7.3-PRERELEASE and immediately noticed
that natd no longer started (hard to miss, no outside network access).

It looks like the MFC of the firewall_coscripts function may be the cause
(cvs rev 1.15.2.3 to /usr/src/etc/rc.d/ipfw). These changes add the two lines
(along with other stuff):

...
   ${_coscript} quietstart
...
   ${_coscript} quietstop
...

I believe the problem is that neither "quietstart" nor "quietstop" is
recognized as a valid argument by /etc/rc.d/natd, so natd isn't started.
Further, my hunch is that removing the "quiet" prefix will make it work (I'm
reluctant to try this at the moment as I am remote).
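
To illustrate the hypothesis (and only the hypothesis), here is a small
sketch of prefix handling for rc-style arguments. It is not the rc.subr
implementation; the command set and the prefix lists are assumptions chosen
to show how "quietstart" could be rejected when "quiet" is not among the
known prefixes.

```python
def split_rc_arg(arg, commands, prefixes):
    """Split e.g. 'quietstart' into ('quiet', 'start').

    Returns None when the argument is neither a plain command nor a
    known prefix followed by a command.
    """
    if arg in commands:
        return ("", arg)
    for prefix in prefixes:
        rest = arg[len(prefix):]
        if arg.startswith(prefix) and rest in commands:
            return (prefix, rest)
    return None


# Hypothetical command and prefix sets, for illustration only.
COMMANDS = {"start", "stop", "restart", "status"}
WITHOUT_QUIET = ("fast", "force", "one")
WITH_QUIET = ("fast", "force", "one", "quiet")
```

With the first prefix list "quietstart" is rejected, with the second it
resolves to a plain "start", which matches the behaviour described above.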

Bob

-- 
Bob Willcox The shifts of Fortune test the reliability of friends.
b...@immure.com-- Marcus Tullius Cicero
Austin, TX


Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS server)

2010-02-26 Thread Daniel Braniss
> On Fri, 26 Feb 2010 17:41:02 +0200 Daniel Braniss 
> wrote about Re: em0 freezes on ZFS server :
> 
> DB> check:
> DB>   ftp://ftp.cs.huji.ac.il/users/danny/freebsd/plot.ps
> DB> x is seconds, y is mbufs current.
> 
> That does not look as bad as mine. I had 37k when I rebooted the machine some
> minutes ago (and it's basically idle, just serving a few nfs clients that
> don't do much).
> But from the values Jeremy has posted and from my own comparisons here I
> would think that something like 5k of mbuf clusters would be normal for my
> machine (and probably also for yours).
> 
> Some more info from my side:
> In the meantime I also tried a different network interface. The
> nfe-interface that is onboard causes the same problems, so it is probably
> not an em-specific issue.
> Furthermore I found this via Google:
> .
I'll have to do some packet snooping to check if it's TCP or UDP nfs traffic,
since some of the clients are Linux ...

> I patched and recompiled my kernel with this, just to try it out. Right
> now I have
> 
> 2264/1321/3585 mbufs in use (current/cache/total)
> 1239/1017/2256/65000 mbuf clusters in use (current/cache/total/max)
> 1239/809 mbuf+clusters out of packet secondary zone in use (current/cache)
> 
> but the uptime is only 12min so far. In some hours I'll know for certain
> if this patch has anything to do with the problem.

at the moment there is not much activity, but if you check the latest plot.ps
you will see that the bottom is slowly increasing, so my bet is that there must
be some leakage!

cheers
danny




Re: Panic on 8-STABLE in pfctl with options VIMAGE on a DELL PowerEdge R300 (bge)

2010-02-26 Thread Onur Bektas

Hi,

I had the same problem. I have updated the source tree and recompiled the 
kernel, but the problem is still unresolved.

The pf + vimage = "kernel crash" problem was also reported as a bug:

http://www.freebsd.org/cgi/query-pr.cgi?pr=143808&cat=

Regards,

Onur.


On 2/26/2010 8:18 PM, Lorenzo Perone wrote:


Hello,

Just encountered a panic when starting pf (/etc/rc.d/pf start) on a 
FreeBSD benjamin 8.0-STABLE


uname -a

FreeBSD 8.0-STABLE #0: Fri Feb 26 18:33:44 UTC 2010 
r...@benjamin:/usr/obj/usr/src/sys/BYTESATWORK_R8_INTEL_DEBUG  amd64


the system is a Dell PowerEdge R300 with bge interfaces, 16 GB RAM 
(dmesg attached).


Panic and trace remote console screenshots:

http://lorenzo.yellowspace.net/R300_pfctl_panic.gif
http://lorenzo.yellowspace.net/R300_pfctl_panic_trace.gif

Excerpt transcript:

panic:

Fatal trap 12: page fault while in kernel mode
current process = 1302
Stopped at pfil_head_get+0x41 movq 0x28(%rcx),%rdx

trace:

pfil_head_get() at pfil_head_get+0x41
pfioctl() at pfioctl+0x3351
devfs_ioctl_f() at devfs_ioctl_f+0x71
kern_ioctl() at kern_ioctl+0xe4
ioctl() at ioctl+0xed
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xe1

While I was just planning to experiment with VIMAGE, and it is not 
required for production (I'm aware of the message of it being 
experimental...), I thought it might be useful to report it. Please 
send me a note if I should file a pr.


The panic does not occur with the same kernel compiled without options 
VIMAGE.


Note that the dmesg is from the system booted with the kernel without 
VIMAGE, that's why it doesn't contain the warning.


big Regards to all the team,

Lorenzo



___
freebsd-j...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-jail
To unsubscribe, send any mail to "freebsd-jail-unsubscr...@freebsd.org"



Onur BEKTAS
Sistem Yöneticisi / System Administrator
TÜBITAK ULAKBIM



Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS server)

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 22:09:32 +0200 Daniel Braniss 
wrote about Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS
server) :

DB> > Furthermore I found this via Google:
DB> > 
.

This did not help, I still see the same problem.

DB> I'll have to do some packet snooping to check if it's TCP or UDP nfs
DB> traffic, since some of the clients are Linux ...

I have Linux clients, too. Some use tcp, some udp.

DB> > 2264/1321/3585 mbufs in use (current/cache/total)
DB> > 1239/1017/2256/65000 mbuf clusters in use (current/cache/total/max)
DB> > 1239/809 mbuf+clusters out of packet secondary zone in use
DB> > (current/cache)

DB> > but the uptime is only 12min so far. In some hours I'll know for
DB> > certain if this patch has anything to do with the problem.

It did not help. In the meantime the values read

20555/1465/22020 mbufs in use (current/cache/total)
19529/1029/20558/65000 mbuf clusters in use (current/cache/total/max)
19529/823 mbuf+clusters out of packet secondary zone in use (current/cache)


I created a little graph here:
.

The y-axis shows the total mbuf clusters, the x-axis the time in minutes. The
flat part in the upper right corner is a 10-minute interval when I had stopped
nfsd.
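
For anyone who wants to track these counters over time, here is a minimal sketch of turning the `netstat -m` lines quoted in this thread into numbers. The function name `parse_netstat_m` and the regular expression are illustrative assumptions, not part of any FreeBSD tool, and only lines of the form "a/b/c description (x/y/z)" are handled. The two samples are the values posted above, with wrapped lines rejoined.

```python
import re

# Matches e.g. "19529/1029/20558/65000 mbuf clusters in use (current/cache/total/max)".
# Lines whose counters or field names do not fit this shape are skipped.
LINE_RE = re.compile(r"^\s*(\d+(?:/\d+)+)\s+(.+?)\s+\(([\w/]+)\)\s*$")


def parse_netstat_m(text):
    """Return {description: {field_name: value}} for each matching line."""
    stats = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        values = [int(v) for v in m.group(1).split("/")]
        fields = m.group(3).split("/")
        if len(values) != len(fields):
            continue  # counter count and label count disagree; ignore the line
        stats[m.group(2)] = dict(zip(fields, values))
    return stats


# Sample taken at 12 minutes of uptime (from the thread).
EARLY = """\
2264/1321/3585 mbufs in use (current/cache/total)
1239/1017/2256/65000 mbuf clusters in use (current/cache/total/max)
1239/809 mbuf+clusters out of packet secondary zone in use (current/cache)
"""

# Later sample (from the thread).
LATER = """\
20555/1465/22020 mbufs in use (current/cache/total)
19529/1029/20558/65000 mbuf clusters in use (current/cache/total/max)
19529/823 mbuf+clusters out of packet secondary zone in use (current/cache)
"""

early = parse_netstat_m(EARLY)
later = parse_netstat_m(LATER)
key = "mbuf clusters in use"
growth = later[key]["current"] - early[key]["current"]
headroom = later[key]["max"] - later[key]["current"]
print(f"clusters grew by {growth}, {headroom} left before the limit")
```

Note that the cache figures barely move between the two samples while the current figures climb, which is the pattern being described as a leak.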

DB> at the moment there is not much activity, but if you check the latest
DB> plot.ps you will see that the bottom is slowly increasing, so my bet
DB> is that there must be some leakage!

There certainly is. I wonder when this came in and why it has gone
unnoticed so far. Probably not all people serving nfs from zfs see this,
or this would have popped up earlier. Maybe the Linux clients are somehow
triggering the issue? Or did it start with the import of zvol version 14?
Unfortunately I have upgraded my pool, so I cannot easily go back to 8-REL
to test this (otoh, I need a stable server quite urgently).


cu
  Gerrit


Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS server)

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 22:09:32 +0200 Daniel Braniss 
wrote about Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS
server) :

DB> at the moment there is not much activity, but if you check the latest
DB> plot.ps you will see that the bottom is slowly increasing, so my bet
DB> is that there must be some leakage!

BTW: I filed a PR for this:



cu
  Gerrit


Re: mbuf leakage with nfs/zfs?

2010-02-26 Thread Willem Jan Withagen

On 26-2-2010 22:43, Gerrit Kühn wrote:

DB>  I'll have to do some packet snooping to check if it's TCP or UDP nfs
DB>  traffic, since some of the clients are Linux ...

I have Linux clients, too. Some use tcp, some udp.


I have Linux and FreeBSD clients running. The build system runs on 
Linux. All Linux's are UDP


I also connected the build machine to the old 7.2/amd64/bge0/ufs machine,
but there the count doesn't go over a few thousand mbufs.



It did not help. In the meantime the values read

20555/1465/22020 mbufs in use (current/cache/total)
19529/1029/20558/65000 mbuf clusters in use (current/cache/total/max)
19529/823 mbuf+clusters out of packet secondary zone in use (current/cache)


Mine are now:
41533/2402/43935 mbufs in use (current/cache/total)
41454/1572/43026/262144 mbuf clusters in use (current/cache/total/max)
39241/823 mbuf+clusters out of packet secondary zone in use (current/cache)


There certainly is. I wonder when this came in and why it has gone
unnoticed so far. Probably not all people serving nfs from zfs see this,
or this would have popped up earlier. Maybe the Linux clients are somehow
triggering the issue? Or did it start with the import of zvol version 14?
Unfortunately I have upgraded my pool, so I cannot easily go back to 8-REL
to test this (otoh, I need a stable server quite urgently).


', I did set the zvol version this morning also to 14 but I think 
that I ran into trouble already when still running version 13.


And the server has been used as storage for the build system for the last 2
weeks, up until yesterday without much trouble.


--WjW


Re: if_bge upload stalls repeatedly (Was: 8-STABLE outgoing scp stalling frequently)

2010-02-26 Thread Pyun YongHyeon
On Fri, Feb 05, 2010 at 10:31:37AM +1300, Jonathan Chen wrote:
> On Thu, Feb 04, 2010 at 11:23:15AM -0800, Pyun YongHyeon wrote:
> > On Thu, Feb 04, 2010 at 03:00:15PM +1300, Jonathan Chen wrote:
> > > On Wed, Feb 03, 2010 at 05:25:03PM -0800, Pyun YongHyeon wrote:
> [...]
> > > > I'm not sure but recently added code to support TSO may cause the
> > > > issue. Would you show me verbose boot output(only bge(4) related
> > > > one)?
> > > 
> > > bge0:  > > 0x00a002> mem 0xf1bf-0xf1bf irq 17 at device 0.0 on pci9
> > > bge0: Reserved 0x1 bytes for rid 0x10 type 3 at 0xf1bf
> > > bge0: adjust device control 0x2000 -> 0x5000
> > > bge0: attempting to allocate 1 MSI vectors (1 supported)
> > > bge0: using IRQ 258 for MSI
> > > bge0: CHIP ID 0xa002; ASIC REV 0x0a; CHIP REV 0xa0; PCI-E
> > > bge0: Disabling fastboot
> > > bge0: Disabling fastboot
> > > miibus0:  on bge0
> > > bge0: bpf attached
> > > bge0: Ethernet address: 00:1d:09:d2:d1:9e
> > > bge0: [MPSAFE]
> > > bge0: [FILTER]
> > > bge0: Disabling fastboot
> > > bge0: Disabling fastboot
> > > bge0: link UP
> > > 
> > > >To rule out possible TSO issue, disable TSO and try it
> > > > again(#ifconfig bge0 -tso). Does it make any difference?
> > > 
> > > Yup, it sure does! With a TSO disabled, my upload and download speeds
> > > are pretty much symmetrical at a decent 10MB/s.
> > > 
> > 
> > Hmm, that means TSO was broken on your controller. Because BCM5755
> > or newer controllers have no known TSO issues I don't know why the
> > controller fails on TSO. Very recent controllers use new TSO format
> > but I don't think your controller is one of them and FreeBSD has no
> > support for these controllers anyway.
> > Would you show me the output of "pciconf -lcv" of your bge(4)
> > controller?
> 
> b...@pci0:9:0:0:class=0x02 card=0x01fe1028 chip=0x167314e4 
> rev=0x02 hdr=0x00
> vendor = 'Broadcom Corporation'
> device = 'NetXtreme BCM5755M Gigabit Ethernet PCIe'
> class  = network
> subclass   = ethernet
> cap 01[48] = powerspec 3  supports D0 D3  current D0
> cap 03[50] = VPD
> cap 09[58] = vendor (length 120)
> cap 05[e8] = MSI supports 1 message, 64 bit enabled with 1 message
> cap 10[d0] = PCI-Express 1 endpoint max data 128(128) link x1(x1)
> 
> This is on a Dell Latitude D830 Laptop.
> 

I committed a fix which disables TSO on BCM5755M. Still have no
idea why it fails though.
Thanks for reporting!
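
As a side note on reading the pciconf output quoted above: the `chip=` and `card=` values each pack two 16-bit IDs into one 32-bit number, with the device ID in the upper half and the vendor ID in the lower half. The sketch below decodes the two values from this thread. The helper name `split_pci_id` is made up for illustration, and nothing here is meant to describe how the committed bge(4) fix identifies the controller.

```python
def split_pci_id(value):
    """Split a 32-bit pciconf ID string into (upper 16 bits, lower 16 bits)."""
    n = int(value, 16)
    return (n >> 16) & 0xFFFF, n & 0xFFFF


# Values copied from the pciconf output in this thread.
device_id, vendor_id = split_pci_id("0x167314e4")        # chip=
subdevice_id, subvendor_id = split_pci_id("0x01fe1028")  # card=

print(f"device=0x{device_id:04x} vendor=0x{vendor_id:04x}")
print(f"subdevice=0x{subdevice_id:04x} subvendor=0x{subvendor_id:04x}")
```

Vendor 0x14e4 is Broadcom and subvendor 0x1028 is Dell, which matches the "Broadcom Corporation" vendor string and the Dell Latitude D830 mentioned in the report.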


[releng_8 tinderbox] failure on powerpc/powerpc

2010-02-26 Thread FreeBSD Tinderbox
TB --- 2010-02-26 21:24:43 - tinderbox 2.6 running on freebsd-current.sentex.ca
TB --- 2010-02-26 21:24:43 - starting RELENG_8 tinderbox run for powerpc/powerpc
TB --- 2010-02-26 21:24:43 - cleaning the object tree
TB --- 2010-02-26 21:25:00 - cvsupping the source tree
TB --- 2010-02-26 21:25:00 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca 
/tinderbox/RELENG_8/powerpc/powerpc/supfile
TB --- 2010-02-26 21:25:29 - building world
TB --- 2010-02-26 21:25:29 - MAKEOBJDIRPREFIX=/obj
TB --- 2010-02-26 21:25:29 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2010-02-26 21:25:29 - TARGET=powerpc
TB --- 2010-02-26 21:25:29 - TARGET_ARCH=powerpc
TB --- 2010-02-26 21:25:29 - TZ=UTC
TB --- 2010-02-26 21:25:29 - __MAKE_CONF=/dev/null
TB --- 2010-02-26 21:25:29 - cd /src
TB --- 2010-02-26 21:25:29 - /usr/bin/make -B buildworld
>>> World build started on Fri Feb 26 21:25:29 UTC 2010
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
>>> World build completed on Fri Feb 26 22:21:53 UTC 2010
TB --- 2010-02-26 22:21:53 - generating LINT kernel config
TB --- 2010-02-26 22:21:53 - cd /src/sys/powerpc/conf
TB --- 2010-02-26 22:21:53 - /usr/bin/make -B LINT
TB --- 2010-02-26 22:21:53 - building LINT kernel
TB --- 2010-02-26 22:21:53 - MAKEOBJDIRPREFIX=/obj
TB --- 2010-02-26 22:21:53 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2010-02-26 22:21:53 - TARGET=powerpc
TB --- 2010-02-26 22:21:53 - TARGET_ARCH=powerpc
TB --- 2010-02-26 22:21:53 - TZ=UTC
TB --- 2010-02-26 22:21:53 - __MAKE_CONF=/dev/null
TB --- 2010-02-26 22:21:53 - cd /src
TB --- 2010-02-26 22:21:53 - /usr/bin/make -B buildkernel KERNCONF=LINT
>>> Kernel build for LINT started on Fri Feb 26 22:21:53 UTC 2010
>>> stage 1: configuring the kernel
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3.1: making dependencies
>>> stage 3.2: building everything
[...]
:> export_syms
awk -f /src/sys/conf/kmod_syms.awk atapicam.kld  export_syms | xargs -J% 
objcopy % atapicam.kld
ld -Bshareable  -d -warn-common -o atapicam.ko atapicam.kld
objcopy --strip-debug atapicam.ko
===> ath (all)
cc -O2 -pipe -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc  -I. 
-I/src/sys/modules/ath/../../dev/ath 
-I/src/sys/modules/ath/../../dev/ath/ath_hal -DHAVE_KERNEL_OPTION_HEADERS 
-include /obj/powerpc/src/sys/LINT/opt_global.h -I. -I@ -I@/contrib/altq 
-finline-limit=15000 --param inline-unit-growth=100 --param 
large-function-growth=1000 -fno-common  -mlongcall -fno-omit-frame-pointer 
-I/obj/powerpc/src/sys/LINT -msoft-float -mno-altivec -ffreestanding 
-fstack-protector -std=iso9899:1999 -fstack-protector -Wall -Wredundant-decls 
-Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith 
-Winline -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -c 
/src/sys/modules/ath/../../dev/ath/if_ath.c
/src/sys/modules/ath/../../dev/ath/if_ath.c: In function 'ath_key_alloc':
/src/sys/modules/ath/../../dev/ath/if_ath.c:2240: error: expected expression 
before '/' token
*** Error code 1

Stop in /src/sys/modules/ath.
*** Error code 1

Stop in /src/sys/modules.
*** Error code 1

Stop in /obj/powerpc/src/sys/LINT.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2010-02-26 22:32:52 - WARNING: /usr/bin/make returned exit code  1 
TB --- 2010-02-26 22:32:52 - ERROR: failed to build lint kernel
TB --- 2010-02-26 22:32:52 - 3311.13 user 560.65 system 4088.76 real


http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8-powerpc-powerpc.full


[releng_8 tinderbox] failure on sparc64/sparc64

2010-02-26 Thread FreeBSD Tinderbox
TB --- 2010-02-26 21:41:33 - tinderbox 2.6 running on freebsd-current.sentex.ca
TB --- 2010-02-26 21:41:33 - starting RELENG_8 tinderbox run for sparc64/sparc64
TB --- 2010-02-26 21:41:33 - cleaning the object tree
TB --- 2010-02-26 21:41:48 - cvsupping the source tree
TB --- 2010-02-26 21:41:48 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca 
/tinderbox/RELENG_8/sparc64/sparc64/supfile
TB --- 2010-02-26 21:42:27 - building world
TB --- 2010-02-26 21:42:27 - MAKEOBJDIRPREFIX=/obj
TB --- 2010-02-26 21:42:27 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2010-02-26 21:42:27 - TARGET=sparc64
TB --- 2010-02-26 21:42:27 - TARGET_ARCH=sparc64
TB --- 2010-02-26 21:42:27 - TZ=UTC
TB --- 2010-02-26 21:42:27 - __MAKE_CONF=/dev/null
TB --- 2010-02-26 21:42:27 - cd /src
TB --- 2010-02-26 21:42:27 - /usr/bin/make -B buildworld
>>> World build started on Fri Feb 26 21:42:28 UTC 2010
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
>>> World build completed on Fri Feb 26 22:33:52 UTC 2010
TB --- 2010-02-26 22:33:52 - generating LINT kernel config
TB --- 2010-02-26 22:33:52 - cd /src/sys/sparc64/conf
TB --- 2010-02-26 22:33:52 - /usr/bin/make -B LINT
TB --- 2010-02-26 22:33:52 - building LINT kernel
TB --- 2010-02-26 22:33:52 - MAKEOBJDIRPREFIX=/obj
TB --- 2010-02-26 22:33:52 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2010-02-26 22:33:52 - TARGET=sparc64
TB --- 2010-02-26 22:33:52 - TARGET_ARCH=sparc64
TB --- 2010-02-26 22:33:52 - TZ=UTC
TB --- 2010-02-26 22:33:52 - __MAKE_CONF=/dev/null
TB --- 2010-02-26 22:33:52 - cd /src
TB --- 2010-02-26 22:33:52 - /usr/bin/make -B buildkernel KERNCONF=LINT
>>> Kernel build for LINT started on Fri Feb 26 22:33:52 UTC 2010
>>> stage 1: configuring the kernel
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3.1: making dependencies
>>> stage 3.2: building everything
[...]
:> export_syms
awk -f /src/sys/conf/kmod_syms.awk atapicam.kld  export_syms | xargs -J% 
objcopy % atapicam.kld
ld -Bshareable  -d -warn-common -o atapicam.ko atapicam.kld
objcopy --strip-debug atapicam.ko
===> ath (all)
cc -O2 -pipe -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc  -I. 
-I/src/sys/modules/ath/../../dev/ath 
-I/src/sys/modules/ath/../../dev/ath/ath_hal -DHAVE_KERNEL_OPTION_HEADERS 
-include /obj/sparc64/src/sys/LINT/opt_global.h -I. -I@ -I@/contrib/altq 
-finline-limit=15000 --param inline-unit-growth=100 --param 
large-function-growth=1000 -fno-common  -I/obj/sparc64/src/sys/LINT 
-mcmodel=medany -msoft-float -ffreestanding -fstack-protector -std=iso9899:1999 
-fstack-protector -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  
-Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef 
-Wno-pointer-sign -fformat-extensions -c 
/src/sys/modules/ath/../../dev/ath/if_ath.c
/src/sys/modules/ath/../../dev/ath/if_ath.c: In function 'ath_key_alloc':
/src/sys/modules/ath/../../dev/ath/if_ath.c:2240: error: expected expression 
before '/' token
*** Error code 1

Stop in /src/sys/modules/ath.
*** Error code 1

Stop in /src/sys/modules.
*** Error code 1

Stop in /obj/sparc64/src/sys/LINT.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2010-02-26 22:45:44 - WARNING: /usr/bin/make returned exit code  1 
TB --- 2010-02-26 22:45:44 - ERROR: failed to build lint kernel
TB --- 2010-02-26 22:45:44 - 3187.37 user 550.32 system 3851.27 real


http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8-sparc64-sparc64.full


Re: Panic on 8-STABLE in mpt(4) on a DELL PowerEdge R300

2010-02-26 Thread John J. Rushford

Thanks very much Alexander, I'll test the patch this weekend.

John

Lorenzo Perone wrote:

COOL! THANKS a LOT Alexander!

Can't believe it. You post a panic at 11pm and get a patch at 1pm next 
day...? You must be crazy! ;)


Works for me. I patched against 8/stable. I'll be testing the machine 
a bit more. But for now, no panics!


I guess the patch should be committed soon, also because it's really 
happening quite at beginning, without a RAC/ILO you're locked out 
pretty fast, and mpt is used on many DELL/HP setups.


while true ; do echo Thank You ; done

Lorenzo

On 26.02.10 13:25, Alexander Motin wrote:

John J. Rushford wrote:

I'm running into the same problem, mpt(4) panic on FreeBSD 8-STABLE.

I'm running FreeBSD 8.0-STABLE, the current kernel was cvsup'd and built
@ January 14th, 2010.  I cvsup'd tonight, 2/25/2010, and built a new
kernel.  Attached is the panic when I tried to boot into single user
mode, I was able to boot up on the old kernel built on January 14th.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x8019c4bd
stack pointer           = 0x28:0xff80e81d5ba0
frame pointer           = 0x28:0xff80e81d5bd0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 6 (mpt_raid0)
trap number             = 12
panic: page fault


Attached patch should fix the problem.









Re: mbuf leakage with nfs/zfs?

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 23:12:39 +0100 Willem Jan Withagen 
wrote about Re: mbuf leakage with nfs/zfs?:

WJW> Mine are now:
WJW> 41533/2402/43935 mbufs in use (current/cache/total)
WJW> 41454/1572/43026/262144 mbuf clusters in use (current/cache/total/max)
WJW> 39241/823 mbuf+clusters out of packet secondary zone in use
WJW> (current/cache)

81492/2613/84105 mbufs in use (current/cache/total)
80467/2235/82702/128000 mbuf clusters in use (current/cache/total/max)
80458/822 mbuf+clusters out of packet secondary zone in use (current/cache)

If I keep increasing the clusters, maybe I can make it over the
weekend. :-)
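
Whether a raised limit lasts the weekend depends on the growth rate. Below is a rough sketch of a linear extrapolation, assuming the leak grows at a constant rate (which the thread does not establish). The function name `minutes_until_exhausted` is hypothetical. The cluster counts and the 128000 limit are the ones posted in this thread, but the 60-minute gap between the samples is a placeholder, since the exact sampling times were not given.

```python
def minutes_until_exhausted(samples, max_clusters):
    """Linearly extrapolate when the current cluster count hits max_clusters.

    samples: list of (minutes_since_start, current_clusters), oldest first.
    Returns None when usage is flat or falling between first and last sample.
    """
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0 or c1 <= c0:
        return None
    rate = (c1 - c0) / (t1 - t0)  # clusters per minute
    return (max_clusters - c1) / rate


# Counts from the thread; the 60-minute interval is an assumed placeholder.
samples = [(0, 19529), (60, 80467)]
remaining = minutes_until_exhausted(samples, 128000)
print(f"about {remaining:.0f} minutes of headroom at this rate")
```

Replacing the placeholder timestamps with real sampling times (for example from a cron job that logs `netstat -m`) gives a usable estimate.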

WJW> ', I did set the zvol version this morning also to 14 but I think 
WJW> that I ran into trouble already when still running version 13.

Ok, so this is possibly ruled out, too. Maybe the Linux clients do
something weird?


cu
  Gerrit


Re: mbuf leakage with nfs/zfs?

2010-02-26 Thread Gerrit Kühn
On Fri, 26 Feb 2010 23:12:39 +0100 Willem Jan Withagen 
wrote about Re: mbuf leakage with nfs/zfs?:

WJW> > DB>  I'll have to do some packet snooping to check if it's TCP or
WJW> > DB> UDP nfs traffic, since some of the clients are Linux ...

WJW> > I have Linux clients, too. Some use tcp, some udp.

WJW> I have Linux and FreeBSD clients running. The build system runs on 
WJW> Linux. All Linux's are UDP

Another shot in the dark:
After upgrading the server, all my Linux clients hang with "stale nfs
dir/file handle/whatever". I was not able to umount them (not even
forcefully). I had to use either lazy forceful umount (-fl) or reboot. Some
of these clients are still hanging around, because they are physically
hard to access (clean room installs etc.). Maybe these clients still try to
establish connections that eat up the buffers and never come back?


cu
  Gerrit


Re: mbuf leakage with nfs/zfs?

2010-02-26 Thread Daniel Braniss
> On Fri, 26 Feb 2010 23:12:39 +0100 Willem Jan Withagen 
> wrote about Re: mbuf leakage with nfs/zfs?:
> 
> WJW> > DB>  I'll have to do some packet snooping to check if it's TCP or
> WJW> > DB> UDP nfs traffic, since some of the clients are Linux ...
> 
> WJW> > I have Linux clients, too. Some use tcp, some udp.
> 
> WJW> I have Linux and FreeBSD clients running. The build system runs on 
> WJW> Linux. All Linux's are UDP
> 
> Another shot in the dark:
> After upgrading the server, all my Linux clients hang with "stale nfs
> dir/file handle/whatever". I was not able to umount them (not even
> forcefully). I had to use either lazy forceful umount (-fl) or reboot. Some
> of these clients are still hanging around, because they are physically
> hard to access (clean room installs etc.). Maybe these clients still try to
> establish connections that eat up the buffers and never come back?

I doubt it, but here is another shot:
are we all running samba? I'm asking because the lock manager keeps dying and 
...

cheers,
danny
PS: I dropped Jack from the CC, I think em is innocent :-)
