Hi Jeremy,

Thanks for the very helpful response!

I added all the debugging options you specified to my kernel and rebuilt, then set the kernel parameters as you suggested. (I was being a bit lazy earlier when I called them sysctls; I have always set them in loader.conf, it's just that you can view their values with sysctl.)

Rebooted the system with the new kernel, set up an 11-disk raidz2 pool again, then started beating on it. At first it seemed a bit more resilient with this set of kernel parameters, but eventually it failed too.

Again I just got a straight-up reboot: no debugger, and no output flashed by on the console as far as I can tell.

I don't have a serial console hooked up right now but it's probably possible to do so through the ILOM or equivalent; I will have to look into that further.

This is pretty weird.

I am thinking some memory might be starting to go in this system. I have never seen failing memory in an ECC box cause reboots this consistently, and only under such specific conditions, but I suppose it isn't completely out of the question. I'll talk to my customer and see what they can do about the hardware; maybe they have some spares.

I will also try 8.1-STABLE when I have a chance and see if that works better.

But it's definitely helpful to know that folks have > 9 disk raidz pools up and running on FreeBSD 8.x with no trouble - that it "should work". And the list of tunables is very useful; nice to have something to work with that I can have a bit more confidence in outside of my own guessing :)

I will report back to the list when I have more information.

Thanks!

-Sean


Quoting Jeremy Chadwick <free...@jdc.parodius.com>:

There are users here using FreeBSD ZFS with *lots* of disks (I think
someone was using 32 disks at one point) reliably.  Some of them post
here regularly (with other issues that don't consist of sporadic
reboots).

The kernel options may not be sufficient.  I'm used to using these:

# Debugging options
options         BREAK_TO_DEBUGGER       # Sending a serial BREAK drops to DDB
options         KDB                     # Enable kernel debugger support
options         KDB_TRACE               # Print stack trace automatically on panic
options         DDB                     # Support DDB
options         GDB                     # Support remote GDB

And in /etc/rc.conf, setting:

ddb_enable="yes"

Next: arc_max isn't "technically" a sysctl, meaning it can't be changed
in real-time, so I'm not sure how you managed to do that.  Validation:

sysctl: oid 'vfs.zfs.arc_max' is a read only tunable
sysctl: Tunable values are set in /boot/loader.conf

Your system may be reporting something relating to kmem exhaustion but
is then auto-rebooting so fast that you can't see the message on VGA
console.  Do you have serial console?

Please try setting the following tunables in /boot/loader.conf and
reboot the machine, then see if the same problem persists.

vm.kmem_size="16384M"
vfs.zfs.arc_max="14336M"
vfs.zfs.prefetch_disable="1"
vfs.zfs.zio.use_uma="0"
vfs.zfs.txg.timeout="5"
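
As a quick sanity check on the values above, here is a minimal Python sketch that parses loader.conf-style lines and reports how much kmem is left once the ARC reaches vfs.zfs.arc_max. The function names (parse_size, parse_loader_conf, arc_headroom_bytes) are hypothetical helpers made up for illustration, and the sketch assumes sizes are plain integers with an optional K/M/G/T suffix. It only does the arithmetic; it says nothing about whether a given headroom is enough for a particular workload.

```python
import re

# Multipliers for size suffixes (assumed here: K, M, G, T as powers of 1024).
_SUFFIXES = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(value):
    """Convert a string such as "16384M" into a byte count."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _SUFFIXES[suffix]


def parse_loader_conf(text):
    """Return a dict of name -> value from loader.conf-style text."""
    tunables = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        tunables[name.strip()] = value.strip().strip('"')
    return tunables


def arc_headroom_bytes(tunables):
    """Bytes of kmem left over once the ARC reaches vfs.zfs.arc_max."""
    kmem = parse_size(tunables["vm.kmem_size"])
    arc_max = parse_size(tunables["vfs.zfs.arc_max"])
    return kmem - arc_max


# The tunables suggested above, copied verbatim.
EXAMPLE = '''
vm.kmem_size="16384M"
vfs.zfs.arc_max="14336M"
vfs.zfs.prefetch_disable="1"
vfs.zfs.zio.use_uma="0"
vfs.zfs.txg.timeout="5"
'''

if __name__ == "__main__":
    conf = parse_loader_conf(EXAMPLE)
    headroom = arc_headroom_bytes(conf)
    print(f"kmem headroom above arc_max: {headroom // 1024**2}M")
```

With the values above this reports a headroom of 2048M (16384M minus 14336M). To check a real file, pass the contents of /boot/loader.conf to parse_loader_conf instead of EXAMPLE.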

I would also advocate you try 8.1-STABLE as there have been many changes
in ZFS since then (and I'm not just referring to the v15 import),
including how the ARC gets sized/adjusted.  CURRENT is highly
bleeding-edge, so I would start or stick with STABLE.

Finally, there's always the possibility that the PSU has some sort of
load problem with that many disks all being accessed at the same time.
I imagine the power draw of that system is quite high.  I can't imagine
Sun shipping a box with an insufficient PSU, but then again power draw
changes depending on the RPM of the disks used and many other things.

--
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"




