I start the cp and then, with prstat -a, watch the CPU load for the cp process climb to 25% on a 4-core machine - that is, one core fully pegged.
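In case it helps, this is roughly what I'm running while the copy goes (the pool and path names are placeholders, and the second prstat line is just one way of narrowing it down to the cp thread):

  # cp -pr /sourcepool/bigfiles /raidz2pool/bigfiles &   # the large copy (placeholder paths)
  # prstat -a 5                       # per-process/per-user CPU; cp creeps up to ~25%
  # prstat -mL -p `pgrep -x cp` 5     # microstate view of the cp thread
  # uptime                            # load average, sampled every so often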
Load, measured for example with 'uptime', climbs steadily until the reboot. Note that the machine does not dump properly, panic or hang - rather, it reboots. I attached a screenshot earlier in this thread of the little bit of error message I could see on the console. The machine is trying to dump to the dump zvol, but fails to do so. Only sometimes do I see an error on the machine's local console - most times, it simply reboots.
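For reference, this is roughly how the dedicated dump zvol was set up and how I have been trying to recover a dump after the box comes back up (the 16g size and the mdb step are illustrative rather than exactly what I ran):

  # zfs create -V 16g rpool/bigdump           # size is only an example
  # dumpadm -d /dev/zvol/dsk/rpool/bigdump    # make the zvol the dedicated dump device
  # dumpadm                                   # confirm "Dump device: ... (dedicated)"

  ...then, once the machine has rebooted:

  # savecore /var/crash/host                  # pull the last dump off the dump device, if one exists
  # mdb /var/crash/host/unix.0 /var/crash/host/vmcore.0   # then ::status / ::stack inside mdb

So far that savecore step has not produced anything usable after these reboots, which matches the failed-dump behaviour described above.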
On Thu, Mar 12, 2009 at 1:55 AM, Nathan Kroenert <nathan.kroen...@sun.com> wrote:
> Hm -
>
> Crashes, or hangs? Moreover - how do you know a CPU is pegged?
>
> Seems like we could do a little more discovery on what the actual problem
> here is, as I can read it about 4 different ways.
>
> By this last piece of information, I'm guessing the system does not crash,
> but goes really really slow??
>
> Crash == panic == we see stack dump on console and try to take a dump
> hang == nothing works == no response -> might be worth looking at mdb -K
> or booting with a -k on the boot line.
>
> So - are we crashing, hanging, or something different?
>
> It might simply be that you are eating up all your memory, and your
> physical backing storage is taking a while to catch up....?
>
> Nathan.
>
> Blake wrote:
>>
>> My dump device is already on a different controller - the motherboard's
>> built-in nVidia SATA controller.
>>
>> The raidz2 vdev is the one I'm having trouble with (copying the same
>> files to the mirrored rpool on the nVidia controller works nicely). I
>> do notice that, when using cp to copy the files to the raidz2 pool,
>> load on the machine climbs steadily until the crash, and one proc core
>> pegs at 100%.
>>
>> Frustrating, yes.
>>
>> On Thu, Mar 12, 2009 at 12:31 AM, Maidak Alexander J
>> <maidakalexand...@johndeere.com> wrote:
>>>
>>> If you're having issues with a disk controller or disk IO driver, it's
>>> highly likely that a savecore to disk after the panic will fail. I'm not
>>> sure how to work around this; maybe a dedicated dump device not on a
>>> controller that uses a different driver than the one that you're having
>>> issues with?
>>>
>>> -----Original Message-----
>>> From: zfs-discuss-boun...@opensolaris.org
>>> [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake
>>> Sent: Wednesday, March 11, 2009 4:45 PM
>>> To: Richard Elling
>>> Cc: Marc Bevand; zfs-discuss@opensolaris.org
>>> Subject: Re: [zfs-discuss] reboot when copying large amounts of data
>>>
>>> I guess I didn't make it clear that I had already tried using savecore
>>> to retrieve the core from the dump device.
>>>
>>> I added a larger zvol for dump, to make sure that I wasn't running out
>>> of space on the dump device:
>>>
>>> r...@host:~# dumpadm
>>>       Dump content: kernel pages
>>>        Dump device: /dev/zvol/dsk/rpool/bigdump (dedicated)
>>> Savecore directory: /var/crash/host
>>>   Savecore enabled: yes
>>>
>>> I was using the -L option only to try to get some idea of why the system
>>> load was climbing to 1 during a simple file copy.
>>>
>>> On Wed, Mar 11, 2009 at 4:58 PM, Richard Elling
>>> <richard.ell...@gmail.com> wrote:
>>>>
>>>> Blake wrote:
>>>>>
>>>>> I'm attaching a screenshot of the console just before reboot. The
>>>>> dump doesn't seem to be working, or savecore isn't working.
>>>>>
>>>>> On Wed, Mar 11, 2009 at 11:33 AM, Blake <blake.ir...@gmail.com> wrote:
>>>>>
>>>>>> I'm working on testing this some more by doing a savecore -L right
>>>>>> after I start the copy.
>>>>>>
>>>> savecore -L is not what you want.
>>>>
>>>> By default, for OpenSolaris, savecore on boot is disabled. But the
>>>> core will have been dumped into the dump slice, which is not used for
>>>> swap. So you should be able to run savecore at a later time to collect
>>>> the core from the last dump.
>>>> -- richard
>>>>
>
> --
> //////////////////////////////////////////////////////////////////
> // Nathan Kroenert                    nathan.kroen...@sun.com   //
> // Systems Engineer                   Phone: +61 3 9869-6255    //
> // Sun Microsystems                   Fax:   +61 3 9869-6288    //
> // Level 7, 476 St. Kilda Road        Mobile: 0419 305 456      //
> // Melbourne 3004 Victoria Australia                            //
> //////////////////////////////////////////////////////////////////
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss