Re: Page fault, GEOM problem??

Johan Ström Fri, 18 Nov 2005 10:23:17 -0800

Hi!

On 18 nov 2005, at 18.43, Xin LI wrote:

Hi, Johan,

On 11/18/05, Johan Ström <[EMAIL PROTECTED]> wrote:

On 18 nov 2005, at 10.17, Xin LI wrote:

[snip]

Doesn't look like I have any "usable" dump devices.
When booting I get:

[...]

Loading configuration files.
No suitable dump device was found.
Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/mirror/gm0s1b as swap device


I see, so both of your SATA disks are in the same mirror group...

Then naturally:
/etc/rc: WARNING: Dump device does not exist.  Savecore not run.

I looked around in the rc scripts and tried to figure out what they do;
the dumpon script tries to automatically look up a suitable dump device
but finds none.


Unfortunately, kernel dumps currently do not support every device,
for some technical reasons (probably to simplify the crash code so
that it does not make more mistakes^Wdamage).

According to the page you linked to, the dumpon command has to be
executed AFTER swapon. Why are the rc scripts trying to run it before
swapon then?


I guess this is because dumpon can now detect the dump device
automatically, but I'm not quite sure about this.  I will look for the
reason.  I think either the Handbook should be updated, or the code
should be corrected.

What I am very curious about is why dumpon runs "BEFORE" savecore.  Maybe
I have misunderstood something...

Sorry, partly my mistake. I think I misunderstood how savecore works below (when I tried it manually in my last mail). But the messages above are directly from boot; it seems it tries dumpon before savecore? Relevant boot log from the last boot:



ad0: 2441MB <WDC AC22500L 32.41N35> at ata0-master UDMA33
acd0: CDROM <CD-ROM CDU701-F/1.0q> at ata1-master PIO4
ad6: 286188MB <Maxtor 7L300S0 BANC1G10> at ata3-master SATA150
ad10: 286188MB <Maxtor 7L300S0 BANC1G10> at ata5-master SATA150
GEOM_MIRROR: Device gm0s1 created (id=4118114647).
GEOM_MIRROR: Device gm0s1: provider ad6s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad10s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad10s1 activated.
GEOM_MIRROR: Device gm0s1: provider ad6s1 activated.
GEOM_MIRROR: Device gm0s1: provider mirror/gm0s1 launched.
Trying to mount root from ufs:/dev/mirror/gm0s1a
Loading configuration files.

dumpon: ioctl(DIOCSKERNELDUMP): Operation not supported

(This DIOCSKERNELDUMP message is probably there because I specified dumpdev in rc.conf, which forced the use of gm0s1b instead of letting the scripts autodetect.)

Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/mirror/gm0s1b as swap device
Starting file system checks:
/dev/mirror/gm0s1a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1a: clean, 213811 free (771 frags, 26630 blocks, 0.3% fragmentation)
/dev/mirror/gm0s1e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1e: clean, 1012917 free (85 frags, 126604 blocks, 0.0% fragmentation)
/dev/mirror/gm0s1f: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1f: clean, 115955787 free (40747 frags, 14489380 blocks, 0.0% fragmentation)
/dev/mirror/gm0s1d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1d: clean, 1983354 free (4834 frags, 247315 blocks, 0.2% fragmentation)

<ifconfig stuff>
Starting devd.
Mounting NFS file systems:
.
Creating and/or trimming log files:
.
Starting syslogd.
Checking for core dump on /dev/mirror/gm0s1b...
savecore: no dumps found
Starting named.
<rest of boot>

So, it seems it does run savecore after running dumpon, mounting disks, etc. Is that wrong?

Anyway, tried to do dumpon manually on my swap drive:

$ dumpon -v /dev/mirror/gm0s1b
dumpon: ioctl(DIOCSKERNELDUMP): Operation not supported

That didn't work too well.
Also tried savecore manually:

$ savecore /var/crash/ /dev/mirror/gm0s1b
savecore: no dumps found

(This was my mistake; of course there are no dumps, since none was written when it crashed.)


That didn't work very well either (but that was probably expected, since
there were no working dumps).
Google showed me another thread on this list about dumping to gmirror
swap, but it was just a question (whether it is supported) without any
answers. Same error as I got.


It seems that this cannot be worked around easily.  If possible, my
suggestion is that you attach a third disk and create a swap partition
on it for the crash dump.  If this is not feasible, then adding DDB
and KDB may give us a chance to catch the panic, and you can use the
"trace" command at the ddb> prompt to obtain a simplified backtrace;
there is a good chance that it would reveal what is happening.
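
A minimal sketch of the first suggestion, assuming a swap partition outside the mirror is available: try each candidate device in turn and keep the first one that dumpon accepts. The helper name pick_dump_device and the DUMPON override are hypothetical and only for illustration; they are not part of the rc scripts.

```shell
#!/bin/sh
# Hypothetical helper, not part of FreeBSD: find a device that dumpon accepts.
# DUMPON can be overridden (for example with a stub) for a dry run.
DUMPON="${DUMPON:-dumpon}"

pick_dump_device() {
    # Try the candidates in the order given on the command line.
    for dev in "$@"; do
        # dumpon exits non-zero when the device cannot take kernel dumps,
        # as with "Operation not supported" on the gmirror provider.
        if $DUMPON "$dev" >/dev/null 2>&1; then
            echo "$dev"
            return 0
        fi
    done
    # No candidate was accepted.
    return 1
}
```

Called as "pick_dump_device /dev/mirror/gm0s1b /dev/ad0s1b" (the second name is an assumed swap partition on a spare disk), it prints the first device that was accepted and returns non-zero if none was.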

I have cc'ed Pawel, who is very knowledgeable in this area, so
let's see whether he has some better suggestions :-)

Okay, I just added an old but working 2 gig disk to the system, made it a swap device, swapon'ed it, and:


[EMAIL PROTECTED]:~$ dumpon -v /dev/ad0s1b
kernel dumps on /dev/ad0s1b

Great! :) So, let's see when/if it dies next time... Before I took it down to add the dump disk, it had been running fine for 1d 1h (since the boot after the crash), though probably not as loaded as on the day it crashed. I'll try to put some load on it now and see if it crashes.
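
To keep this setting across reboots, the dump device can be named in /etc/rc.conf instead of being set by hand. A sketch, assuming the stock dumpdev and dumpdir rc.conf variables; the device is the one from the dumpon command above and the directory is the one passed to savecore earlier in the thread.

```shell
# /etc/rc.conf fragment (sketch; adjust the device to your own system)
dumpdev="/dev/ad0s1b"   # device the kernel writes a crash dump to on panic
dumpdir="/var/crash"    # directory where savecore stores the recovered dump
```

With this in place, dumpon registers the device at boot, and after a panic savecore copies the dump from that device into dumpdir on the next boot, which is why dumpon runs before savecore.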


Thanks

Johan


Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net


_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
