Hi Steve, thanks a lot for your help!
The problem is that different issues occur on different boots. Since
the beginning of this year this computer/server has been started up and
shut down a bit over 200 times, and this error occurred 5 times,
including today. That puts the likelihood of this particular error at
just under 2.5% per boot, so the odds are against me if I try to locate
the source of an error that is not likely to happen again anytime soon.
So for mdb to be useful, it must be possible to run it at every startup
without interfering with other functions of the system.
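For what it's worth, the arithmetic behind that estimate can be sketched
in C. This is only a rough illustration: it assumes every boot fails
independently with the same probability, which may well not hold for
flaky hardware, and the names failure_rate and boots_needed are made up
for this example.

```c
#include <assert.h>

/* Observed per-boot failure rate: failures divided by boots. */
static double failure_rate(unsigned failures, unsigned boots)
{
    assert(boots > 0 && failures <= boots);
    return (double)failures / (double)boots;
}

/*
 * Smallest number of boots n for which the chance of seeing at least
 * one failure, 1 - (1 - p)^n, reaches the requested confidence.
 * ASSUMPTION: boots are independent and share the same failure rate p.
 */
static unsigned boots_needed(double p, double confidence)
{
    assert(p > 0.0 && p < 1.0);
    assert(confidence > 0.0 && confidence < 1.0);

    double miss = 1.0;  /* probability that no boot so far has failed */
    unsigned n = 0;
    while (1.0 - miss < confidence) {
        miss *= 1.0 - p;
        n++;
    }
    return n;
}
```

With 5 failures in 200 boots, failure_rate(5, 200) gives 0.025, and
boots_needed(0.025, 0.5) gives 28, i.e. roughly 28 boots for an even
chance of catching the error once under that independence assumption.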
I was also playing around with svccfg, running svccfg -s system/hal and
issuing different commands in the svccfg shell. From the documentation
it seems that you can "inject" some specific debuggers using the
"setenv" command. I'm at my wits' end here, but at least it's not
saying "can't do that Dave", yet.
Robin.
On 2011-09-14 16:05, Steve Gonczi wrote:
Hello,
Looking at the hald source (usr/src/cmd/hal/hald/hald.c):
Error 95 is coming from a script; it is just informing you that a fatal
error occurred.
The informative error code is the "2".
This tells you that hald forked a child process, and it timed out
waiting for the child process to write to a pipe.
The child process hung or failed for some reason, and the parent decided to
kill it. The child code could hang for a number of reasons.
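To illustrate the general pattern described above, here is a minimal C
sketch of a parent that forks a child, waits on a pipe with a timeout,
and kills the child if nothing arrives. This is not hald's actual code:
the helper name wait_for_child, the use of select(), the sleeping child
and the return values are all assumptions chosen to mirror the
description.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from hald): fork a child that sleeps
 * child_delay seconds and then writes one status byte to a pipe.
 * The parent waits at most timeout seconds for that byte.
 * Returns 0 if the child reported in time, 2 if the wait timed out
 * (the child is then killed), and 1 on any other failure.
 */
static int wait_for_child(unsigned child_delay, unsigned timeout)
{
    int fds[2];
    if (pipe(fds) != 0)
        return 1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return 1;
    }

    if (pid == 0) {
        /* Child: simulate start-up work, then report through the pipe. */
        close(fds[0]);
        sleep(child_delay);
        char ok = 0;
        ssize_t written = write(fds[1], &ok, 1);
        _exit(written == 1 ? 0 : 1);
    }

    /* Parent: wait for the status byte, but only up to the timeout. */
    close(fds[1]);
    assert(fds[0] < FD_SETSIZE);  /* select() cannot watch larger fds */

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fds[0], &readable);
    struct timeval tv = { .tv_sec = timeout, .tv_usec = 0 };

    int rc;
    int ready = select(fds[0] + 1, &readable, NULL, NULL, &tv);
    if (ready > 0) {
        char status;
        rc = (read(fds[0], &status, 1) == 1) ? 0 : 1;
    } else if (ready == 0) {
        kill(pid, SIGKILL);       /* timed out: give up on the child */
        rc = 2;
    } else {
        kill(pid, SIGKILL);       /* select() itself failed */
        rc = 1;
    }

    close(fds[0]);
    waitpid(pid, NULL, 0);        /* reap the child in every case */
    return rc;
}
```

In this sketch wait_for_child(0, 2) returns 0 because the child reports
at once, while wait_for_child(3, 1) returns 2 because the child is still
busy when the one-second timeout expires.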
One possible way to debug this is to load mdb so that it breaks early in
the boot, and set breakpoints on some of the processing steps, such as
hald_dbus_local_server_init, ospec_init, etc., to narrow down where it
hangs.
I see from the source that hald has fairly detailed built-in logging
that may help debugging this.
If the environment variables HALD_VERBOSE and HALD_USE_SYSLOG are defined,
you should get detailed status messages.
There is probably a man page somewhere on how to set these.
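As a rough idea of what "defined" means here, a daemon can check for
such variables with getenv(). The sketch below is only an assumption
about the general approach, not a copy of hald.c; the struct and
function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical container for the two logging switches. */
struct log_settings {
    bool verbose;     /* true if HALD_VERBOSE is set in the environment */
    bool use_syslog;  /* true if HALD_USE_SYSLOG is set in the environment */
};

/*
 * Fill *out from the environment. Only the presence of each variable
 * is tested; its value is ignored (ASSUMPTION made for this sketch).
 */
static void log_settings_from_env(struct log_settings *out)
{
    assert(out != NULL);
    out->verbose = getenv("HALD_VERBOSE") != NULL;
    out->use_syslog = getenv("HALD_USE_SYSLOG") != NULL;
}
```

Under that assumption, it is enough for the variables to exist in the
environment the daemon is started with, whatever value they carry.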
Said log settings can also be modified via hald command line options
(sorry, I have no idea what script or setup file you have to hack to
specify these on startup):
    static void
    usage ()
    {
        fprintf (stderr, "\n" "usage : hald [--daemon=yes|no] [--verbose=yes|no] [--help]\n");
        fprintf (stderr,
                 "\n"
                 "        --daemon=yes|no      Become a daemon\n"
                 "        --verbose=yes|no     Print out debug (overrides HALD_VERBOSE)\n"
                 "        --use-syslog         Print out debug messages to syslog instead of stderr.\n"
                 "                             Use this option to get debug messages if HAL runs as\n"
                 "                             daemon.\n"
                 "        --help               Show this information and exit\n"
                 "        --version            Output version information and exit"
                 "\n"
                 "The HAL daemon detects devices present in the system and provides the\n"
                 "org.freedesktop.Hal service through the system-wide message bus provided\n"
                 "by D-BUS.\n"
Steve
----- Original Message -----
Hi,
I'm about to RMA my motherboard but before that I want to troubleshoot
the issue further so that I can give more specific information on what's
failing on the motherboard.
What happens is that some hardware is failing on the motherboard which
causes OI to hang during boot. So my question is how can I find out what
hardware is failing? The problem is that when I reset the system it
boots up just fine after the reset, and e.g. svcs -xv gives no
information on failures from the last boot. These issues also don't
happen every time I start up the system; they occur rather sporadically.
Here's what I found out: when it freezes, the last lines on the console
look like this:
_______________________________________________
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss