>>> Andreas Kurz <[email protected]> wrote on 19.12.2011 at 12:02 in message
<[email protected]>:
> On 12/19/2011 09:15 AM, Ulrich Windl wrote:
> >>> Andreas Kurz <[email protected]> wrote on 16.12.2011 at 14:01 in message
> > <[email protected]>:
> >> Hello Ulrich,
> >>
> >> On 12/16/2011 01:31 PM, Ulrich Windl wrote:
> >>> Hi!
> >>>
> >>> I have some trouble with OCFS on top of DRBD that seems to be
> >>> timing-related: OCFS is working on the DRBD when DRBD itself wants
> >>> to change something, it seems:
> >>
> >> can we see your cib and your full drbd configuration please ...
> >
> > It's somewhat complex, and I may not show you everything, sorry for that.
>
> no problem ... you asked for help on a public mailing-list ...
>
> >>> ...
> >>> Dec 16 11:39:58 h06 kernel: [ 122.426174] block drbd0: role( Secondary -> Primary )
> >>> Dec 16 11:39:58 h06 multipathd: drbd0: update path write_protect to '0' (uevent)
> >>> Dec 16 11:40:29 h06 ocfs2_controld: start_mount: uuid "FD32E504527742CEA7DA6DB272D5D7B2", device "/dev/drbd_r0", service "ocfs2"
> >>> ...
> >>> Dec 16 11:40:29 h06 kernel: [ 152.837615] block drbd0: peer( Secondary -> Primary )
> >>> Dec 16 11:40:29 h06 ocfs2_hb_ctl[19177]: ocfs2_hb_ctl /sbin/ocfs2_hb_ctl -P -d /dev/drbd_r0
> >>> Dec 16 11:43:50 h06 kernel: [ 354.559240] block drbd0: State change failed: Device is held open by someone
> >>> Dec 16 11:43:50 h06 kernel: [ 354.559244] block drbd0: state = { cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate r----- }
> >>> Dec 16 11:43:50 h06 kernel: [ 354.559246] block drbd0: wanted = { cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate r----- }
> >>> Dec 16 11:43:50 h06 drbd[28754]: [28786]: ERROR: r0: Called drbdadm -c /etc/drbd.conf secondary r0
> >>> Dec 16 11:43:50 h06 drbd[28754]: [28789]: ERROR: r0: Exit code 11
> >>>
> >>> A little bit later DRBD did its own fencing (the machine rebooted)
> >>
> >> do you have logs to confirm this?
> >
> > Naturally no, as the commands "echo b > /proc/sysrq-trigger ; reboot -f"
> > don't actually write nice log messages.
>
> All those nice drbd notify scripts do send mails, at least to local root
> account. Additionally they try to log via syslog as well as DRBD does on
> executing the handler ... so you have a good chance to get some
> information if DRBD triggers that reboot ... at least if you are doing
> remote syslogging.

I examined "notify-io-error.sh": it tries to log a syslog message and
send mail. However, since writing to disk and sending mail are both
asynchronous, there is little chance that anything will make it to disk
before "echo b > /proc/sysrq-trigger" takes effect. Unless it does more
damage, I'd strongly recommend doing a "sync" before that. Is the quick
reboot merely nice to have, or is it absolutely necessary?

Regards,
Ulrich

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
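
P.S. To make the "sync" suggestion concrete, here is a minimal sketch of
what I have in mind. It is not the code shipped with DRBD: the function
name emergency_reboot and the variables SYSRQ_TRIGGER and DRY_RUN are
made up for illustration, and the "reboot -f" fallback from the original
command line is left out. Whether "sync" can itself hang when the
backing device is already failing is exactly the open question above.

```shell
#!/bin/sh
# Hypothetical wrapper, NOT part of the DRBD distribution.
# SYSRQ_TRIGGER and DRY_RUN are illustrative knobs so the function can be
# exercised without actually rebooting the machine.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

emergency_reboot() {
    reason="$1"

    # Queue a syslog message first; delivery is still asynchronous,
    # so this alone does not guarantee the message reaches disk.
    logger -t drbd-handler "emergency reboot: $reason" 2>/dev/null || true

    # Ask the kernel to flush dirty buffers so that local logs have a
    # chance to reach disk. Assumption: the devices holding the logs
    # are still responsive; otherwise this call may block.
    sync

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would write 'b' to $SYSRQ_TRIGGER"
        return 0
    fi

    # Immediate reboot via magic SysRq, without any further syncing
    # or unmounting.
    echo b > "$SYSRQ_TRIGGER"
}

# Example (dry run only):
#   DRY_RUN=1 emergency_reboot "io-error on r0"
```

With DRY_RUN=1 the function only logs, syncs and prints what it would
do, which makes it safe to try on a test node.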
