I've had no response to my problem. Is there anybody who can help me with this?

Morten K.
Tlf: +47 76 16 61 81 | Mob: +47 906 52 903
Quality - Safety - Respect

From: ocfs2-users-boun...@oss.oracle.com [mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of Kristiansen Morten
Sent: 21 March 2013 14:47
To: ocfs2-users@oss.oracle.com
Subject: [Ocfs2-users] Shutting down one node caused all the other nodes to shutdown aswell.

Hi,

We are running an 8-node cluster on RHEL (kernel 2.6.18-128, 64-bit). Yesterday the server/SAN team moved the ocfs2 disks to another SAN by mirroring and synchronizing the disks. When they rebooted the servers, one of the nodes, tos-dipsprod-07, was unable to start Oracle Grid Infrastructure because the voting disk was not found. We then tried to reboot that node, which caused all nodes to reboot. This happened at around 02:25.

When I examined /var/log/messages, I found a BUG message on one of the nodes that rebooted unexpectedly, tos-dipsprod-02. I've tried googling it, but I couldn't find a solution. Is this a well-known bug? Does anybody have a solution to this problem?

Below is an extract of the o2net and ocfs2 messages from the /var/log/messages files.

/var/log/messages for tos-dipsprod-07:

Mar 21 02:08:49 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-06 (num 3) at 192.168.7.105:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 02:25:25 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-01 (num 0) at 192.168.7.100:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 02:25:35 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-02 (num 1) at 192.168.7.101:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 02:25:40 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-03 (num 2) at 192.168.7.102:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 02:25:45 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-06 (num 3) at 192.168.7.105:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 02:25:54 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-04 (num 5) at 192.168.7.103:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 04:03:17 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-06 (num 3) at 192.168.7.105:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 04:06:32 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-01 (num 0) at 192.168.7.100:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 04:06:37 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-02 (num 1) at 192.168.7.101:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 04:06:47 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-03 (num 2) at 192.168.7.102:7777 has been idle for 10.0 seconds, shutting it down.
Mar 21 06:04:25 tos-dipsprod-07 kernel: o2net: connection to node tos-dipsprod-02 (num 1) at 192.168.7.101:7777 has been idle for 10.0 seconds, shutting it down.

And here from tos-dipsprod-02:

10474-Mar 21 02:25:15 tos-dipsprod-02 kernel: (o2net,7452,5):dlm_begin_reco_handler:2730 992D008CD522447C8333FC34BD46F8CD: dead_node previously set to 7, node 3 changing it to 7
10646-Mar 21 02:25:25 tos-dipsprod-02 kernel: (o2net,7452,5):dlm_finalize_reco_handler:2839 ERROR: node 6 sent recovery finalize msg, but node 3 is supposed to be the new master, dead=7
10826:Mar 21 02:25:25 tos-dipsprod-02 kernel: Kernel BUG at ...shran/BUILD/ocfs2-1.4.7/fs/ocfs2/dlm/dlmrecovery.c:2840
10939-Mar 21 02:43:01 tos-dipsprod-02 syslogd 1.4.1: restart.
10995-Mar 21 02:43:02 tos-dipsprod-02 modprobe: FATAL: Module ocfs2_stackglue not found.
--
17537-Mar 21 04:06:19 tos-dipsprod-02 kernel: (o2net,7472,1):dlm_begin_reco_handler:2730 992D008CD522447C8333FC34BD46F8CD: dead_node previously set to 6, node 6 changing it to 7
17709-Mar 21 04:06:29 tos-dipsprod-02 kernel: (o2net,7472,1):dlm_finalize_reco_handler:2839 ERROR: node 6 sent recovery finalize msg, but node 255 is supposed to be the new master, dead=7
17891:Mar 21 04:06:29 tos-dipsprod-02 kernel: Kernel BUG at ...shran/BUILD/ocfs2-1.4.7/fs/ocfs2/dlm/dlmrecovery.c:2840
18004-Mar 21 04:38:04 tos-dipsprod-02 syslogd 1.4.1: restart.
18060-Mar 21 04:41:33 tos-dipsprod-02 modprobe: FATAL: Module ocfs2_stackglue not found.

Morten Kristiansen | Counsellor
Helse Nord IKT | Department of Service Production
Tlf: +47 76 16 61 81 | Mob: +47 906 52 903
Office address: Amtmann Worsøes gate 63, 8012 Bodø, Norway
Quality - Safety - Respect