----- Original Message -----
> From: "Kazunori INOUE" <inouek...@intellilink.co.jp>
> To: "pacemaker@oss" <pacemaker@oss.clusterlabs.org>
> Cc: shimaza...@intellilink.co.jp
> Sent: Wednesday, November 28, 2012 2:54:56 AM
> Subject: [Pacemaker] crm_mon -W crash
>
> Hi,
>
> I try to handle SNMP trap with crm_mon.
> However, crm_mon crashes with SIGSEGV at the time of fencing.
>
> [environment]
> - Red Hat Enterprise Linux Server release 6.3 (Santiago)
> - ClusterLabs/pacemaker 9c13d14640(Nov 27, 2012)
> - corosync 92e0f9c7bb(Nov 07, 2012)
>
> [root@dev1 ~]$ pacemakerd -F
> Pacemaker 1.1.8 (Build: 9c13d14)
> Supporting: generated-manpages agent-manpages ascii-docs
> publican-docs ncurses libqb-logging libqb-ipc lha-fencing
> corosync-native snmp
>
> [root@dev1 ~]$ rpm -qi net-snmp-libs
> Name : net-snmp-libs Relocations: (not
> relocatable)
> Version : 5.5 Vendor: Red Hat,
> Inc.
> Release : 41.el6 Build Date: Fri May 18
> 19:20:24 2012
> Install Date: Mon Jul 2 14:15:53 2012 Build Host:
> x86-003.build.bos.redhat.com
> -snip-
>
>
> [test case]
> 1. set only STONITH resources and perform crm_mon with -W option.
>
> [root@dev1 ~]$ crm_mon -S 192.168.133.148 -W
> Last updated: Wed Nov 28 11:44:46 2012
> Last change: Wed Nov 28 11:44:35 2012 via cibadmin on dev1
> Stack: corosync
> Current DC: dev1 (2506467520) - partition with quorum
> Version: 1.1.8-9c13d14
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
>
>
> Online: [ dev1 dev2 ]
>
> prmStonith1 (stonith:external/libvirt): Started dev2
> prmStonith2 (stonith:external/libvirt): Started dev1
>
> 2. fence the dev2.
>
> [root@dev1 ~]$ crm node fence dev2
> Do you really want to shoot dev2? y
>
> 3. then, crm_mon crashed.
>
> [root@dev1 ~]$ crm_mon -S 192.168.133.148 -W
> Last updated: Wed Nov 28 11:45:32 2012
> Last change: Wed Nov 28 11:44:35 2012 via cibadmin on dev1
> Stack: corosync
> Current DC: dev1 (2506467520) - partition WITHOUT quorum
> Version: 1.1.8-9c13d14
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
>
>
> Node dev2 (2472913088): UNCLEAN (offline)
> Online: [ dev1 ]
>
> prmStonith1 (stonith:external/libvirt): Started dev2
> prmStonith2 (stonith:external/libvirt): Started dev1
> Segmentation fault (core dumped)
> [root@dev1 ~]$
>
>
> GDB shows this:
> [root@dev1 ~]$ gdb `which crm_mon` core.28326
> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
> -snip-
> Core was generated by `crm_mon -S 192.168.133.148 -W'.
> Program terminated with signal 11, Segmentation fault.
> #0 0x0000003f808805a1 in __strlen_sse2 () from /lib64/libc.so.6
> -snip-
> (gdb) bt
> #0 0x0000003f808805a1 in __strlen_sse2 () from /lib64/libc.so.6
> #1 0x0000003f81c39481 in snmp_add_var () from
> /usr/lib64/libnetsnmp.so.20
> #2 0x00000000004099ba in send_snmp_trap (node=0x24408a0 "dev2",
> rsc=0x0, task=0x245c850 "st_notify_fence", target_rc=0, rc=0,
> status=0, desc=0x2462b80 "Operation st_notify_fence requested by
> dev1 for peer dev2: OK
> (ref=c520e07b-907f-48b9-a216-4786289b61da)") at crm_mon.c:1716
> #3 0x000000000040af6b in mon_st_callback (st=0x2409520,
> e=0x243aa30) at crm_mon.c:2241
> #4 0x00007fc23639598d in stonith_send_notification
> (data=0x2408390, user_data=0x7fff159e1410) at st_client.c:1960
> #5 0x000000364263688c in g_list_foreach () from
> /lib64/libglib-2.0.so.0
> #6 0x00007fc236396638 in stonith_dispatch_internal
> (buffer=0x2429a08 "<notify t=\"st_notify\"
> subt=\"st_notify_fence\" st_op=\"st_notify_fence\"
> st_rc=\"0\"><st_calldata><st_notify_fence state=\"2\" st_rc=\"0\"
> st_target=\"dev2\" st_device_action=\"reboot\"
> st_delegate=\"dev1\" st_remote"..., length=387,
> userdata=0x2409520) at st_client.c:2128
> #7 0x00007fc235d03391 in mainloop_gio_callback (gio=0x2433fe0,
> condition=G_IO_IN, data=0x240a2e0) at mainloop.c:565
> #8 0x0000003642638f0e in g_main_context_dispatch () from
> /lib64/libglib-2.0.so.0
> #9 0x000000364263c938 in ?? () from /lib64/libglib-2.0.so.0
> #10 0x000000364263cd55 in g_main_loop_run () from
> /lib64/libglib-2.0.so.0
> #11 0x0000000000404e23 in main (argc=4, argv=0x7fff159e1778) at
> crm_mon.c:590
> (gdb)
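
Frame #2 shows send_snmp_trap() being called with rsc=0x0, and frames
#1 and #0 show the crash inside strlen() under snmp_add_var(). For this
fencing notification there is no resource, so the NULL pointer appears
to be measured as if it were a string. Here is a minimal standalone
sketch of that pattern and of skipping the field when it is NULL. Note
that add_field() and build_trap() are hypothetical stand-ins written
for illustration only; they are not the crm_mon or net-snmp functions,
and the "name=value;" payload is just an assumption so the result can
be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a field-adding helper. Like strlen(), it
 * requires a non-NULL value; the assert makes that precondition explicit. */
void add_field(char *buf, size_t cap, const char *name, const char *value)
{
    size_t used = strlen(buf);

    assert(value != NULL);          /* a NULL here would be undefined behaviour */
    if (used >= cap) {
        return;                     /* no room left, drop the field */
    }
    /* Append "name=value;" after what is already in the buffer */
    snprintf(buf + used, cap - used, "%s=%s;", name, value);
}

/* Hypothetical trap builder: the resource field is optional, so it is
 * only added when the caller actually has a resource name. */
void build_trap(char *buf, size_t cap, const char *node,
                const char *rsc, const char *task)
{
    buf[0] = '\0';
    if (rsc) {
        add_field(buf, cap, "rsc", rsc);
    }
    add_field(buf, cap, "node", node);
    add_field(buf, cap, "task", task);
}
```

Calling build_trap(buf, sizeof(buf), "dev2", NULL, "st_notify_fence")
on a 128-byte buffer leaves buf holding only the node and task fields
rather than passing NULL down to strlen().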

Looking at the backtrace (rsc=0x0 in frame #2), this might fix it:

diff --git a/tools/crm_mon.c b/tools/crm_mon.c
index 2e2ca16..5c2e687 100644
--- a/tools/crm_mon.c
+++ b/tools/crm_mon.c
@@ -1713,7 +1713,9 @@ send_snmp_trap(const char *node, const char *rsc, const char *task, int target_r
     }
 
     /* Add extries to the trap */
-    add_snmp_field(trap_pdu, snmp_crm_oid_rsc, rsc);
+    if (rsc) {
+        add_snmp_field(trap_pdu, snmp_crm_oid_rsc, rsc);
+    }
     add_snmp_field(trap_pdu, snmp_crm_oid_node, node);
     add_snmp_field(trap_pdu, snmp_crm_oid_task, task);
     add_snmp_field(trap_pdu, snmp_crm_oid_desc, desc);

> Is this a known issue?
>
> Best Regards,
> Kazunori INOUE
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started:
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org