Hi, does Pacemaker also work over a WAN? If so, how?
Thank you

On Mon, Sep 3, 2012 at 3:30 PM, <pacemaker-requ...@oss.clusterlabs.org> wrote:
> Send Pacemaker mailing list submissions to
> pacemaker@oss.clusterlabs.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> or, via email, send a message with subject or body 'help' to
> pacemaker-requ...@oss.clusterlabs.org
>
> You can reach the person managing the list at
> pacemaker-ow...@oss.clusterlabs.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Pacemaker digest..."
>
>
> Today's Topics:
>
> 1. Pacemaker 1.1.6 order possible bug ? (Tomáš Vavřička)
> 2. Two c72f5ca stonithd coredumps (Vladislav Bogdanov)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 03 Sep 2012 07:41:25 +0200
> From: Tomáš Vavřička <vavri...@ttc.cz>
> To: pacemaker@oss.clusterlabs.org
> Subject: [Pacemaker] Pacemaker 1.1.6 order possible bug ?
> Message-ID: <50444305.4090...@ttc.cz>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Hello,
>
> Sorry If I send same question twice, but message did not appeared on
> mailing list.
>
> I have a problem with orders in pacemaker 1.1.6 and corosync 1.4.1.
>
> Order below is working for failover, but it is not working when one
> cluster node starts up (drbd stays in Slave state and ms_toponet is
> started before DRBD gets promoted).
>
> order o_start inf: ms_drbd_postgres:promote postgres:start
> ms_toponet:promote monitor_cluster:start
>
> Order below is not working for failover (it kills slave toponet app and
> start it again) but it is working correctly when cluster starts up.
>
> order o_start inf: ms_drbd_postgres:promote postgres:start
> ms_toponet:start ms_toponet:promote monitor_cluster:start
>
> I want to the pacemaker to act as in 1.0.12 version.
> * when toponet master app is killed, move postgres resource to other
> node and promote ms_toponet and ms_drbd_postgres to Master
> * when one node is starting promote DRBD to master is is UpToDate
>
> Am I doing something wrong?
>
> It looks to me pacemaker ignores some orders (pacemaker should wait for
> DRBD promotion when starting toponet app, but toponet app is started
> right after DRBD start (slave)). I tried to solve this by different
> orders with combination symmetrical=false, split orders, different
> orders for start and stop, but no success at all (seems to me like
> completely ignoring symmetrical=false directive).
>
> Pacemaker 1.1.7 is not working for me, because it has broken on-fail
> directive.
>
> crm_mon output:
>
> ============
> Last updated: Fri Aug 31 14:51:11 2012
> Last change: Fri Aug 31 14:50:27 2012 by hacluster via crmd on toponet30
> Stack: openais
> Current DC: toponet30 - partition WITHOUT quorum
> Version: 1.1.6-b988976485d15cb702c9307df55512d323831a5e
> 2 Nodes configured, 2 expected votes
> 10 Resources configured.
> ============
>
> Online: [ toponet30 toponet31 ]
>
> st_primary (stonith:external/xen0): Started toponet30
> st_secondary (stonith:external/xen0): Started toponet31
> Master/Slave Set: ms_drbd_postgres
> Masters: [ toponet30 ]
> Slaves: [ toponet31 ]
> Resource Group: postgres
> pg_fs (ocf::heartbeat:Filesystem): Started toponet30
> PGIP (ocf::heartbeat:IPaddr2): Started toponet30
> postgresql (ocf::heartbeat:pgsql): Started toponet30
> monitor_cluster (ocf::heartbeat:monitor_cluster): Started toponet30
> Master/Slave Set: ms_toponet
> Masters: [ toponet30 ]
> Slaves: [ toponet31 ]
>
> configuration:
>
> node toponet30
> node toponet31
> primitive PGIP ocf:heartbeat:IPaddr2 \
> params ip="192.168.100.3" cidr_netmask="29" \
> op monitor interval="5s"
> primitive drbd_postgres ocf:linbit:drbd \
> params drbd_resource="postgres" \
> op start interval="0" timeout="240s" \
> op stop interval="0" timeout="120s" \
> op monitor interval="5s" role="Master" timeout="10s" \
> op monitor interval="10s" role="Slave" timeout="20s"
> primitive monitor_cluster ocf:heartbeat:monitor_cluster \
> op monitor interval="30s" \
> op start interval="0" timeout="30s" \
> meta target-role="Started"
> primitive pg_fs ocf:heartbeat:Filesystem \
> params device="/dev/drbd0" directory="/var/lib/pgsql"
> fstype="ext3"
> primitive postgresql ocf:heartbeat:pgsql \
> op start interval="0" timeout="80s" \
> op stop interval="0" timeout="60s" \
> op monitor interval="10s" timeout="10s" depth="0"
> primitive st_primary stonith:external/xen0 \
> op start interval="0" timeout="60s" \
> params hostlist="toponet31:/etc/xen/vm/toponet31"
> dom0="172.16.103.54"
> primitive st_secondary stonith:external/xen0 \
> op start interval="0" timeout="60s" \
> params hostlist="toponet30:/etc/xen/vm/toponet30"
> dom0="172.16.103.54"
> primitive toponet ocf:heartbeat:toponet \
> op start interval="0" timeout="180s" \
> op stop interval="0" timeout="60s" \
> op monitor interval="10s" role="Master" timeout="20s"
> on-fail="standby" \
> op monitor interval="20s" role="Slave" timeout="40s" \
> op promote interval="0" timeout="120s" \
> op demote interval="0" timeout="120s"
> group postgres pg_fs PGIP postgresql
> ms ms_drbd_postgres drbd_postgres \
> meta master-max="1" master-node-max="1" clone-max="2"
> clone-node-max="1" notify="true" target-role="Master"
> ms ms_toponet toponet \
> meta master-max="1" master-node-max="1" clone-max="2"
> clone-node-max="1" target-role="Master"
> location loc_st_pri st_primary -inf: toponet31
> location loc_st_sec st_secondary -inf: toponet30
> location master-prefer-node1 postgres 100: toponet30
> colocation pg_on_drbd inf: monitor_cluster ms_toponet:Master postgres
> ms_drbd_postgres:Master
> order o_start inf: ms_drbd_postgres:start ms_drbd_postgres:promote
> postgres:start ms_toponet:start ms_toponet:promote monitor_cluster:start
> property $id="cib-bootstrap-options" \
> dc-version="1.1.6-b988976485d15cb702c9307df55512d323831a5e" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="2" \
> no-quorum-policy="ignore" \
> stonith-enabled="true"
> rsc_defaults $id="rsc-options" \
> resource-stickiness="5000"
>
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 03 Sep 2012 10:03:28 +0300
> From: Vladislav Bogdanov <bub...@hoster-ok.com>
> To: pacemaker@oss.clusterlabs.org
> Subject: [Pacemaker] Two c72f5ca stonithd coredumps
> Message-ID: <50445640.1060...@hoster-ok.com>
> Content-Type: text/plain; charset=UTF-8
>
> Hi Andrew, all,
>
> as I wrote before, I caught two paths where stonithd (c72f5ca) dumps core.
> Here are gdb backtraces for them (sorry for posting them inline, I was
> requested to do that ASAP and I hope it is not yet too late for 1.1.8 ;)
> ). Some vars are optimized out, but I hope that doesn't matter. If some
> more information is needed please just request it.
>
> First one is:
> ...
> Core was generated by `/usr/libexec/pacemaker/stonithd'.
> Program terminated with signal 11, Segmentation fault.
> ...
> (gdb) bt
> #0 0x00007f4aec6cdb51 in __strlen_sse2 () from /lib64/libc.so.6
> #1 0x00007f4aec6cd866 in strdup () from /lib64/libc.so.6
> #2 0x000000000040c6f6 in create_remote_stonith_op (client=0x1871120
> "2194f1b8-5722-49c3-bed1-c8fecc78ca02", request=0x1884840, peer=<value
> optimized out>)
> at remote.c:313
> #3 0x000000000040cf40 in initiate_remote_stonith_op (client=<value
> optimized out>, request=0x1884840, manual_ack=0) at remote.c:336
> #4 0x000000000040a2be in stonith_command (client=0x1870a80, id=<value
> optimized out>, flags=<value optimized out>, request=0x1884840, remote=0x0)
> at commands.c:1380
> #5 0x0000000000403252 in st_ipc_dispatch (c=0x18838d0, data=<value
> optimized out>, size=329) at main.c:142
> #6 0x00007f4aebaf8d64 in ?? () from /usr/lib64/libqb.so.0
> #7 0x00007f4aebaf908e in qb_ipcs_dispatch_connection_request () from
> /usr/lib64/libqb.so.0
> #8 0x00007f4aee26fda5 in gio_read_socket (gio=<value optimized out>,
> condition=G_IO_IN, data=0x18732f0) at mainloop.c:353
> #9 0x00007f4aebf8ef0e in g_main_context_dispatch () from
> /lib64/libglib-2.0.so.0
> #10 0x00007f4aebf92938 in ?? () from /lib64/libglib-2.0.so.0
> #11 0x00007f4aebf92d55 in g_main_loop_run () from /lib64/libglib-2.0.so.0
> #12 0x0000000000403a98 in main (argc=<value optimized out>,
> argv=0x7fffa3443148) at main.c:890
> (gdb) bt full
> #0 0x00007f4aec6cdb51 in __strlen_sse2 () from /lib64/libc.so.6
> No symbol table info available.
> #1 0x00007f4aec6cd866 in strdup () from /lib64/libc.so.6
> No symbol table info available.
> #2 0x000000000040c6f6 in create_remote_stonith_op (client=0x1871120
> "2194f1b8-5722-49c3-bed1-c8fecc78ca02", request=0x1884840, peer=<value
> optimized out>)
> at remote.c:313
> nodeid = <value optimized out>
> node = 0x1871790
> op = 0x187e2e0
> dev = <value optimized out>
> __func__ = "create_remote_stonith_op"
> __PRETTY_FUNCTION__ = "create_remote_stonith_op"
> #3 0x000000000040cf40 in initiate_remote_stonith_op (client=<value
> optimized out>, request=0x1884840, manual_ack=0) at remote.c:336
> query = 0x0
> client_id = 0x1871120 "2194f1b8-5722-49c3-bed1-c8fecc78ca02"
> op = 0x0
> __func__ = "initiate_remote_stonith_op"
> __PRETTY_FUNCTION__ = "initiate_remote_stonith_op"
> #4 0x000000000040a2be in stonith_command (client=0x1870a80, id=<value
> optimized out>, flags=<value optimized out>, request=0x1884840, remote=0x0)
> at commands.c:1380
> alternate_host = <value optimized out>
> dev = <value optimized out>
> target = 0x1883f40 "1074005258"
> call_options = 4610
> rc = -95
> is_reply = 0
> always_reply = 0
> reply = 0x0
> data = 0x0
> op = 0x187e550 "st_fence"
> client_id = 0x1874cb0 "2194f1b8-5722-49c3-bed1-c8fecc78ca02"
> __func__ = "stonith_command"
> __PRETTY_FUNCTION__ = "stonith_command"
> __FUNCTION__ = "stonith_command"
> #5 0x0000000000403252 in st_ipc_dispatch (c=0x18838d0, data=<value
> optimized out>, size=329) at main.c:142
> id = 4
> flags = 1
> request = 0x1884840
> client = 0x1870a80
> __FUNCTION__ = "st_ipc_dispatch"
> __func__ = "st_ipc_dispatch"
> __PRETTY_FUNCTION__ = "st_ipc_dispatch"
> #6 0x00007f4aebaf8d64 in ?? () from /usr/lib64/libqb.so.0
> No symbol table info available.
> #7 0x00007f4aebaf908e in qb_ipcs_dispatch_connection_request () from
> /usr/lib64/libqb.so.0
> No symbol table info available.
> #8 0x00007f4aee26fda5 in gio_read_socket (gio=<value optimized out>,
> condition=G_IO_IN, data=0x18732f0) at mainloop.c:353
> adaptor = 0x18732f0
> fd = 15
> __func__ = "gio_read_socket"
> #9 0x00007f4aebf8ef0e in g_main_context_dispatch () from
> /lib64/libglib-2.0.so.0
> No symbol table info available.
> #10 0x00007f4aebf92938 in ?? () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #11 0x00007f4aebf92d55 in g_main_loop_run () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #12 0x0000000000403a98 in main (argc=<value optimized out>,
> argv=0x7fffa3443148) at main.c:890
> flag = <value optimized out>
> lpc = 0
> ---Type <return> to continue, or q <return> to quit---
> argerr = 0
> option_index = 0
> cluster = {uuid = 0x176da90 "1090782474", uname = 0x176dac0
> "vd01-b", nodeid = 1090782474, cs_dispatch = 0x404050
> <stonith_peer_ais_callback>,
> destroy = 0x404230 <stonith_peer_ais_destroy>}
> actions = {0x40e3fb "reboot", 0x40e402 "off", 0x40ea75 "list",
> 0x40e406 "monitor", 0x40e40e "status"}
> __func__ = "main"
>
>
>
> Second is (segfault in CRM_ASSERT()):
> ...
> Core was generated by `/usr/libexec/pacemaker/stonithd'.
> Program terminated with signal 11, Segmentation fault.
> #0 stonith_command (client=0x0, id=0, flags=0, request=0xb342f0,
> remote=0xb39cf0 "vd01-d") at commands.c:1258
> 1258 commands.c: No such file or directory.
> in commands.c
> ...
> (gdb) bt
> #0 stonith_command (client=0x0, id=0, flags=0, request=0xb342f0,
> remote=0xb39cf0 "vd01-d") at commands.c:1258
> #1 0x00000000004040e4 in stonith_peer_callback (kind=<value optimized
> out>, from=<value optimized out>,
> data=0x7fffa5327cc8 "<st-reply
> st_origin=\"stonith_construct_async_reply\" t=\"stonith-ng\"
> st_op=\"st_notify\" st_device_id=\"manual_ack\"
> st_remote_op=\"25917710-8972-4c40-b783-9648749396a4\"
> st_clientid=\"936ea671-61ba-4258-8e12-"...) at main.c:218
> #2 stonith_peer_ais_callback (kind=<value optimized out>, from=<value
> optimized out>,
> data=0x7fffa5327cc8 "<st-reply
> st_origin=\"stonith_construct_async_reply\" t=\"stonith-ng\"
> st_op=\"st_notify\" st_device_id=\"manual_ack\"
> st_remote_op=\"25917710-8972-4c40-b783-9648749396a4\"
> st_clientid=\"936ea671-61ba-4258-8e12-"...) at main.c:254
> #3 0x00007f92ded376ca in ais_dispatch_message (handle=<value optimized
> out>, groupName=<value optimized out>, nodeid=<value optimized out>,
> pid=<value optimized out>, msg=0x7fffa5327a78, msg_len=<value
> optimized out>) at corosync.c:551
> #4 pcmk_cpg_deliver (handle=<value optimized out>, groupName=<value
> optimized out>, nodeid=<value optimized out>, pid=<value optimized out>,
> msg=0x7fffa5327a78, msg_len=<value optimized out>) at corosync.c:619
> #5 0x00007f92de91ceaf in cpg_dispatch (handle=7749363892505018368,
> dispatch_types=<value optimized out>) at cpg.c:412
> #6 0x00007f92ded34a42 in pcmk_cpg_dispatch (user_data=<value optimized
> out>) at corosync.c:577
> #7 0x00007f92def61d27 in mainloop_gio_callback (gio=<value optimized
> out>, condition=G_IO_IN, data=0xb2d400) at mainloop.c:535
> #8 0x00007f92dcc7ff0e in g_main_context_dispatch () from
> /lib64/libglib-2.0.so.0
> #9 0x00007f92dcc83938 in ?? () from /lib64/libglib-2.0.so.0
> #10 0x00007f92dcc83d55 in g_main_loop_run () from /lib64/libglib-2.0.so.0
> #11 0x0000000000403a98 in main (argc=<value optimized out>,
> argv=0x7fffa5427de8) at main.c:890
> (gdb) bt full
> #0 stonith_command (client=0x0, id=0, flags=0, request=0xb342f0,
> remote=0xb39cf0 "vd01-d") at commands.c:1258
> call_options = 4104
> rc = -95
> is_reply = 1
> always_reply = 0
> reply = 0x0
> data = 0x0
> op = 0xb34370 "st_notify"
> client_id = 0xb3ddc0 "936ea671-61ba-4258-8e12-98542a541b23"
> __func__ = "stonith_command"
> __PRETTY_FUNCTION__ = "stonith_command"
> __FUNCTION__ = "stonith_command"
> #1 0x00000000004040e4 in stonith_peer_callback (kind=<value optimized
> out>, from=<value optimized out>,
> data=0x7fffa5327cc8 "<st-reply
> st_origin=\"stonith_construct_async_reply\" t=\"stonith-ng\"
> st_op=\"st_notify\" st_device_id=\"manual_ack\"
> st_remote_op=\"25917710-8972-4c40-b783-9648749396a4\"
> st_clientid=\"936ea671-61ba-4258-8e12-"...) at main.c:218
> remote = 0xb39cf0 "vd01-d"
> #2 stonith_peer_ais_callback (kind=<value optimized out>, from=<value
> optimized out>,
> data=0x7fffa5327cc8 "<st-reply
> st_origin=\"stonith_construct_async_reply\" t=\"stonith-ng\"
> st_op=\"st_notify\" st_device_id=\"manual_ack\"
> st_remote_op=\"25917710-8972-4c40-b783-9648749396a4\"
> st_clientid=\"936ea671-61ba-4258-8e12-"...) at main.c:254
> xml = 0xb342f0
> __func__ = "stonith_peer_ais_callback"
> #3 0x00007f92ded376ca in ais_dispatch_message (handle=<value optimized
> out>, groupName=<value optimized out>, nodeid=<value optimized out>,
> pid=<value optimized out>, msg=0x7fffa5327a78, msg_len=<value
> optimized out>) at corosync.c:551
> data = 0x7fffa5327cc8 "<st-reply
> st_origin=\"stonith_construct_async_reply\" t=\"stonith-ng\"
> st_op=\"st_notify\" st_device_id=\"manual_ack\"
> st_remote_op=\"25917710-8972-4c40-b783-9648749396a4\"
> st_clientid=\"936ea671-61ba-4258-8e12-"...
> uncompressed = 0x0
> xml = 0x0
> #4 pcmk_cpg_deliver (handle=<value optimized out>, groupName=<value
> optimized out>, nodeid=<value optimized out>, pid=<value optimized out>,
> msg=0x7fffa5327a78, msg_len=<value optimized out>) at corosync.c:619
> ais_msg = 0x7fffa5327a78
> __func__ = "pcmk_cpg_deliver"
> #5 0x00007f92de91ceaf in cpg_dispatch (handle=7749363892505018368,
> dispatch_types=<value optimized out>) at cpg.c:412
> timeout = 0
> error = <value optimized out>
> cpg_inst = 0xb2cd90
> res_cpg_confchg_callback = <value optimized out>
> res_cpg_deliver_callback = 0x7fffa53279c0
> res_cpg_totem_confchg_callback = <value optimized out>
> cpg_inst_copy = {c = 0xb2cdf0, finalize = 0, context = 0x0,
> {model_data = {model = CPG_MODEL_V1}, model_v1_data = {model =
> CPG_MODEL_V1,
> cpg_deliver_fn = 0x7f92ded37300 <pcmk_cpg_deliver>,
> cpg_confchg_fn = 0x7f92ded33fb0 <pcmk_cpg_membership>,
> cpg_totem_confchg_fn = 0,
> flags = 0}}, iteration_list_head = {next = 0xb2cdd0, prev
> = 0xb2cdd0}}
> dispatch_data = 0x7fffa53279c0
> member_list = {{nodeid = 1090782474, pid = 4965, reason = 0},
> {nodeid = 1107559690, pid = 3544, reason = 0}, {nodeid = 1124336906, pid
> = 4487,
> reason = 3544}, {nodeid = 0, pid = 0, reason = 0} <repeats
> 125 times>}
> left_list = {{nodeid = 0, pid = 0, reason = 0} <repeats 128 times>}
> joined_list = {{nodeid = 1107559690, pid = 3544, reason = 1},
> {nodeid = 0, pid = 0, reason = 0} <repeats 127 times>}
> group_name = {length = 11, value = "stonith-ng", '\000' <repeats
> 117 times>}
> left_list_start = <value optimized out>
> joined_list_start = <value optimized out>
> i = <value optimized out>
> ring_id = {nodeid = 0, seq = 0}
> totem_member_list = {0 <repeats 128 times>}
> errno_res = <value optimized out>
> dispatch_buf =
>
> "\005\000\000\000\000\000\000\000W\004\000\000\000\000\000\000\240\224\327)\204\177\000\000\v\000\000\000\000\000\000\000stonith-ng",
> '\000' <repeats 118 times>"\237,
>
> \003\000\000\000\000\000\000\n\005\004C\000\000\000\000\207\021\000\000\204\177\000\000\000\000\000\000\000\000\000\000\237\003\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001",
> '\000' <repeats 19 times>, "\t", '\000' <repeats 263 times>,
>
> "\n\005\004C\207\021\000\000\000\000\000\000\t\000\000\000\006\000\000\000vd01-d",
> '\000' <repeats 250 times>, "O\001\000\000\000\000\000\000<st-reply
> st_origin=\"stonith_construct_asyn"...
> #6 0x00007f92ded34a42 in pcmk_cpg_dispatch (user_data=<value optimized
> out>) at corosync.c:577
> rc = 0
> ---Type <return> to continue, or q <return> to quit---
> __func__ = "pcmk_cpg_dispatch"
> #7 0x00007f92def61d27 in mainloop_gio_callback (gio=<value optimized
> out>, condition=G_IO_IN, data=0xb2d400) at mainloop.c:535
> keep = 1
> client = 0xb2d400
> __func__ = "mainloop_gio_callback"
> #8 0x00007f92dcc7ff0e in g_main_context_dispatch () from
> /lib64/libglib-2.0.so.0
> No symbol table info available.
> #9 0x00007f92dcc83938 in ?? () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #10 0x00007f92dcc83d55 in g_main_loop_run () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #11 0x0000000000403a98 in main (argc=<value optimized out>,
> argv=0x7fffa5427de8) at main.c:890
> flag = <value optimized out>
> lpc = 0
> argerr = 0
> option_index = 0
> cluster = {uuid = 0xb2da90 "1090782474", uname = 0xb2dac0
> "vd01-b", nodeid = 1090782474, cs_dispatch = 0x404050
> <stonith_peer_ais_callback>,
> destroy = 0x404230 <stonith_peer_ais_destroy>}
> actions = {0x40e3fb "reboot", 0x40e402 "off", 0x40ea75 "list",
> 0x40e406 "monitor", 0x40e40e "status"}
> __func__ = "main"
>
> Best,
> Vladislav
>
>
>
> ------------------------------
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
>
> End of Pacemaker Digest, Vol 58, Issue 3
> ****************************************
>
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org