Hi Florian,
I pushed the latest code to Launchpad; the agent uses notifications now.
Also, most of the resource start/stop calls have been removed. In my
opinion, the existing agent would need a major rewrite to support the
required logic. I think it would indeed be a good idea to sit down and
talk about it at PLUK. Maybe Pacemaker cannot be used, but that would be sad.
Regards,
Yves
On 11-10-12 12:59 PM, Florian Haas wrote:
Hi again,
On 2011-10-12 18:23, Yves Trudeau wrote:
Hi,
following my previous post to the wrong list, forwarded to the
Pacemaker list by Florian, here is my complete cluster configuration:
http://pastebin.com/zDj0MF1Z
Yikes. I just had a look at that resource agent (operating under the
assumption that the version at
http://bazaar.launchpad.net/~y-trudeau/percona-prm/alpha/view/head:/percona-prm/MySQL_replication
is still current), and it looks like you guys never once looked at the
OCF RA dev guide. What's the reason for rolling your own when you could
have contributed to the existing mysql RA that does support replication?
Or fixed it, in case you think it's broken or buggy?
Let's please go over this in London. I dare say that if this RA
actually works, then it works pretty much only by accident. Again, all
of this is assuming we're talking about the version that's in Launchpad;
you may have produced your CIB dump on a box that uses an updated version.
Just to recall the original message:
I started to have issues with crm_master with Pacemaker 1.0.11. I
think I traced it down to the following problem. I know crm_master is
supposed to be called from within the resource script, but calling it
manually helps to illustrate the problem.
root@testvirtbox1:~# /usr/sbin/crm_master -l reboot -v 1000 -r p_MySQL_replication:0
root@testvirtbox1:~# /usr/local/sbin/crm_master -r 'p_MySQL_Replication:0' -G
name=master-p_MySQL_Replication:0 value=(null)
Error performing operation: cib object missing
and in daemon.log:
Oct 11 12:17:41 testvirtbox1 crm_attribute: [21986]: info: Invoked: crm_attribute -N testvirtbox1 -n master-p_MySQL_Replication:0 -G
Oct 11 12:17:41 testvirtbox1 crm_attribute: [21986]: ERROR: crm_abort: read_attr: Triggered assert at cib_attrs.c:297 : section != NULL
Er, sorry. I don't think anyone on this list is going to be willing to
troubleshoot a problem that involves a resource agent that doesn't use
notify where it should, invokes "crm resource start" for a different
resource from within the RA, asks the cluster manager about its role
when it should be telling it, etc. Let's not waste developer time. So
please, either show us an updated version of the RA that's fixed, or
let's talk about this in London in person, week after next.
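
As a rough illustration of the pattern described here (the agent tells the cluster its master preference and reacts to notifications, instead of querying its own role or starting other resources), a minimal shell sketch might look like the following. This is not the Percona agent or the stock mysql RA. The function names sketch_monitor and sketch_notify, the is_replication_healthy check, and the SKETCH_HEALTHY and CRM_MASTER variables are hypothetical and exist only for this example. The crm_master options and the OCF_RESKEY_CRM_meta_notify_* variables are assumed to behave as on a standard Pacemaker install.

```shell
#!/bin/sh
# Sketch only: hypothetical function names, not taken from any real agent.

OCF_SUCCESS=0

# Overridable so the sketch can be exercised without a running cluster.
CRM_MASTER="${CRM_MASTER:-crm_master}"

# Hypothetical health check; a real agent would inspect the MySQL instance.
is_replication_healthy() {
    [ "${SKETCH_HEALTHY:-1}" = "1" ]
}

# monitor: tell the cluster how suitable this node is for the master role,
# rather than asking the cluster which role it thinks we have.
# (A real agent would also return OCF_RUNNING_MASTER, 8, when it is master.)
sketch_monitor() {
    if is_replication_healthy; then
        $CRM_MASTER -l reboot -v 1000   # set a master preference
    else
        $CRM_MASTER -l reboot -D        # withdraw the master preference
    fi
    return $OCF_SUCCESS
}

# notify: react to cluster events instead of starting other resources
# from within the agent.
sketch_notify() {
    event="${OCF_RESKEY_CRM_meta_notify_type}-${OCF_RESKEY_CRM_meta_notify_operation}"
    case "$event" in
        post-promote) echo "repoint slaves to the new master" ;;
        pre-demote)   echo "prepare for the master going away" ;;
        *)            : ;;  # ignore events this sketch does not handle
    esac
    return $OCF_SUCCESS
}
```

The notify branch only prints what it would do; what a real agent does at each event depends on the replication logic, which this sketch does not attempt to cover.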
I am not contesting that you may have actually found a Pacemaker
problem, but in order to be sure we'd have to start from a setup that's
_expected_ to work. Can you reproduce this issue with a different
Master/Slave RA, say the "Stateful" agent?
Cheers,
Florian
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker