This continues my attempt to get ldirectord working under
pacemaker. The ldirectord installation itself works: if I manually
configure the eth0:0 pseudo-interface with the virtual server address
and manually start ldirectord with

# /usr/sbin/ldirectord /etc/ha.d/ldirectord.cf start

...then everything works. I can connect to the virtual service address
and port, and I get properly redirected to one of the real servers.
ipvsadm shows normal output. All looks good.
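
For anyone who wants to repeat the manual test, it amounts to roughly
the following. This is only a sketch: the function name is made up for
illustration, the virtual address is passed in as an argument, and the
default netmask is a placeholder assumption, not my real value. It has
to run as root.

```shell
# Sketch of the manual test: alias interface, ldirectord, then ipvsadm.
# manual_ldirectord_test is a hypothetical helper name; the netmask default
# is a placeholder assumption. Run as root.
manual_ldirectord_test() {
    if [ -z "$1" ]; then
        echo "usage: manual_ldirectord_test VIP [NETMASK]" >&2
        return 2
    fi
    vip="$1"
    netmask="${2:-255.255.255.0}"

    # Bring up the alias interface that carries the virtual server address.
    ifconfig eth0:0 "$vip" netmask "$netmask" up || return 1

    # Start ldirectord by hand against the same configuration file.
    /usr/sbin/ldirectord /etc/ha.d/ldirectord.cf start || return 1

    # List the resulting virtual server table with numeric addresses.
    ipvsadm -L -n
}
```

Called as "manual_ldirectord_test VIP", it stops at the first step that
fails, which at least shows which of the three pieces is unhappy.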

However, when I start the ldirectord resource under pacemaker, it
starts, then fails, then starts, then fails, and so on. This continues
until I issue a "resource stop ldirectord" command in the crm shell.

So it has to be something in how I configured the resource, but I'm
damned if I can figure out what. Here is the configuration that
involves this resource:

primitive ldirectord ocf:heartbeat:ldirectord \
        op start interval="20" timeout="15" \
        op stop interval="20" timeout="15" \
        op monitor interval="20" timeout="20" \
colocation vdir-ipi-with-ldirectord inf: vdir-ipi ldirectord
order vdir-ipi-before-ldirectord inf: vdir-ipi ldirectord

vdir-ipi is an IPaddr resource that starts fine and results in
the eth0:0 alias interface being configured and brought up.

When I issue a "resource start ldirectord" command from the crm shell,
lrmd logs this sequence over and over:

Oct 28 18:12:24 vmx1.ucar.edu lrmd: [4842]: info: rsc:vdir-ipi:5464: start
Oct 28 18:12:24 vmx1.ucar.edu lrmd: [4842]: info: Managed vdir-ipi:start process 4923 exited with return code 0.

Oct 28 18:12:25 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5466: start
Oct 28 18:12:25 vmx1.ucar.edu lrmd: [4842]: info: RA output: (ldirectord:start:stdout) /usr/sbin/ldirectord /etc/ha.d/ldirectord.cf start
Oct 28 18:12:26 vmx1.ucar.edu lrmd: [4842]: info: Managed ldirectord:start process 5103 exited with return code 0.
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5467: start
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: perform_op:2906: operation start[5467] on ocf::ldirectord::ldirectord for client 4845, its parameters: CRM_meta_interval=[20000] CRM_meta_timeout=[15000] crm_feature_set=[3.0.1] CRM_meta_name=[start]  for rsc is already running.
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: perform_op:2916: postponing all ops on resource ldirectord by 1000 ms
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: perform_op:2906: operation start[5467] on ocf::ldirectord::ldirectord for client 4845, its parameters: CRM_meta_interval=[20000] CRM_meta_timeout=[15000] crm_feature_set=[3.0.1] CRM_meta_name=[start]  for rsc is already running.
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: perform_op:2910: operations on resource ldirectord already delayed
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: Managed ldirectord:start process 5221 exited with return code 0.
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5468: stop
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: Managed ldirectord:stop process 5226 exited with return code 0.
Oct 28 18:12:28 vmx1.ucar.edu lrmd: [4842]: WARN: Managed ldirectord:monitor process 5265 exited with return code 7.
Oct 28 18:12:29 vmx1.ucar.edu lrmd: [4842]: info: cancel_op: operation monitor[5469] on ocf::ldirectord::ldirectord for client 4845, its parameters: CRM_meta_interval=[20000] CRM_meta_timeout=[20000] crm_feature_set=[3.0.1] CRM_meta_name=[monitor]  cancelled
Oct 28 18:12:29 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5470: stop
Oct 28 18:12:29 vmx1.ucar.edu lrmd: [4842]: info: Managed ldirectord:stop process 5296 exited with return code 0.

And then it repeats:

Oct 28 18:12:31 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5471: start

etc.
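
To make the loop easier to follow, a small filter along these lines
condenses the "Managed ... exited with return code" lines into one short
record per operation and names the standard OCF return code (7 is
OCF_NOT_RUNNING). This is a minimal sketch: summarize_lrmd is a made-up
helper name, and it assumes one log entry per line, as in the raw log
file rather than a mail-wrapped copy.

```shell
# Condense lrmd "Managed <rsc>:<op> process N exited with return code R."
# lines into "<time> <rsc> <op> rc=<R> (<OCF name>)".
# summarize_lrmd is a hypothetical helper; it reads log lines on stdin and
# assumes the syslog layout "Mon DD HH:MM:SS host lrmd: ...".
summarize_lrmd() {
    awk '
    BEGIN {
        # Standard OCF resource agent return codes.
        name[0] = "OCF_SUCCESS";         name[1] = "OCF_ERR_GENERIC"
        name[2] = "OCF_ERR_ARGS";        name[3] = "OCF_ERR_UNIMPLEMENTED"
        name[4] = "OCF_ERR_PERM";        name[5] = "OCF_ERR_INSTALLED"
        name[6] = "OCF_ERR_CONFIGURED";  name[7] = "OCF_NOT_RUNNING"
    }
    /Managed [^ ]+ process [0-9]+ exited with return code [0-9]+/ {
        for (i = 1; i < NF; i++) {
            if ($i == "Managed") {
                split($(i + 1), parts, ":")   # parts[1]=resource, parts[2]=op
            }
            if ($i == "code") {
                rc = $(i + 1)
                sub(/\.$/, "", rc)            # drop the trailing period
            }
        }
        label = (rc in name) ? name[rc] : "unknown"
        printf "%s %s %s rc=%s (%s)\n", $3, parts[1], parts[2], rc, label
    }'
}
```

Usage would be something like "summarize_lrmd < LOGFILE", where LOGFILE
is wherever lrmd logs on your system.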

How can I figure out what I have done wrong here?

Thanks,
--Greg



_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems