Re: [Pacemaker] questions about the booth

Yuusuke Iida Tue, 28 May 2013 23:45:41 -0700

Hi, Jiaju

Thank you for merging it!


Thanks,
Yusuke

(2013/05/29 14:52), Jiaju Zhang wrote:

Hi Yuusuke,

Merged, thanks!

Regards,
Jiaju

On Mon, 2013-05-27 at 16:26 +0900, Yuusuke Iida wrote:

Hi, Jiaju

To solve this problem, I wrote a daemon that monitors the resources
that depend on a ticket.

I have sent the following "pull request".
https://github.com/jjzhang/booth/pull/52

The features are as follows.
  - The tickets to monitor are read from the booth configuration file.
  - Monitoring starts once a ticket has been granted and the resource
has started.
  - When a resource can no longer run in a site, booth_resource_monitord
uses booth to move the ticket to another site.
  - booth_resource_monitord is installed only when configure is run with
the "--enable-resource-monitor" option.
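
As a rough illustration of the behavior listed above (this is not the
actual booth_resource_monitord code, and every name in it is hypothetical),
the decision taken on each polling cycle could be sketched in Python as:

```python
from dataclasses import dataclass

# All names here are hypothetical; they only illustrate the flow.


@dataclass
class SiteState:
    ticket_granted: bool    # booth has granted the ticket to this site
    resource_started: bool  # the ticket-dependent resource is running here
    resource_can_run: bool  # at least one node in the site can still host it


def next_action(state: SiteState, monitoring: bool) -> str:
    """Return the action a monitor would take for one polling cycle."""
    if not state.ticket_granted:
        # Without the ticket the resource is expected to be stopped,
        # so there is nothing to watch.
        return "idle"
    if not monitoring:
        # Monitoring begins only once the resource has actually started.
        return "start_monitoring" if state.resource_started else "idle"
    if not state.resource_can_run:
        # The resource cannot work anywhere in this site: hand the ticket over.
        return "move_ticket"
    return "keep_monitoring"
```

How the ticket is actually moved (which booth command is invoked and with
which arguments) is left out on purpose, since it depends on the booth version.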

How to use:
Typically, booth_resource_monitord is added to a configuration that
already uses booth, as follows.
===================================================================
group grpBooth prmIpBooth prmApBooth prmApBooth_rsc_mond
primitive prmIpBooth ocf:heartbeat:IPaddr2 \
         params ip="***.***.***.***" nic="eth*" cidr_netmask="24" \
         op start interval="0s" timeout="60s" on-fail="restart" \
         op monitor interval="10s" timeout="60s" on-fail="restart" \
         op stop interval="0s" timeout="60s" on-fail="fence"
primitive prmApBooth ocf:pacemaker:booth-site \
         op start interval="0s" timeout="90s" on-fail="restart" \
         op monitor interval="10s" timeout="60s" on-fail="restart" \
         op stop interval="0s" timeout="100s" on-fail="fence"
primitive prmApBooth_rsc_mond ocf:heartbeat:anything \
         params binfile="booth_resource_monitord" \
         op start interval="0s" timeout="90s" on-fail="restart" \
         op monitor interval="10s" timeout="60s" on-fail="restart" \
         op stop interval="0s" timeout="100s" on-fail="fence"
--------------------------------------------------------------------

Limitation:
The target resource cannot be determined when "rsc_ticket" is defined
using a "resource_set".

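To make the limitation concrete: in the CIB, an rsc_ticket constraint names
its resource either through an "rsc" attribute or through nested
resource_set/resource_ref elements. The Python sketch below is an assumption
about how such a lookup might work, not the daemon's real parser; the sample
ids and resource names are made up. A reader that only looks at the "rsc"
attribute misses resources listed in a set.

```python
import xml.etree.ElementTree as ET

# Sample constraints section; ids and resource names are invented for illustration.
CONSTRAINTS = """
<constraints>
  <rsc_ticket id="t1-simple" rsc="grpA" ticket="ticketA" loss-policy="fence"/>
  <rsc_ticket id="t1-set" ticket="ticketA">
    <resource_set id="t1-set-0">
      <resource_ref id="rscB"/>
      <resource_ref id="rscC"/>
    </resource_set>
  </rsc_ticket>
</constraints>
"""


def resources_for_ticket(xml_text, ticket, follow_sets=False):
    """List the resources bound to `ticket` by rsc_ticket constraints."""
    root = ET.fromstring(xml_text)
    found = []
    for constraint in root.iter("rsc_ticket"):
        if constraint.get("ticket") != ticket:
            continue
        rsc = constraint.get("rsc")
        if rsc is not None:
            # Simple form: the resource is named directly on the constraint.
            found.append(rsc)
        elif follow_sets:
            # Set form: resources are listed as resource_ref children.
            found.extend(ref.get("id") for ref in constraint.iter("resource_ref"))
    return found


print(resources_for_ticket(CONSTRAINTS, "ticketA"))
print(resources_for_ticket(CONSTRAINTS, "ticketA", follow_sets=True))
```

With follow_sets left at False, only grpA is reported, which mirrors the
limitation described above.
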
I would very much like this feature to be merged into the booth source tree.

Best Regards,
Yusuke


(2012/03/08 11:37), Yuusuke Iida wrote:

Hi, Jiaju

Thank you for your reply.

(2012/03/05 14:00), Jiaju Zhang wrote:

Hi Yuusuke,

On Mon, 2012-03-05 at 11:49 +0900, Yuusuke Iida wrote:

Hi, Jiaju

I thought about how to handle the case where a resource does not move
between sites.
I am thinking of writing a daemon that runs outside booth.

This daemon watches whether a resource can run in its site.
When it confirms that the resource can no longer run there, it executes
the revoke command against booth.
I think booth then picks up the revoke and moves the ticket to another site.


If I understand correctly, the daemon you mention automates some of the
admin's actions: if the resources cannot be managed by one site, it
revokes the ticket and moves it to another site. I have no objection if
the admin has this requirement ;)

Thank you for agreeing.
Your summary of the processing is correct.
Admins may not necessarily need this feature.
However, I think there are admins who want to automate as much of the
processing as possible.

The only thing I'm not sure about is whether the admin really wants to
do this. My assumption is that if the local site is alive, the admin will
be inclined to keep the ticket at this site; if the site is totally down,
we have no choice, and the ticket has to move to another site to keep the
service available.
However, that is just one usage scenario I had in mind; booth should
support the usage scenario that you mentioned ;)


I think that the continuity of the resource is preserved by this move.

I intend to analyze the CIB and check the state of the resource using
its score.


I don't quite understand this. Do you mean that if the resource is
usually unmanaged by this site, we had better move it to another site,
so your daemon will depend on this value to decide whether to move the
ticket to another site?

When a resource fails, I think that the score of the resource becomes
less than 0.
When the resource cannot start on any node in the site, I think that
the score becomes less than 0 on all nodes.
I want to use this score to judge that a resource is unable to run.

When a ticket is not granted, the score of the resource also becomes
less than 0.
Therefore, I want to monitor the resource only while the ticket is granted.
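
The judgment described above could be sketched in Python as follows. This
is only an illustration under stated assumptions: the per-node scores are
assumed to have been collected already (how they are obtained from the CIB
is not shown), and the function names are hypothetical. Pacemaker represents
infinite scores as +/-1000000.

```python
INFINITY = 1_000_000  # Pacemaker's numeric value for INFINITY


def parse_score(text):
    """Convert a Pacemaker score string into an integer."""
    text = text.strip().upper()
    if text in ("INFINITY", "+INFINITY"):
        return INFINITY
    if text == "-INFINITY":
        return -INFINITY
    # Clamp plain integers into the valid score range.
    return max(-INFINITY, min(INFINITY, int(text)))


def site_cannot_run(ticket_granted, scores_by_node):
    """True when the ticket is granted and every node has a negative score."""
    if not ticket_granted:
        # Without the ticket the scores are negative anyway,
        # so they say nothing about resource health.
        return False
    if not scores_by_node:
        # No data collected: do not guess.
        return False
    return all(parse_score(score) < 0 for score in scores_by_node.values())
```

For example, site_cannot_run(True, {"node1": "-INFINITY", "node2": "0"})
is False, because node2 can still host the resource.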


Well, I think you raised another usage scenario which I had not thought
of before ;) And I agree with you on setting up such a daemon to do this
work if the admin needs it.

I would like you to review it again when it is completed.

Thanks,
Yuusuke


Thanks,
Jiaju


--
----------------------------------------
METRO SYSTEMS CO., LTD

Yuusuke Iida
Mail: iiday...@intellilink.co.jp
----------------------------------------

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
