Hi Andrew,

Thank you for your comment.

> I assume this is for the stonith-enabled=true case, since offline
> nodes are ignored for stonith-enabled=false.
> Once the node is shot, then its status section is erased and no failed
> actions will be shown... so why do we need this patch?

I know that the failure information disappears once a node has been shot
successfully.
I also know that, with stonith-enabled=false, it is not displayed once a node
goes offline.

(snip)
                if(this_node->details->online || is_set(data_set->flags, pe_flag_stonith_enabled)) {
                        /* offline nodes run no resources...
                         * unless stonith is enabled in which case we need to
                         *   make sure rsc start events happen after the stonith
                         */
                        crm_debug_3("Processing lrm resource entries");
                        unpack_lrm_resources(this_node, lrm_rsc, data_set);
                }
                );
(snip)

However, crm_mon still displays the failed action information after a node has
been shut down cleanly, that is, when there is no need to shoot the node.
(The fail count disappears at that point, but the failed action remains.)

 # A monitor error occurred on srv01.

Migration summary: 
* Node srv04:  
* Node srv02:  
* Node srv01:  
   prmApPostgreSQLDB1: migration-threshold=1 fail-count=1 
* Node srv03:  
 
Failed actions: 
    prmApPostgreSQLDB1_monitor_10000 (node=srv01, call=81, rc=7, status=complete): not running

 # Next, the service was stopped on srv01.

Migration summary: ---> The fail count of srv01 disappears
* Node srv04:  
* Node srv02:  
* Node srv03:  
 
Failed actions: ---> The failed action stays
    prmApPostgreSQLDB1_monitor_10000 (node=srv01, call=81, rc=7, status=complete): not running

Our users do not expect to see failure information for a node that was stopped
normally.

With stonith-enabled=true, should a node on which a failure occurred keep
displaying its failed action information until it is shot?
And if failure information is displayed for a node that stopped normally,
won't the user be confused?
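
To make the condition in the patch concrete, here is a minimal standalone
sketch of the filter. It is not the crm_mon code itself: the sketch_* structs
and functions are hypothetical stand-ins for node_t, pe_find_node() and the
failed-op XML entries, and I assume that a node counts as "stopped normally"
exactly when shutdown is true and online is false, as in the patch. A failed
action whose node cannot be found is still displayed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for Pacemaker's node_t. */
struct sketch_node {
    const char *uname;
    bool online;
    bool shutdown;   /* node was asked to stop cleanly */
};

/* Hypothetical stand-in for one entry of data_set->failed. */
struct sketch_failed_op {
    const char *id;
    const char *node;
};

/* Look up a node by name; returns NULL if it is unknown. */
static const struct sketch_node *
sketch_find_node(const struct sketch_node *nodes, size_t n_nodes,
                 const char *uname)
{
    for (size_t i = 0; i < n_nodes; i++) {
        if (strcmp(nodes[i].uname, uname) == 0) {
            return &nodes[i];
        }
    }
    return NULL;
}

/* Same test as in the patch: hide only if the node is known,
 * was shut down cleanly, and is now offline. */
static bool
sketch_hide_failed_op(const struct sketch_node *node)
{
    return node != NULL && node->shutdown && !node->online;
}

/* Count the failed actions that would still be printed. */
static size_t
sketch_count_displayed(const struct sketch_node *nodes, size_t n_nodes,
                       const struct sketch_failed_op *ops, size_t n_ops)
{
    size_t shown = 0;

    assert(ops != NULL || n_ops == 0);
    for (size_t i = 0; i < n_ops; i++) {
        const struct sketch_node *node =
            sketch_find_node(nodes, n_nodes, ops[i].node);

        if (!sketch_hide_failed_op(node)) {
            shown++;
        }
    }
    return shown;
}
```

With srv01 shut down and offline, a failed action recorded for srv01 would be
hidden, while one recorded for a node that is still online would be listed.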

Best Regards,
Hideo Yamauchi.

--- Andrew Beekhof <and...@beekhof.net> wrote:

> 2010/9/13  <renayama19661...@ybb.ne.jp>:
> > Hi,
> >
> > I am contributing a patch for the crm_mon command.
> >
> > With this change, crm_mon does not display failed actions for a node
> > that is offline because it was shut down.
> >
> > Please review the patch.
> > If there is no problem, please apply this change to the development version.
> 
> Hmmm.
> I'm not sure about this patch.
> 
> I assume this is for the stonith-enabled=true case, since offline
> nodes are ignored for stonith-enabled=false.
> Once the node is shot, then its status section is erased and no failed
> actions will be shown... so why do we need this patch?
> 
> >
> >
> > diff -r 9b95463fde99 tools/crm_mon.c
> > --- a/tools/crm_mon.c   Mon Sep 13 13:07:16 2010 +0900
> > +++ b/tools/crm_mon.c   Mon Sep 13 13:07:59 2010 +0900
> > @@ -829,6 +829,7 @@
> >      int configured_resources = 0;
> >      int print_opts = pe_print_ncurses;
> >      const char *quorum_votes = "unknown";
> > +    gboolean is_failed_first_disp = TRUE;
> >
> >      if(as_console) {
> >          blank_screen();
> > @@ -989,16 +990,28 @@
> >      }
> >
> >      if(xml_has_children(data_set->failed)) {
> > -        print_as("\nFailed actions:\n");
> >          xml_child_iter(data_set->failed, xml_op,
> >                         int val = 0;
> > +                       node_t *failed_node = NULL;
> >                         const char *id = ID(xml_op);
> >                         const char *last = crm_element_value(xml_op, "last_run");
> >                         const char *node = crm_element_value(xml_op, XML_ATTR_UNAME);
> >                         const char *call = crm_element_value(xml_op, XML_LRM_ATTR_CALLID);
> >                         const char *rc   = crm_element_value(xml_op, XML_LRM_ATTR_RC);
> >                         const char *status = crm_element_value(xml_op, XML_LRM_ATTR_OPSTATUS);
> > -
> > +
> > +                       failed_node = pe_find_node(data_set->nodes, node);
> > +                       if (failed_node != NULL) {
> > +                           if ((failed_node->details->shutdown == TRUE) && (failed_node->details->online == FALSE)) {
> > +                               continue;
> > +                           }
> > +                       }
> > +
> > +                       if (is_failed_first_disp){
> > +                           is_failed_first_disp = FALSE;
> > +                           print_as("\nFailed actions:\n");
> > +                       }
> > +
> >                         val = crm_parse_int(status, "0");
> >                         print_as("    %s (node=%s, call=%s, rc=%s, status=%s",
> >                                  id, node, call, rc, op_status2text(val));
> >
> >
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: 
> > http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> >
> >
> 


