Hello,

After upgrading to pacemaker 1.1.6 and cluster-glue 1.0.8 on Debian, our previously working apcmastersnmp resources stopped working:

Feb 29 14:22:03 atlas0 stonith: [35438]: ERROR: apcmastersnmp device not accessible.
Feb 29 14:22:03 atlas0 stonith-ng: [32972]: notice: log_operation: Operation 'monitor' [35404] for device 'stonith-atlas6' returned: -2
Feb 29 14:22:03 atlas0 stonith-ng: [32972]: ERROR: log_operation: stonith-atlas6: Performing: stonith -t apcmastersnmp -S 161
Feb 29 14:22:03 atlas0 stonith-ng: [32972]: ERROR: log_operation: stonith-atlas6: Invalid config info for apcmastersnmp device

Please note the strange "161" argument passed to stonith.

After checking the source code and stracing stonithd, as far as I can see, the following happens:

- stonithd calls fence_legacy, which steals the "port=161" parameter
  from apcmastersnmp. This produces the error message
  "Invalid config info for apcmastersnmp device".

- When stealing "port=161", fence_legacy stores the port value as the
  node name and passes it to stonith, even in status mode. Therefore
  we get "stonith -t apcmastersnmp -S 161".

- However, stonith does not catch the invalid node parameter:

    if (!(argcount == 1 || (argcount < 1
        && (status||listhosts||listtypes||listparanames||metadata)))) {
            ++errors;
    }

  and even in status mode it goes on to run the reset request too:

    if (status) {
        < no exit >
    }
    if (listhosts) {
        < no exit >
    }
    if (optind < argc) {
        ...
        rc = stonith_req_reset(s, reset_type, nodename);
    }

Fortunately the port value does not match any node name, so it won't kill a node, but the agent fails.

Am I on the right track? Would the following patch fix the issue? I'm asking because I don't know why "port=" is handled separately and what the implications are of deleting $opt_n below.

--- fence_legacy.orig	2012-02-29 23:03:36.594945717 +0100
+++ fence_legacy	2012-03-01 14:41:46.454859212 +0100
@@ -105,6 +105,7 @@
         elsif ($name eq "port" )
         {
             $opt_n = $val;
+            $ENV{$name} = $val;
         }
         elsif ($name eq "stonith" )
         {
@@ -176,8 +177,8 @@
 }
 elsif ( $opt_o eq "monitor" || $opt_o eq "stat" || $opt_o eq "status" )
 {
-    print "Performing: $opt_s -t $opt_t -S $opt_n\n" unless defined $opt_q;
-    exec "$opt_s -t $opt_t $extra_args -S $opt_n" or die "failed to exec \"$opt_s\"\n";
+    print "Performing: $opt_s -t $opt_t -S\n" unless defined $opt_q;
+    exec "$opt_s -t $opt_t $extra_args -S" or die "failed to exec \"$opt_s\"\n";
 }
 else
 {

Best regards,
Jozsef
--
E-mail : kadlecsik.joz...@wigner.mta.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics, Hungarian Academy of Sciences
         H-1525 Budapest 114, POB. 49, Hungary

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org